Why This Matters
Ollama's integration with MLX on Apple silicon delivers noticeably faster responses for AI assistants and coding agents. Lower latency and better memory efficiency make local AI tools more practical for everyday and professional work, and show how much Apple silicon can deliver on AI workloads when the software is built for the hardware.
Key Takeaways
- Ollama now runs faster on Apple Silicon, leveraging MLX for improved performance.
- Supports NVIDIA’s NVFP4 format for higher-quality responses with reduced memory use (see the memory sketch after this list).
- Enhanced caching boosts responsiveness for coding and AI agent tasks.
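To give a rough sense of the memory claim, here is back-of-envelope arithmetic for NVFP4, assuming NVIDIA's published layout: 4-bit (E2M1) weight values with one 8-bit FP8 scale per 16-element block. This is an illustrative sketch only, not how Ollama or MLX store tensors internally.

```python
# Back-of-envelope memory comparison: FP16 vs. NVFP4 weights.
# Assumes NVIDIA's published NVFP4 layout: 4-bit E2M1 values plus
# one FP8 (8-bit) scale per 16-element block. Illustrative only.

def fp16_bytes(num_params: int) -> float:
    """Plain FP16: 16 bits (2 bytes) per parameter."""
    return num_params * 2

def nvfp4_bytes(num_params: int, block_size: int = 16) -> float:
    """NVFP4: 4 bits per value + an 8-bit scale per block."""
    value_bits = num_params * 4
    scale_bits = (num_params / block_size) * 8
    return (value_bits + scale_bits) / 8  # bits -> bytes

params = 8_000_000_000  # an 8B-parameter model, for example
fp16 = fp16_bytes(params)
nvfp4 = nvfp4_bytes(params)
print(f"FP16 : {fp16 / 1e9:.1f} GB")
print(f"NVFP4: {nvfp4 / 1e9:.1f} GB  (~{fp16 / nvfp4:.1f}x smaller)")
```

At roughly 4.5 bits per parameter, NVFP4 weights take about 3.5x less memory than FP16, which is what makes larger models practical within a Mac's unified memory.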
Today, we’re previewing the fastest way to run Ollama on Apple silicon, powered by MLX, Apple’s machine learning framework.
This unlocks new performance to accelerate your most demanding work on macOS:
- Personal assistants like OpenClaw
- Coding agents like Claude Code, OpenCode, or Codex
[Demo: accelerating coding agents like Pi or Claude Code]
[Demo: OpenClaw responding much faster]
Fastest performance on Apple silicon, powered by MLX
Ollama on Apple silicon is now built on top of Apple’s machine learning framework, MLX, to take advantage of Apple silicon’s unified memory architecture.
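For context on what unified memory buys here: in MLX, an array lives in memory visible to both the CPU and the GPU, so operations can run on either device without an explicit transfer. A minimal sketch using the standard MLX Python API (this is the framework's documented pattern, not Ollama's internals):

```python
# Minimal MLX sketch: arrays live in unified memory, so the same
# buffers are visible to both CPU and GPU with no explicit copies.
import mlx.core as mx

a = mx.random.normal((4096, 4096))
b = mx.random.normal((4096, 4096))

# Run a matmul on the GPU...
c_gpu = mx.matmul(a, b, stream=mx.gpu)
# ...and another op on the CPU, over the same arrays, with no transfer.
c_cpu = mx.add(a, b, stream=mx.cpu)

mx.eval(c_gpu, c_cpu)  # MLX is lazy; eval forces the computation
print(c_gpu.shape, c_cpu.shape)
```

Because nothing is copied between a "host" and a "device", model weights loaded once are directly usable by the GPU, which is part of why this architecture suits large-model inference.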