
Ollama adopts MLX for faster AI performance on Apple silicon Macs

Why This Matters

The integration of MLX into Ollama significantly boosts AI model performance on Apple silicon Macs, enabling faster and more efficient local AI processing. This advancement is crucial for developers and power users seeking to run complex AI models locally without relying on cloud services, enhancing privacy and reducing latency. As Apple continues to optimize AI capabilities on its hardware, it paves the way for more robust and accessible local AI applications in the industry.

Key Takeaways

One of the best tools to run AI models locally on a Mac just got even better. Here’s why, and how to run it.

Local AI models now run faster on Ollama on Apple silicon Macs

If you’re not familiar with Ollama, it’s a macOS, Linux, and Windows app that lets users run AI models locally on their computers.

Unlike cloud-based apps such as ChatGPT, whose models run on remote servers and require an internet connection, Ollama lets users load and run models directly on their machines.

These models can be downloaded from open-source communities such as Hugging Face, or even directly from the model provider, as we covered here.
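As a sketch of the basic workflow (the model name `llama3.2` is just an example; any model from the Ollama library works the same way):

```shell
# Download a model from the Ollama library (llama3.2 is an example name)
ollama pull llama3.2

# Run it on-device -- no internet connection needed after the pull
ollama run llama3.2 "Explain unified memory in one sentence."

# Ollama also serves a local REST API on port 11434
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Hello", "stream": false}'
```

The local API endpoint is what lets coding agents and other tools point at Ollama instead of a cloud provider.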

However, running an LLM locally can be challenging, as even small, lightweight LLMs tend to consume substantial RAM and GPU memory.

To try to counter that, Ollama has released a preview version (Ollama 0.19) of its app that “is now built on top of Apple’s machine learning framework, MLX, to take advantage of its unified memory architecture,” making local AI models run faster on Apple silicon Macs.

Here’s Ollama:

This results in a large speedup of Ollama on all Apple Silicon devices. On Apple’s M5, M5 Pro and M5 Max chips, Ollama leverages the new GPU Neural Accelerators to accelerate both time to first token (TTFT) and generation speed (tokens per second).

With this update, Ollama says it is now faster to run personal assistants such as OpenClaw, as well as coding agents “like Claude Code, OpenCode, or Codex.”
