The latest nightly releases of Mojo (and our next stable release) include initial support for a new accelerator architecture: Apple Silicon GPUs!
We know that one of the biggest barriers to programming GPUs is access to hardware. It’s our hope that by making it possible to use Mojo to develop for a GPU present in every modern Mac, we can further democratize developing GPU-accelerated algorithms and AI models. This should also enable new paths of local-to-cloud development for AI models and more.
To get started, you need an Apple Silicon Mac (all M1 through M4 series chips are supported) running macOS 15 or newer, with Xcode 16 or newer installed. The version of the Metal Shading Language we use (3.2, AIR bitcode version 2.7.0) requires the macOS 15 SDK; if you run on an older macOS, or use an older Xcode that lacks the macOS 15 SDK, you'll get an error about incompatible bitcode versions.
You can clone our modular repository and try out the GPU function examples in the examples/mojo/gpu-functions directory. All but the reduction.mojo example should work on Apple Silicon GPUs today in the latest nightlies. Additionally, puzzles 1-15 of the Mojo GPU puzzles now work on Apple Silicon GPUs with the latest nightly. We haven't yet updated the Pixi environment for the GPU puzzles to add Apple Silicon support, so for now you may need to run the Mojo code manually from another environment.
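To give a taste of what those examples look like, here is a condensed vector-addition kernel in the spirit of the ones in that directory. This is a sketch against the nightly GPU APIs (DeviceContext, enqueue_create_buffer, enqueue_function); exact names and signatures can shift between nightlies, so treat the examples in the repository as the source of truth.

```mojo
from gpu import block_dim, block_idx, thread_idx
from gpu.host import DeviceContext

alias dtype = DType.float32
alias SIZE = 1024
alias BLOCK = 256

fn vector_add(
    result: UnsafePointer[Scalar[dtype]],
    lhs: UnsafePointer[Scalar[dtype]],
    rhs: UnsafePointer[Scalar[dtype]],
):
    # One thread per element, in the usual grid/block pattern.
    var tid = block_dim.x * block_idx.x + thread_idx.x
    if tid < SIZE:
        result[tid] = lhs[tid] + rhs[tid]

def main():
    # On an Apple Silicon Mac, this resolves to the Metal-backed device.
    var ctx = DeviceContext()

    var out_buf = ctx.enqueue_create_buffer[dtype](SIZE)
    var lhs_buf = ctx.enqueue_create_buffer[dtype](SIZE).enqueue_fill(1.25)
    var rhs_buf = ctx.enqueue_create_buffer[dtype](SIZE).enqueue_fill(2.5)

    ctx.enqueue_function[vector_add](
        out_buf.unsafe_ptr(),
        lhs_buf.unsafe_ptr(),
        rhs_buf.unsafe_ptr(),
        grid_dim=SIZE // BLOCK,
        block_dim=BLOCK,
    )
    ctx.synchronize()

    # Map the result back to the host to inspect it;
    # each element should be 3.75 (1.25 + 2.5).
    with out_buf.map_to_host() as host:
        print(host[0], host[SIZE - 1])
```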
Current capabilities
This is just the beginning of our support for Apple Silicon GPUs, and many pieces of functionality still need to be built out. Known features that don’t work today include:
Intrinsics for many hardware capabilities
Not all Mojo GPU examples work, such as reduction.mojo and the more complex matrix multiplication examples
GPU puzzles 16 and above, which need more advanced hardware features
Basic MAX graphs
MAX custom ops
PyTorch interoperability
Running AI models
Serving AI models
I’ll emphasize that even simple MAX graphs, and by extension AI models, don’t yet run on Apple Silicon GPUs. In our Python APIs, accelerator_count() will still return 0 until we have basic MAX graph support enabled. Hopefully, that won’t be long.
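If you want to probe this from the Mojo side instead, the standard library's sys.has_accelerator() is the rough analogue of the Python accelerator_count() check. This is a minimal sketch; whether it reports the Apple GPU on a given nightly is worth verifying yourself, since support is still being built out.

```mojo
from sys import has_accelerator

def main():
    # has_accelerator() is a compile-time query, so it pairs
    # with @parameter if rather than a runtime branch.
    @parameter
    if has_accelerator():
        print("Mojo sees a GPU on this machine.")
    else:
        print("No GPU detected at compile time.")
```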
Next steps
We’ve identified many of the technical blockers to progressively enable the above. The current list of what we plan to work on includes:
Handle MAX_THREADS_PER_BLOCK_METADATA and similar aliases
Support GridDim, lane_id
Enable async_copy_*
Convert arguments of an array type to a pointer type
Support bfloat16 on ARM devices
Support SubBuffer
Enable atomic operations
Complete implementation of MetalDeviceContext::synchronize
Enable captured arguments
Support print and debug_assert
I apologize for some of the cryptic error messages you may get when you hit a piece of missing functionality, or a system configuration we aren't yet compatible with. We hope to improve this messaging over time, and to provide better guides for debugging failures.
How this works
To learn more about how Mojo code is compiled to target Apple Silicon GPUs, check out Amir Nassereldine’s detailed technical presentation from our recent Modular Community Meeting. He did amazing work in establishing the fundamentals during his summer internship, and we are now building on that to advance Mojo on this new architecture.
In brief, compiling and running Mojo code on an Apple Silicon GPU is a multi-step process. First, we compile Mojo GPU functions to Apple Intermediate Representation (AIR) bitcode, by lowering to LLVM IR and then converting that to Metal-compatible AIR.
Mojo handles interactions with an accelerator through the DeviceContext type. In the case of Apple Silicon GPUs, we’ve specialized this into a MetalDeviceContext that handles the next stages in compilation and execution.
The MetalDeviceContext uses the Metal-cpp API to compile the AIR representation into a .metallib for execution on device. Once the .metallib is ready, the MetalDeviceContext manages a Metal CommandQueue and buffers operations for moving data, running a GPU function, and more. All of this happens behind the scenes, so a Mojo developer doesn't need to worry about any of it.
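You can see where that compilation step sits by separating compilation from launch yourself. The sketch below assumes the compile_function / enqueue_function split in the current nightly DeviceContext API; on Apple Silicon, the compile step is where the AIR bitcode becomes a .metallib under the hood.

```mojo
from gpu import thread_idx
from gpu.host import DeviceContext

alias N = 32

fn write_ids(data: UnsafePointer[Float32]):
    # Each thread records its own index.
    data[thread_idx.x] = Float32(thread_idx.x)

def main():
    var ctx = DeviceContext()
    var buf = ctx.enqueue_create_buffer[DType.float32](N)

    # Compile once up front -- on Apple Silicon, this is the point where
    # the MetalDeviceContext turns AIR bitcode into a .metallib.
    var func = ctx.compile_function[write_ids]()

    # The compiled function can then be enqueued repeatedly
    # without paying the compilation cost again.
    ctx.enqueue_function(func, buf.unsafe_ptr(), grid_dim=1, block_dim=N)
    ctx.synchronize()
```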
Code that you’ve written to run on NVIDIA or AMD GPUs should mostly just work on an Apple Silicon GPU, assuming it uses no device-specific features. Of course, different patterns will be required to get the most performance out of each GPU, and we’re excited to explore this new optimization space on Apple Silicon GPUs with you.
Just the beginning
While we’d love help in bringing up Apple Silicon GPU support, some of the infrastructure for introducing support for new AIR intrinsics and compiling them to a .metallib currently requires Modular developers for implementation. We’ll get more of the basics in place before work moves primarily to the open-source standard library and kernels, at which point community members will be able to do a lot more to advance compatibility. Contributions are always welcome, but we don’t want you to hit missing non-public components and get frustrated by being unable to move forward.
We’ll share much more documentation and content on how to work with and optimize for this new hardware family, but we’re extremely excited about even these first few steps onto Apple Silicon GPUs. I’ll try to keep this post up to date as we expand functionality.