# uzu

A high-performance inference engine for AI models on Apple Silicon. Key features:

- Simple, high-level API
- Hybrid architecture, where layers can be computed as GPU kernels or via MPSGraph (a low-level API beneath CoreML with ANE access)
- Unified model configurations, making it easy to add support for new models
- Traceable computations to ensure correctness against the source-of-truth implementation
- Utilizes unified memory on Apple devices
## Quick Start

First, add the `uzu` dependency to your `Cargo.toml`:

```toml
[dependencies]
uzu = { git = "https://github.com/trymirai/uzu", branch = "main", package = "uzu" }
```
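If you prefer reproducible builds over tracking `main`, Cargo's standard git-dependency fields let you pin to a specific revision instead (the hash below is a placeholder):

```toml
[dependencies]
# Pin to a fixed commit rather than the moving `main` branch
# (replace the placeholder with a real commit hash).
uzu = { git = "https://github.com/trymirai/uzu", rev = "<COMMIT_SHA>", package = "uzu" }
```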
Then, create an inference `Session` with a specific model and configuration:

```rust
use std::path::PathBuf;

use uzu::{
    backends::metal::sampling_config::SamplingConfig,
    session::{
        session::Session, session_config::SessionConfig,
        session_input::SessionInput, session_output::SessionOutput,
        session_run_config::SessionRunConfig,
    },
};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let model_path = PathBuf::from("MODEL_PATH");
    let mut session = Session::new(model_path.clone())?;
    session.load_with_session_config(SessionConfig::default())?;

    let input = SessionInput::Text("Tell about London".to_string());
    let tokens_limit = 128;
    let run_config = SessionRunConfig::new_with_sampling_config(
        tokens_limit,
        Some(SamplingConfig::default()),
    );

    let output = session.run(input, run_config, Some(|_: SessionOutput| true));
    println!("{}", output.text);

    Ok(())
}
```
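The last argument to `session.run` is an optional progress callback invoked as generation proceeds. A minimal streaming sketch, assuming the callback receives the accumulated `SessionOutput` and that returning `false` cancels the run (names mirror the snippet above):

```rust
// Hypothetical streaming variant of the call above: print the text
// produced so far on each callback invocation and keep generating.
let output = session.run(
    input,
    run_config,
    Some(|partial: SessionOutput| {
        println!("{}", partial.text); // text generated so far (assumed)
        true // returning `false` would stop generation early (assumed)
    }),
);
```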
## Overview
For a detailed explanation of the architecture, please refer to the documentation.
uzu uses its own model format. To export a specific model, use `lalamo`. First, get the list of supported models:

```bash
uv run lalamo list-models
```
Then, export the one you need:

```bash
uv run lalamo convert meta-llama/Llama-3.2-1B-Instruct --precision float16
```
Alternatively, you can download a pre-exported model using the sample script:

```bash
./scripts/download_test_model.sh $MODEL_PATH
```
## CLI

You can run uzu in CLI mode:

```bash
cargo run --release -p cli -- help
```

```
Usage: uzu_cli [COMMAND]

Commands:
  run    Run a model with the specified path
  serve  Start a server with the specified model path
  help   Print this message or the help of the given subcommand(s)
```
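For example, assuming the `run` and `serve` subcommands take the model path as a positional argument (as the help text above suggests):

```bash
# Run interactive generation with a local model (path argument assumed)
cargo run --release -p cli -- run $MODEL_PATH

# Start an inference server for the same model (path argument assumed)
cargo run --release -p cli -- serve $MODEL_PATH
```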
## Bindings

- uzu-swift: a prebuilt Swift framework, ready to use with SPM
## License
This project is licensed under the MIT License. See the LICENSE file for details.