Gemma Multimodal Fine-Tuner
Fine-tune Gemma on text, images, and audio on your Mac, on data that doesn't fit on your Mac.
Image + text LoRA: captioning and VQA on local CSV.
Audio + text LoRA: the only Apple-Silicon-native path that does this.
Text-only LoRA: instruction or completion on CSV.
Stream from GCS / BigQuery: train on terabytes without filling your SSD.
Runs on Apple Silicon: MPS-native, no NVIDIA box required.
Source: github.com/mattmireles/gemma-tuner-multimodal (public).
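The streaming feature rests on a general idea: read training rows lazily and keep only one batch in memory at a time, so the dataset never has to fit on local disk. The sketch below illustrates that idea with the Python standard library only. It is not the project's actual loader; `stream_batches` and the `prompt`/`completion` column names are hypothetical, and an in-memory `StringIO` stands in for a remote object opened as a text stream.

```python
import csv
import io
from itertools import islice
from typing import Iterator, TextIO


def stream_batches(source: TextIO, batch_size: int) -> Iterator[list[dict[str, str]]]:
    """Yield lists of CSV rows, holding at most `batch_size` rows in memory."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    reader = csv.DictReader(source)  # pulls lines from the stream on demand
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a remote file: any file-like text stream works here.
# The column names are illustrative assumptions, not the project's schema.
demo = io.StringIO("prompt,completion\nhi,hello\nbye,goodbye\nthanks,welcome\n")
batches = list(stream_batches(demo, batch_size=2))
print([len(b) for b in batches])  # prints [2, 1]
```

In a real training loop the batches would be consumed one at a time rather than collected into a list, which is what keeps memory and disk use bounded.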
LoRA for Gemma 4 & 3n: why not just use…?
| | This | MLX-LM | Unsloth | axolotl |
|---|---|---|---|---|
| Fine-tune Gemma (text-only CSV) | ✅ | ? | ? | ? |
| Fine-tune Gemma image + text (caption / VQA CSV) | ✅ | ⚠️ varies | ⚠️ varies | ⚠️ varies |
| Fine-tune Gemma audio + text | ✅ | ? | ? | ⚠️ CUDA only |
| Runs on Apple Silicon (MPS) | ✅ | ? | ? | ? |
| Stream training data from cloud | ✅ | ? | ? | ⚠️ partial |
| No NVIDIA GPU required | ✅ | ? | ? | ? |