Tech News

Show HN: Gemma 4 Multimodal Fine-Tuner for Apple Silicon

Why This Matters

Gemma Multimodal Fine-Tuner is an Apple Silicon-native toolkit for fine-tuning Gemma models on text, images, and audio directly on a Mac, including on datasets streamed from cloud storage that are too large for local disk. Developers and researchers can run LoRA fine-tuning without an NVIDIA GPU and without first downloading the full dataset, which puts multimodal fine-tuning within reach of consumer hardware.

Key Takeaways

Gemma Multimodal Fine-Tuner

Fine-tune Gemma on text, images, and audio on your Mac, using data that doesn't fit on your Mac.

🖼️ Image + text LoRA: captioning and VQA on local CSV.

🎙️ Audio + text LoRA: the only Apple-Silicon-native path that does this.

📝 Text-only LoRA: instruction or completion on CSV.

☁️ Stream from GCS / BigQuery: train on terabytes without filling your SSD.

🍎 Runs on Apple Silicon: MPS-native, no NVIDIA box required.
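The "train on terabytes without filling your SSD" idea comes down to reading rows lazily and only ever materializing one batch at a time. The sketch below illustrates that pattern with the standard library only. It is not the project's implementation: the `stream_batches` helper and the `prompt`/`completion` column names are hypothetical, and an in-memory `io.StringIO` stands in for a file-like handle that a cloud storage client would normally provide.

```python
import csv
import io
from itertools import islice
from typing import Dict, Iterator, List, TextIO


def stream_batches(source: TextIO, batch_size: int) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of CSV rows, materializing at most `batch_size` rows at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    reader = csv.DictReader(source)  # pulls rows from `source` lazily
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a remote object. In practice `source` would be a file-like
# handle from a storage client (an assumption; not shown here).
fake_remote = io.StringIO(
    "prompt,completion\n"
    "hello,world\n"
    "foo,bar\n"
    "ping,pong\n"
)

batches = list(stream_batches(fake_remote, batch_size=2))
batch_sizes = [len(b) for b in batches]
print(batch_sizes)  # [2, 1]
```

Because the generator never holds more than one batch, peak memory depends on `batch_size` rather than on the size of the dataset; a training loop would consume each batch and discard it before requesting the next.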

Source: github.com/mattmireles/gemma-tuner-multimodal (public).

LoRA for Gemma 4 & 3n: why not just use…?

| Capability | This | MLX-LM | Unsloth | axolotl |
| --- | --- | --- | --- | --- |
| Fine-tune Gemma (text-only CSV) | ✅ | ✅ | ✅ | ✅ |
| Fine-tune Gemma image + text (caption / VQA CSV) | ✅ | ⚠️ varies | ⚠️ varies | ⚠️ varies |
| Fine-tune Gemma audio + text | ✅ | ❌ | ❌ | ⚠️ CUDA only |
| Runs on Apple Silicon (MPS) | ✅ | ✅ | ❌ | ❌ |
| Stream training data from cloud | ✅ | ❌ | ❌ | ⚠️ partial |
| No NVIDIA GPU required | ✅ | ✅ | ❌ | ❌ |
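For readers wondering why every tool in this comparison centers on LoRA: LoRA freezes a weight matrix of shape d_out × d_in and trains only two low-rank factors of shapes d_out × r and r × d_in, so the trainable parameter count per adapted layer drops from d_out · d_in to r · (d_out + d_in). The arithmetic can be sketched as follows. The layer shape and rank are hypothetical values chosen for illustration, not Gemma's actual dimensions or this project's defaults.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two low-rank factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the full (frozen) weight matrix."""
    return d_out * d_in


# Hypothetical layer shape and rank, for illustration only.
d_out, d_in, rank = 4096, 4096, 16
lora = lora_trainable_params(d_out, d_in, rank)
full = full_params(d_out, d_in)
print(lora, full, f"{lora / full:.2%}")  # 131072 16777216 0.78%
```

With these example numbers, under 1% of the layer's parameters receive gradients and optimizer state, which is the main reason this style of fine-tuning is feasible in the unified memory of a consumer machine.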
