Google’s latest on-device AI model is custom-made for your laptop

TL;DR Google has released the Gemma 4 12B model aimed at consumer laptops with at least 16GB RAM.

Gemma 4 12B is the company’s first mid-sized model to support native audio input.

It utilizes an encoder-free architecture to offer multimodal performance without the latency introduced by encoders. The new model performs close to the Gemma 4 26B MoE model in benchmarks.

Back in April, Google released its mobile-friendly Gemma E2B and E4B models, bringing on-device multimodal AI to Android and iOS devices. It also released the 26B Mixture of Experts (MoE) and 31B Dense models for higher-end devices with dedicated AI GPUs. Now, the company is launching another Gemma model that sits between those two tiers.

Google today announced the Gemma 4 12B model aimed at bringing on-device AI capabilities to laptops. It offers multimodal features and is the first mid-sized model from Google to support native audio input.

The company claims that its 12B model delivers performance similar to the 26B MoE model in benchmarks, while being small enough to run on normal consumer laptops with 16GB of RAM.

To achieve this, the company had to support multimodal inputs without increasing latency or memory usage. Gemma 4 12B uses an encoder-free architecture, which avoids the memory costs of the separate encoders that most multimodal AI models rely on.
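
For a rough sense of why 16GB of RAM is a meaningful threshold for a 12-billion-parameter model, here is a back-of-the-envelope sketch of weight storage alone. The precisions listed are illustrative assumptions, not formats Google has confirmed for Gemma 4 12B, and the estimate ignores the operating system, activations, and any other runtime overhead.

```python
# Rough weight-memory estimate for a 12B-parameter model.
# Assumptions (not from Google's announcement): only the weights are counted,
# and the precisions below are hypothetical storage options.
PARAMS = 12e9  # 12 billion parameters, taken from the model name
RAM_GB = 16    # the laptop RAM figure quoted in the article

BYTES_PER_PARAM = {"16-bit": 2.0, "8-bit": 1.0, "4-bit": 0.5}


def weight_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


estimates = {name: weight_gb(PARAMS, b) for name, b in BYTES_PER_PARAM.items()}

for name, gb in estimates.items():
    verdict = "under" if gb < RAM_GB else "over"
    print(f"{name} weights: about {gb:.0f} GB, {verdict} the {RAM_GB} GB mark")
```

Under these assumptions, 16-bit weights alone would exceed 16GB, while 8-bit or 4-bit weights would leave headroom for everything else the laptop is running.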
