
Google’s Android coding tests reveal an unexpected Gemini 3.5 Flash weakness

Why This Matters

Google's latest Android Bench results reveal that the Gemini 3.5 Flash AI model, despite its premium price, underperforms compared to older models and competitors like OpenAI's GPT 5.5. This unexpected performance discrepancy highlights challenges in AI model optimization and raises questions about value for money in AI development tools. For consumers and developers, this underscores the importance of evaluating AI models beyond branding and pricing, focusing on actual performance and efficiency.

Key Takeaways

Joe Maring / Android Authority

TL;DR Google’s Android Bench results show Gemini 3.5 Flash trailing older models despite its premium positioning.

Gemini 3.5 Flash missed the top five, while OpenAI’s GPT 5.5 claimed first place and Gemini 3.1 Pro Preview outperformed its successor.

Google’s newest Flash model scored 63.7 and became the most expensive option in the rankings, averaging $147.1 per run.

Google has just refreshed its Android Bench rankings, and the results present developers with a puzzling picture. Google’s new Gemini 3.5 Flash is actively falling behind its predecessor while charging you three times the price to use it.

The latest Android coding leaderboard, a benchmark that evaluates how well different AI models can perform Android development tasks, introduced Gemini 3.5 Flash for the first time, but the newcomer didn’t make it into the top five. Topping the list was OpenAI’s GPT 5.5, which scored 74, followed by GPT 5.4 and an older Google model, Gemini 3.1 Pro Preview, both with 72.4. The new Claude Opus models also outperformed the Flash variant.

Gemini 3.5 Flash scored 63.7, placing sixth overall. More surprising, though, was its inefficiency. The model averaged 355.9 total tokens, a big jump compared to other systems, according to Google’s benchmark data. That came to an average cost of $147.1, making it the most expensive model on the entire list despite scoring lower than a number of rivals.

For context, Google’s Flash branding has always been about speed and lower prices. At Google I/O 2026, the company announced Gemini 3.5 Flash as the most powerful Flash model it had ever built, claiming more robust coding capabilities and better support for AI agents and complex workflows. Google also said the model outperformed Gemini 3.1 Pro in a number of internal benchmarks and produced output up to four times faster than competing frontier models.
