Google launches Gemini 3 with new coding app and record benchmark scores

On Tuesday, Google released Gemini 3, its latest and most advanced foundation model, which is now immediately available through the Gemini app and AI search interface.

Coming just seven months after the Gemini 2.5 release, the new model is Google’s most capable LLM yet, and an immediate contender for the most capable AI tool on the market. The release also comes less than a week after OpenAI released GPT 5.1, and a mere two months after Anthropic released Sonnet 4.5 — a reminder of the blistering pace of frontier model development.

A more research-intensive version of the model, called Gemini 3 Deepthink, will also be made available to Google AI Ultra subscribers in the coming weeks, once it passes further rounds of safety testing.

“With Gemini 3, we’re seeing this massive jump in reasoning,” said Tulsee Doshi, Google’s head of product for the Gemini model. “It’s responding with a level of depth and nuance that we haven’t seen before.”

Some of that reasoning power is already registering on independent benchmarks. With a score of 37.4, the model marked the highest score on record on the Humanity’s Last Exam benchmark, meant to capture general reasoning and expertise. The previous high score, held by GPT-5 Pro, was 31.64. Gemini 3 also topped the leaderboard on LMArena, a human-led benchmark that measures user satisfaction.

According to Google, the Gemini app currently has more than 650 million monthly active users, and 13 million software developers have used the model as part of their workflow.

Alongside the base model, Google also released a Gemini-powered coding interface called Google Antigravity, allowing for multi-pane agentic coding similar to agentic IDEs like Warp or Cursor 2.0. Specifically, Antigravity combines a ChatGPT-style prompt window with a command-line interface and a browser window that can show the impact of the changes made by the coding agent.

Techcrunch event Join the Disrupt 2026 Waitlist Add yourself to the Disrupt 2026 waitlist to be first in line when Early Bird tickets drop. Past Disrupts have brought Google Cloud, Netflix, Microsoft, Box, Phia, a16z, ElevenLabs, Wayve, Hugging Face, Elad Gil, and Vinod Khosla to the stages — part of 250+ industry leaders driving 200+ sessions built to fuel your growth and sharpen your edge. Plus, meet the hundreds of startups innovating across every sector. Join the Disrupt 2026 Waitlist Add yourself to the Disrupt 2026 waitlist to be first in line when Early Bird tickets drop. Past Disrupts have brought Google Cloud, Netflix, Microsoft, Box, Phia, a16z, ElevenLabs, Wayve, Hugging Face, Elad Gil, and Vinod Khosla to the stages — part of 250+ industry leaders driving 200+ sessions built to fuel your growth and sharpen your edge. Plus, meet the hundreds of startups innovating across every sector. San Francisco | WAITLIST NOW

“The agent can work with your editor, across your terminal, across your browser to make sure that it helps you build that application in the best way possible,” said DeepMind CTO Koray Kavukcuoglu.