Google rolls out Gemini Deep Think AI, a reasoning model that tests multiple ideas in parallel

Google DeepMind is rolling out Gemini 2.5 Deep Think, which, the company says, is its most advanced AI reasoning model, able to answer questions by exploring and considering multiple ideas simultaneously and then using those outputs to choose the best answer.

Subscribers to Google’s $250-per-month Ultra subscription will gain access to Gemini 2.5 Deep Think in the Gemini app starting Friday.

First unveiled in May at Google I/O 2025, Gemini 2.5 Deep Think is Google’s first publicly available multi-agent model. These systems spawn multiple AI agents to tackle a question in parallel, a process that uses significantly more computational resources than a single agent, but tends to result in better answers.

Google used a variation of Gemini 2.5 Deep Think to score a gold medal at this year’s International Math Olympiad (IMO).

Alongside Gemini 2.5 Deep Think, the company says it is releasing the model it used at the IMO to a select group of mathematicians and academics. Google says this AI model “takes hours to reason,” instead of seconds or minutes like most consumer-facing AI models. The company hopes the IMO model will enhance research efforts, and aims to get feedback on how to improve the multi-agent system for academic use cases.

Google notes that the Gemini 2.5 Deep Think model is a significant improvement over what it announced at I/O. The company also claims to have developed “novel reinforcement learning techniques” to encourage Gemini 2.5 Deep Think to make better use of its reasoning paths.

“Deep Think can help people tackle problems that require creativity, strategic planning and making improvements step-by-step,” said Google in a blog post shared with TechCrunch.

The company says Gemini 2.5 Deep Think achieves state-of-the-art performance on Humanity’s Last Exam (HLE) — a challenging test measuring AI’s ability to answer thousands of crowdsourced questions across math, humanities, and science. Google claims its model scored 34.8% on HLE (without tools), compared to xAI’s Grok 4, which scored 25.4%, and OpenAI’s o3, which scored 20.3%.

Google also says Gemini 2.5 Deep Think outperforms AI models from OpenAI, xAI, and Anthropic on LiveCodeBench 6, a challenging test of competitive coding tasks. Google’s model scored 87.6%, whereas Grok 4 scored 79%, and OpenAI’s o3 scored 72%.
