
OpenAI wins gold at prestigious math competition - why that matters more than you think


OpenAI has achieved a new milestone in the race to build AI models that can reason their way through complex math problems.

On Saturday, the company announced that one of its models achieved gold medal-level performance at the International Mathematical Olympiad (IMO), widely regarded as the most prestigious and difficult math competition in the world.

Critically, the winning model wasn't designed specifically to solve IMO problems. Earlier systems like DeepMind's AlphaGo, which famously beat the world's leading Go player in 2016, were trained on massive datasets within very narrow, task-specific domains. The winner, by contrast, was a general-purpose reasoning model designed to think through problems methodically using natural language.


"This is an LLM doing math and not a specific formal math system," OpenAI wrote in its X post. "It's part of our main push towards general intelligence."

(Disclosure: Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Little is known so far about the identity of the model that was used. Alexander Wei, the OpenAI researcher who led the IMO effort, called it "an experimental reasoning LLM" in an X post that included an illustration of a strawberry wreathed in a gold medal, suggesting it's built atop the company's o1 family of reasoning models, which debuted in September 2024.

"To be clear: We're releasing GPT-5 soon, but the model we used at IMO is a separate experimental model," OpenAI added on X. "It uses new research techniques that will show up in future models -- but we don't plan to release a model with this level of capability for many months."

Google DeepMind announced Monday that "an advanced version of Gemini Deep Think," a reasoning mode for Gemini 2.5 Pro that debuted in May, also achieved gold medal-level performance at the 2025 IMO, earning the same score OpenAI reported. Thang Luong, the Google DeepMind researcher who led that company's IMO effort, contested OpenAI's results on Monday, claiming that according to the IMO's internal marking guide, OpenAI's model falls just short of a gold medal.
