An OpenAI exec bragged that ChatGPT had discovered a mathematical breakthrough — and was immediately left with egg on his face when it turned out to be bogus, The Decoder reports.
We begin where any good controversy begins: with a now-deleted tweet. This one was by Kevin Weil, a vice president at OpenAI who last week proclaimed that the company’s newest large language model, GPT-5, had “found solutions to 10 (!) previously unsolved Erdős problems and made progress on 11 others.”
The Erdős problems refer to a number of tricky mathematical conjectures made by the Hungarian mathematician of the same name. Had the claims been true, this would have been a notable achievement for an AI, and an impressive demonstration of ChatGPT's purported "PhD level" intelligence.
But the mathematician who literally runs the website erdosproblems.com, a researcher at the University of Manchester named Thomas Bloom, tweeted that this was “a dramatic misrepresentation,” because the AI had only found existing work that already solved the problems.
In fact, the problems weren’t even “unsolved” in the first place. They were listed as “open” on the Erdős problems website, which Bloom stressed only meant that he personally was unaware of a paper with a solution, not that no solution exists.
“No need to describe it as something it’s not!” Bloom wrote.
OpenAI’s competitors blasted Weil for the blunder, per TechCrunch. Google DeepMind CEO Demis Hassabis called it “embarrassing.” Hyperbolic Labs cofounder Yuchen Jin said that the episode calls for “better peer review for these ‘AI discovers science/math’ claims.” And the notoriously blunt Yann LeCun, Meta’s chief AI scientist, summed it up with a brutal joke: “Hoisted by their own GPTards.”
It’s another example of the careless boosterism of OpenAI and the tech industry at large. To hype the launch of GPT-5, OpenAI repeated the claim that the AI had achieved “PhD-level intelligence,” even though it’s still often incapable of giving a straight answer to basic questions. CEO Sam Altman warns-slash-hypes that the tech is getting good enough to replace entire categories of jobs, while also declaring that the company is on the verge of building an artificial general intelligence, or AGI, that roundly outperforms humans in every domain.
Science and mathematics in particular are one arena where AI companies are trying to hijack some credibility. Maybe AI isn't perfect yet, but who can argue with the endeavor if widely available chatbots are pushing the boundaries of empirical pursuits? Elon Musk, for instance, claims that his "maximum truth-seeking" chatbot Grok will discover "new technologies" and "new physics," if not the "true nature of the universe."
While some scientists have found uses for generative AI, it's largely as a research tool, dredging up scientific literature it devoured in its training that might otherwise get buried in the results of a search engine. But any claim from someone in the industry that chatbots are coming up with breakthroughs on their own should be taken with a grain of salt. In this case, all ChatGPT seems to have done is copy someone else's homework.
More on OpenAI: New Research Finds GPT-5 Is Actually Worse Than GPT-4o