OpenAI's most capable models hallucinate more than earlier ones
Published on: 2025-08-21 01:15:44
OpenAI says its latest models, o3 and o4-mini, are its most powerful yet. However, the company's own testing shows the models also hallucinate more, at least twice as often as earlier models.
In the system card, a report that accompanies each new AI model and was published with last week's release, OpenAI reported that o4-mini is less accurate and hallucinates more than both o1 and o3. Using PersonQA, an internal test based on publicly available information, the company found that o4-mini hallucinated in 48% of responses, three times o1's rate.
Because o4-mini is smaller, cheaper, and faster than o3, it wasn't expected to outperform it. Even so, o3 hallucinated in 33% of responses, twice o1's rate. Of the three models, o3 scored best on accuracy.
"o3 tends to make more claims overall, leading to mo
... Read full article.