AI Assistants Get News Wrong 45% of the Time, Study Finds

AI is terrible with news, and there’s data to back that up, researchers say.

That’s according to new research by the European Broadcasting Union (EBU), which found that AI assistants “routinely misrepresent news content no matter which language, territory, or AI platform is tested.”

The EBU brought together 22 public service media organizations across 18 countries and 14 languages to evaluate 3,000 news-related responses from some of the most frequently used AI chatbots. OpenAI’s ChatGPT, Microsoft Copilot, Google Gemini, and Perplexity were all evaluated against key criteria like accuracy, sourcing, distinguishing opinion from fact, and providing context.

The researchers found that 45% of all answers included at least one significant issue, and 81% featured a minor problem. Sourcing was the single biggest cause of these significant issues. Of all the responses, 31% showed serious sourcing problems like missing, misleading, or incorrect attributions.

Accuracy was a very close second: 30% of responses had major accuracy issues, such as hallucinated details or outdated information. In one instance, ChatGPT claimed that the current Pope was Pope Francis, who had died a month earlier and had already been succeeded by Pope Leo XIV. In another, when Copilot was asked whether the user should worry about bird flu, it said that a vaccine trial was underway in Oxford; the source for that claim, however, was a 2006 BBC article.

Gemini performed worst of the models tested. The researchers found issues in 76% of its responses, more than double the rate of any other model. Copilot was next at 37%, followed by ChatGPT at 36% and Perplexity at 30%.

The research found that assistants particularly struggled with fast-moving stories and rapidly changing information, stories with intricate timelines and detailed information, or topics that require a clear distinction between facts and opinions. For example, almost half of the models tested had significant issues when responding to the question “Is Trump starting a trade war?”

“This research conclusively shows that these failings are not isolated incidents,” EBU Media Director and Deputy Director General Jean Philip De Tender said in a press release on Wednesday. “They are systemic, cross-border, and multilingual, and we believe this endangers public trust. When people don’t know what to trust, they end up trusting nothing at all, and that can deter democratic participation.”

Yet AI is everywhere. AI assistants are quickly becoming a primary source of information for everyday users and are gunning for the throne long held by search engines.

Content creators who mastered search engine optimization now have to learn generative engine optimization.
