Scientific papers rely on readers trusting their information. That's why it's disturbing that a new study by researchers connected with Cornell and UCLA found 146,900 AI-generated fake citations in scientific papers hosted across four major research databases.
A key limitation of large language models such as Gemini and ChatGPT is their tendency to produce plausible-sounding but incorrect information, a phenomenon known as hallucination. If a researcher relies on a chatbot to draft citations without verifying them, the model may generate references that are entirely fabricated.
While scientific papers are often hidden from the public eye, the research they report has a profound impact on our lives. Everything from the internet to lithium-ion batteries began as a research paper.
But papers that cite AI hallucinations can erode faith in the quality of the research.
Sloppy science
The research team analyzed 111 million references from 2.5 million scientific papers. They looked for citations with titles that the team could not match to any publication. While some of these instances were just spelling errors, the team also found hallucinations.
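The article does not say how the team matched titles, so what follows is only a minimal sketch of the general idea: separating exact matches, probable spelling errors, and titles that match nothing. The function names, the labels, the 0.9 similarity threshold, and the use of Python's standard difflib module are all assumptions made for illustration, not the study's method.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase the title, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())


def classify_citation(title: str, known_titles: list[str],
                      typo_threshold: float = 0.9) -> str:
    """Label a cited title as 'matched', 'likely_typo', or 'unmatched'.

    known_titles stands in for a bibliographic index (hypothetical here).
    typo_threshold is an assumed cutoff, not a value from the study.
    """
    target = normalize(title)
    best_similarity = 0.0
    for known in known_titles:
        candidate = normalize(known)
        if candidate == target:
            return "matched"
        similarity = SequenceMatcher(None, target, candidate).ratio()
        best_similarity = max(best_similarity, similarity)
    # A near-identical title is more plausibly a spelling error than a fabrication.
    return "likely_typo" if best_similarity >= typo_threshold else "unmatched"


if __name__ == "__main__":
    index = [
        "Attention Is All You Need",
        "Deep Residual Learning for Image Recognition",
    ]
    for cited in [
        "Attention is all you need.",
        "Attenton Is All You Need",
        "Quantum Gravity Effects in Sourdough Fermentation",
    ]:
        print(cited, "->", classify_citation(cited, index))
```

A real pipeline would compare against a full bibliographic database rather than a short list, and an unmatched title is only a candidate for fabrication, since it could also point to a publication the index does not cover.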
Unscrupulous researchers were faking citations long before the rise of chatbots, so the team also examined the rate of unmatched citations in research published before 2023, when chatbots were not yet ubiquitous, to give a baseline for comparison.
"We find a sharp rise in non-existent references following widespread LLM adoption," the authors write in the paper.
The team also found that the bad citations were spread across many papers rather than concentrated in just a few. That suggests the problem is widespread, with many researchers relying on AI-generated references without fully verifying them.
Warning signs