Earlier this year, computer scientist Guillaume Cabanac received a notification from Google Scholar that one of his publications had been cited in a paper published in the International Dental Journal1. That was unexpected, because his research on spotting fabricated papers doesn’t typically intersect with dentistry. “I was very surprised to see that I couldn’t recognize my own reference,” says Cabanac, who is based at the University of Toulouse in France.
The title in the citation resembled that of a preprint2 he had posted in 2021 and never published formally, but the journal was listed as Nature and the DOI — the unique identifier assigned by publishers and preprint repositories — did not lead to the original preprint. “I got very concerned,” adds Cabanac, who immediately suspected that the citation had been hallucinated by artificial intelligence.
This is just one example of a rapidly growing problem. Surveys and related studies have shown that researchers are increasingly using large language models (LLMs) to help to conduct literature searches, write manuscripts and format bibliographies. And sometimes, these models generate non-existent academic references.
Over the past year, efforts have begun turning up such hallucinated citations in the literature. One analysis of nearly 18,000 papers accepted by three computer-science conferences found a sharp increase in references that cannot be traced to actual scholarly publications3. The results, reported in January, indicated that 2.6% of papers in 2025 had at least one potentially hallucinated citation — up from about 0.3% in 2024. Another analysis, released in February, estimated that 2–6% of papers at four other 2025 computer-science conferences included references with rephrased titles, or citations of publications that the authors couldn’t verify by searching through databases and journal archives4.
And although the scale of the problem remains uncertain, it’s clear that not only conferences are affected. An exclusive analysis conducted by Nature’s news team, in collaboration with Grounded AI, a company based in Stevenage, UK, suggests that at least tens of thousands of 2025 publications, including journal papers and books, as well as conference proceedings, probably contain invalid references generated by AI.
Grounded AI is among the companies offering publishers tools for screening submissions for problematic references. Several publishers told Nature reporters that they have been exploring such tools or developing in-house versions.
But some researchers are concerned that the problem will soon get out of hand. “We’re going to see a flood of fake references,” says Alison Johnston, a political scientist at Oregon State University in Corvallis.
Another issue is deciding what to do about hallucinated citations that make it into the published literature. That’s a problem that academic publishers are wrestling with right now.
Sources of error