Researchers who use hallucinated references to face arXiv ban

The preprint server arXiv’s policy on the use of generative AI by authors has drawn a slew of positive and negative comments from the community. Credit: Thomas Fuller/SOPA Images/LightRocket/Getty

The physical-sciences repository arXiv is banning researchers from posting their manuscripts on the platform for one year if a submission is found to contain references that have been hallucinated by artificial-intelligence tools. The ban also applies to authors who submit manuscripts containing other “incontrovertible” signs of generative AI usage that demonstrate the AI results haven’t been carefully checked.

What’s more, after a researcher’s one-year penalty is over, they will not be able to post any manuscripts to arXiv unless the work has already been accepted at a “reputable peer-reviewed venue”, according to Thomas Dietterich, a computer scientist at Oregon State University in Corvallis and chair of arXiv’s computer science section.

ArXiv’s new policy, which has triggered a torrent of both positive and negative comments from researchers on social media, is one of the latest and most far-reaching examples of how preprint servers are grappling with the rising tide of AI ‘slop’ — low-quality or meaningless content made using generative AI. Some, such as arXiv, are imposing bans on authors who do not follow their guidelines. Others have ruled out entire categories of submissions that raise concerns about generative AI use.

Scientists increasingly use large language models (LLMs) for a variety of legitimate tasks, such as literature reviews, but arXiv’s announcement drew approval from many researchers. “Great move and I fully support it! The only question I have is: why only AI hallucinations, folks? Let’s fight the slop in general”, Valeri Kremnev, co-founder of the AI startup sci2sci in Berlin, posted on social media.

But not everyone is convinced that such measures are the right approach. Natalie Khalil, the founder of Reviewer 3, a platform run from San Francisco, California, that uses AI to help researchers to conduct peer review, argues that arXiv is treating the symptom, not the root cause. “If a researcher is banned from arXiv, they will still do research, just elsewhere,” she notes.

In response, Dietterich says that various platforms need to work together to cull faulty references and other questionable output from LLMs. “The fact that an irresponsible researcher can publish irresponsible research elsewhere is not a justification for allowing them to post it on arXiv.”

Too much trust

In Dietterich’s announcement on social media, he wrote that arXiv “can’t trust anything” in a submission that contains strong evidence “that the authors did not check the results of LLM generation”. This includes hallucinated references and LLM comments such as “here is a 200-word summary; would you like me to make any changes?”
