Tech News

LLMs’ impact on science: Booming publications, stagnating quality


There have been a number of high-profile cases in which scientific papers had to be retracted because they were filled with AI-generated slop, the most recent coming just two weeks ago. These instances raise serious questions about the quality of peer review at some journals: how could anyone let a figure with terms like “runctitional,” “fexcectorn,” and “frymblal” through? But it has not been clear whether these high-profile examples are representative. How significantly has AI use been influencing the scientific literature?

A collaboration of researchers at Berkeley and Cornell has decided to take a look. They scanned three of the largest archives of pre-publication papers and identified submissions likely to have been produced using large language models. They found that, while researchers produce far more papers after they start using AI, and the quality of the language improves, the publication rate of those papers drops.

Searching the archives

The researchers began by obtaining the abstracts of everything placed in three major pre-publication archives between 2018 and mid-2024. The arXiv netted them 1.2 million documents; the Social Science Research Network provided another 675,000; and bioRxiv added a further 220,000. So this gave them both a lot of material to work with and coverage of many different fields of research. It also included documents submitted before large language models could plausibly produce output that would be deemed acceptable.

The researchers took the abstracts from the pre-ChatGPT period and trained a model to recognize the statistics of human-generated text. Those same abstracts were then fed into GPT-3.5, which rewrote them, and the process was repeated on the rewritten versions. The resulting model could then estimate whether a given abstract was more likely to have been produced by an AI or by an actual human.
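The article doesn't spell out the detector's internals, but the approach it describes — learn the word statistics of human abstracts, learn those of GPT-rewritten versions, and compare — can be sketched as a simple likelihood-ratio classifier. Everything below (the toy corpora, the word-level model) is a stand-in, not the study's actual implementation.

```python
from collections import Counter
import math

# Hypothetical miniature corpora standing in for the pre-ChatGPT abstracts
# and their GPT-3.5 rewrites; the real study used hundreds of thousands.
human_docs = [
    "we report measurements of the sample and discuss sources of error",
    "our survey data suggest a modest effect that warrants replication",
]
llm_docs = [
    "we delve into a comprehensive analysis showcasing significant insights",
    "this study underscores a pivotal and comprehensive framework",
]

def word_model(docs):
    """Word counts plus totals needed for add-one-smoothed probabilities."""
    counts = Counter(w for d in docs for w in d.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # +1 for unseen words
    return counts, total, vocab

def score(text, human, llm):
    """Log-likelihood ratio: positive means the text 'looks LLM-rewritten'."""
    h_counts, h_total, h_vocab = human
    l_counts, l_total, l_vocab = llm
    s = 0.0
    for w in text.split():
        p_h = (h_counts[w] + 1) / (h_total + h_vocab)
        p_l = (l_counts[w] + 1) / (l_total + l_vocab)
        s += math.log(p_l) - math.log(p_h)
    return s

human_model = word_model(human_docs)
llm_model = word_model(llm_docs)

# Positive score: the abstract's word choices lean toward the LLM corpus.
llm_like = score("a comprehensive analysis of significant insights",
                 human_model, llm_model)
```

In practice a detector like this would use far richer features than single-word frequencies, but the decision rule — which corpus makes the text more probable — is the same.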

The research team then used this detector to identify a key transition point: when a given author at one of these archives first started using an LLM to produce a submission. They then compared each researcher's prior productivity to what happened once they turned to AI. “LLM adoption is associated with a large increase in researchers’ scientific output in all three preprint repositories,” they conclude.
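The before/after comparison can be illustrated with a short sketch. The per-author records and the field names here are hypothetical; the study's actual flags would come from the detector described above.

```python
# Hypothetical per-author submission records: (year, flagged_as_llm) pairs.
submissions = {
    "author_a": [(2019, False), (2020, False), (2022, True),
                 (2023, True), (2023, True), (2024, True)],
    "author_b": [(2018, False), (2021, False), (2023, False)],
}

def adoption_year(records):
    """First year any of an author's submissions is flagged, else None."""
    flagged = [year for year, used_llm in records if used_llm]
    return min(flagged) if flagged else None

def papers_per_year(records, start, end):
    """Average submissions per year over the inclusive window [start, end]."""
    span = end - start + 1
    n = sum(1 for year, _ in records if start <= year <= end)
    return n / span

recs = submissions["author_a"]
t = adoption_year(recs)                        # first LLM-flagged year
before = papers_per_year(recs, 2018, t - 1)    # pre-adoption output rate
after = papers_per_year(recs, t, 2024)         # post-adoption output rate
```

For "author_a" this yields a higher post-adoption rate than pre-adoption rate, the pattern the study reports across all three repositories (the real analysis would of course control for field-wide trends rather than compare raw rates).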