
How to build an AI Scientist: first peer-reviewed paper spills the secrets

Why This Matters

The development of AI Scientist marks a significant milestone in automating the scientific discovery process, demonstrating that AI can generate, test, and publish research independently. This breakthrough has the potential to accelerate scientific advancements but also raises important questions about the role of human researchers and the integrity of scientific publishing. As AI continues to evolve, it will reshape how research is conducted, reviewed, and validated in the tech industry and beyond.


AI Scientist, an autonomous research AI tool, produced three research papers, one of which was accepted by peer reviewers at a machine-learning conference. Credit: Olesia Kononenko/iStock via Getty

In August 2024, a team of machine-learning researchers launched the first artificial-intelligence tool that aimed to fully automate the scientific process. AI Scientist, created by Sakana AI, a company based in Tokyo, can perform the full cycle of scientific discovery, from generating ideas to testing them to writing them up in a scientific paper. Almost two years on, many AI research assistants are available to researchers, and a few are designed to do autonomous research in the same way that AI Scientist does.

Now, AI Scientist is thought to be one of the first such tools to have its description go through the peer-review process at a leading academic journal. The paper, published today in Nature, updates a 2024 preprint that described the tool, including by toning down its reported capabilities. The paper also describes how the team submitted three original research papers generated by the tool to a leading machine-learning conference, one of which was accepted by peer reviewers. Still, Sakana AI co-founder David Ha says that the accepted paper is not at the level of the best human-written papers accepted by the same conference.

When the system was unveiled in 2024, “we were the first group to have demonstrated that you could have AI do the entire scientific process that underlies the creation of a paper”, says study co-author Jeff Clune, a computer scientist at the University of British Columbia in Vancouver, Canada.

“I think it’s a remarkable technological achievement,” says Jevin West, a computational social scientist at the University of Washington in Seattle, adding that it “really forces us to think about what science is”. However, automating science comes with several risks, West says, including journals and conferences being flooded with papers of modest originality.

Human review

AI Scientist is a collection of ‘agents’ built on top of existing large language models (LLMs), such as GPT-4o or Claude Sonnet 4. It prompts those LLMs to search the literature on a given topic, generate hypotheses and design a set of possible research directions. Next, AI Scientist writes code, executes it and measures its efficiency. Finally, it writes a paper describing the results. The authors of the paper also created an ‘automated reviewer’ to evaluate the quality of the tool’s output. The results “approach borderline acceptability for machine learning conference workshops”, the authors write.
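The agent pipeline described above can be sketched in outline. The following is a minimal, illustrative sketch only: every function name, and the `call_llm` stub standing in for a request to a model such as GPT-4o or Claude Sonnet 4, is hypothetical, since the article does not show Sakana AI's actual implementation.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to an LLM such as GPT-4o."""
    return f"[LLM response to: {prompt[:40]}...]"

def generate_ideas(topic: str, n: int = 3) -> list[str]:
    """Prompt the model to survey a topic and propose research directions."""
    return [call_llm(f"Propose research idea {i} on {topic}") for i in range(n)]

def run_experiment(idea: str) -> dict:
    """Ask the model for experiment code, then record a result.
    (Execution and the efficiency metric are stubbed here.)"""
    code = call_llm(f"Write experiment code for: {idea}")
    return {"idea": idea, "code": code}

def write_paper(result: dict) -> str:
    """Draft a paper describing the experimental results."""
    return call_llm(f"Write a paper describing: {result['idea']}")

def auto_review(paper: str) -> float:
    """Automated reviewer: score the draft on a 0-1 scale (stubbed)."""
    return 0.5  # "borderline acceptability", in the authors' words

def pipeline(topic: str) -> list[tuple[str, float]]:
    """Run the full loop: ideas -> experiments -> papers -> reviews."""
    papers = []
    for idea in generate_ideas(topic):
        result = run_experiment(idea)
        paper = write_paper(result)
        papers.append((paper, auto_review(paper)))
    return papers
```

The key design point the article highlights is that each stage is an LLM-driven agent, with the automated reviewer closing the loop so the system can judge its own output before a human ever sees it.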

In response to peer-review comments, the Sakana AI team toned down claims made in the original preprint that AI Scientist automated the “entire” scientific process. The Nature paper also describes how the tool took part in a controlled peer-review experiment at the International Conference on Learning Representations (ICLR) in April 2025. In agreement with the ICLR organizers, Ha and his collaborators submitted three AI-generated papers for review at one of the conference’s workshops.


“Reviewers were told ahead of time that some fraction of the papers might be AI-generated,” says Clune. One of the papers was accepted, in what AI Scientist’s creators say was the first time an autonomous system passed a ‘Turing test’ for an AI-generated paper, meaning that human reviewers could not distinguish it from papers written by human researchers.
