Artificial intelligence companies don't need permission from authors to train their large language models (LLMs) on legally acquired books, US District Judge William Alsup ruled Monday.
The first-of-its-kind ruling, which condones AI training as fair use, will likely be viewed as a big win for AI companies. But it also notably puts on notice any AI company expecting the same reasoning to apply to training on pirated copies of books, a question that remains unsettled.
In the specific case he is weighing, which pits book authors against Anthropic, Alsup found that "the purpose and character of using copyrighted works to train LLMs to generate new text was quintessentially transformative" and "necessary" to build world-class AI models.
Importantly, this case differs from other lawsuits in which authors allege that AI models risk copying and distributing their work. Because the authors suing Anthropic did not allege that any of Anthropic's outputs reproduced their works or expressive style, Alsup found there was no threat that Anthropic's text generator, Claude, might replace authors in their markets. The absence of that argument tipped the fair use analysis in Anthropic's favor.
"Like any reader aspiring to be a writer, Anthropic’s LLMs trained upon works not to race ahead and replicate or supplant them—but to turn a hard corner and create something different," Alsup wrote.
Alsup's ruling surely disappointed the authors, who argued that Claude's training on their texts could fuel competing summaries or alternative versions of their stories. The judge likened these complaints to arguing "that training schoolchildren to write well would result in an explosion of competing works."
"This is not the kind of competitive or creative displacement that concerns the Copyright Act," Alsup wrote. "The Act seeks to advance original works of authorship, not to protect authors against competition."