In a new legal battle in the AI space, Meta and CEO Mark Zuckerberg have been sued by five publishers and author Scott Turow, who allege the tech company illegally copied millions of books, articles and other works to train Meta’s artificial-intelligence systems.
“In their effort to win the AI ‘arms race’ and build a functional generative AI model, Defendants Meta and Zuckerberg followed their well-known motto: ‘move fast and break things,’” the plaintiffs say in their lawsuit. “They first illegally torrented millions of copyrighted books and journal articles from notorious pirate sites and downloaded unauthorized web scrapes of virtually the entire internet. They then copied those stolen fruits many times over to train Meta’s multibillion-dollar generative AI system called Llama. In doing so, Defendants engaged in one of the most massive infringements of copyrighted materials in history.”
The suit was filed Tuesday (May 5) in the U.S. District Court for the Southern District of New York by five publishers (Hachette, Macmillan, McGraw Hill, Elsevier and Cengage) and Turow individually. The proposed class-action suit seeks unspecified monetary damages for the alleged copyright infringement. A copy of the lawsuit is available at this link.
Asked for comment, a Meta spokesperson said, “AI is powering transformative innovations, productivity and creativity for individuals and companies, and courts have rightly found that training AI on copyrighted material can qualify as fair use. We will fight this lawsuit aggressively.”
Authors have sued AI companies for copyright infringement before — and lost.
For example, in June 2025, a federal judge rejected a claim brought by 13 authors, including Sarah Silverman and Junot Díaz, that Meta violated their copyrights by training its AI model on their books. Judge Vince Chhabria ruled that Meta had engaged in “fair use” when it used a data set of nearly 200,000 books to train its Llama language model for generative AI.
But the latest lawsuit alleges that Meta and Zuckerberg deliberately circumvented copyright-protection mechanisms — and had considered paying to license the works before abandoning that strategy at “Zuckerberg’s personal instruction.” The suit essentially argues that the conduct described falls outside protections afforded by fair-use provisions of the U.S. copyright code.
“Meta — at Zuckerberg’s direction — copied millions of books, journal articles, and other written works without authorization, including those owned or controlled by Plaintiffs and the Class, and then made additional copies of those works to train Llama,” the suit says. “Zuckerberg himself personally authorized and actively encouraged the infringement. Meta also stripped [copyright management information] from the copyrighted works it stole. It did this to conceal its training sources and facilitate their unauthorized use.”
According to the lawsuit, after the release of Llama 1, Meta briefly considered entering into licensing deals with major publishers. From January to April 2023, Meta discussed increasing the company’s “dataset licensing” budget to as much as $200 million, per the complaint.