Overview
In the past week, two significant rulings from the Northern District of California addressed the application of the fair use doctrine to the use of copyrighted books in training large language models (LLMs). Meta Platforms Inc. and Anthropic PBC each faced claims from authors alleging copyright infringement based on the use of their works in AI model development. In both cases, the courts found that the alleged use of copyrighted books for AI training constituted fair use under U.S. copyright law. These decisions mark a pivotal moment for the intersection of copyright law and AI, clarifying the boundaries of fair use in the context of transformative technological innovation. These rulings contrast with the Copyright Office’s preliminary report on the training of AI models on copyrighted materials, which expressed skepticism as to the applicability of fair use to such training.1
Key Findings From the Rulings
1. AI Training as a Transformative Use
Both courts emphasized the highly transformative nature of using copyrighted works to train LLMs. The original purpose of the books was fundamentally different from their use in AI training, where the works serve as raw material to develop models capable of generating new, original text and performing a wide range of functions. This transformative character weighed heavily in favor of fair use under the first statutory factor.
2. Market Harm and the Burden of Proof
While acknowledging the potential for AI-generated content to impact the market for original works, the court in the Meta ruling found that the plaintiffs failed to provide sufficient evidence of actual or likely market harm. The court noted that, especially in cases involving highly transformative uses, plaintiffs must make a strong showing of market effect to overcome a fair use defense. Arguments centered on lost licensing revenue and the risk of LLMs regurgitating original works were deemed speculative and unsupported by the record.
3. No Automatic Right to License for AI Training
In the Meta ruling, the court also clarified that copyright holders are not entitled to monopolize transformative uses, such as AI training, simply by asserting a potential market for licensing such uses. The existence of a hypothetical licensing market for AI training does not, by itself, defeat a fair use defense, particularly where the secondary use does not directly substitute for the original work.
4. Scope and Amount of Copying
Both decisions recognized that copying entire works was reasonable and necessary for the transformative purposes of LLM training. The courts accepted that effective AI models require large volumes of high-quality text and that the amount and substantiality of the use must be evaluated in light of the intended transformative function.
5. Distinction Between Lawful and Unlawful Acquisition
In the Anthropic case, the court held that amassing and retaining pirated digital copies for a central library was not a transformative use and did not qualify as fair use. The acquisition and indefinite retention of pirated works, even if some were later used for training, was found to directly displace the market for the original works and was not justified by the potential for future transformative use. The court added, however, that this initial piracy and retention, while infringing, did not preclude a finding of fair use with respect to the AI training itself.
6. Conversion of Print Books to Digital Format
The Anthropic ruling also addressed the conversion of lawfully purchased print books to digital format for internal, nondistributive use. The court found this practice to be fair use, analogizing it to recognized fair uses such as microfilming for space-saving or recording broadcasts for personal use, provided there was no external distribution or market harm.
7. Limited Precedential Effect and Future Implications
Both courts stressed that their rulings are limited to the specific facts and parties before them and do not establish a blanket precedent for all AI training or copyright cases. The decisions leave open the possibility that future cases, particularly those with more-developed records on market effects or involving less transformative uses (such as news publications), could yield different outcomes.
Conclusion
These rulings provide important guidance for AI developers and copyright holders navigating the evolving landscape of fair use and AI model training. The decisions affirm that, under current law, the use of copyrighted works to train LLMs can be protected as fair use when the use is transformative, does not result in infringing outputs, and does not displace the market for the original works. However, the Anthropic court made clear that fair use does not excuse the initial act of piracy or the creation of unauthorized central libraries of copyrighted material.
AI companies should remain mindful that the fair use defense may not be available in all circumstances, particularly where evidence of market harm is more substantial or where the use is less transformative. Copyright holders, meanwhile, bear the burden of demonstrating concrete market harm in order to prevail against fair use defenses in the context of AI training.
The Meta and Anthropic decisions mark a significant development in the application of the fair use doctrine to AI model training. While these rulings strengthen the position of AI developers, they also underscore the need for careful, fact-specific analysis in future cases. The evolving legal landscape will continue to shape the boundaries of fair use as AI technologies advance. It also remains unclear whether higher courts will agree with these findings of fair use on appeal.
Those with additional questions about the legal concepts addressed in this client alert may contact the authors.
1 See Lowenstein Sandler LLP, "US Copyright Office Releases Long-Awaited Report on Generative AI Training and Copyright Law," May 12, 2025.