
Alignment whack-a-mole: Finetuning activates recall of copyrighted books in LLMs

Why This Matters

This research highlights the risk that large language models memorize copyrighted material, such as verbatim excerpts from books, and that finetuning can surface it. Understanding this phenomenon is crucial for developing AI systems that respect intellectual property rights and avoid legal exposure. It also underscores the need for improved training techniques that mitigate unintended memorization.


Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models

The paper is now on arXiv; check out our demo!

This repository contains the data preprocessing pipeline, finetuning scripts, memorization evaluation code, and analysis scripts for our paper.

We provide partial example files in data/ containing a small subset of excerpts and generations from The Road by Cormac McCarthy. Full book content and model generations are not included because the books are copyrighted and the generations contain large portions of verbatim text.
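To give a sense of the kind of verbatim-recall check a memorization evaluation performs, here is a minimal sketch that scores the longest run of consecutive tokens shared between a model generation and a source excerpt. This is an illustrative assumption, not the repository's actual evaluation code; the file names under data/ and the helper name longest_verbatim_run are hypothetical:

# Minimal sketch (assumed, not the repo's actual code) of a verbatim-recall
# metric: the length of the longest contiguous token span that a model
# generation shares with a source excerpt.

def longest_verbatim_run(reference: str, generation: str) -> int:
    """Length, in tokens, of the longest common contiguous token span."""
    # Simple whitespace tokenization; the dependency list includes nltk,
    # which could be swapped in for finer-grained tokenization.
    ref = reference.lower().split()
    gen = generation.lower().split()
    best = 0
    # Dynamic program for longest common substring over token sequences,
    # keeping only the previous row to stay O(len(gen)) in memory.
    prev = [0] * (len(gen) + 1)
    for r in ref:
        cur = [0] * (len(gen) + 1)
        for j, g in enumerate(gen, start=1):
            if r == g:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

if __name__ == "__main__":
    # Hypothetical example paths; the actual file names in data/ may differ.
    with open("data/excerpt_example.txt") as f:
        reference = f.read()
    with open("data/generation_example.txt") as f:
        generation = f.read()
    print("longest shared token run:", longest_verbatim_run(reference, generation))

A long shared run (say, dozens of tokens) indicates verbatim reproduction rather than paraphrase, which is why this style of metric suits copyrighted-text memorization studies.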

Setup

We use uv for dependency management. Install uv if you haven't already:

curl -LsSf https://astral.sh/uv/install.sh | sh

Create a virtual environment and install all dependencies:

uv venv --python 3.11
source .venv/bin/activate
uv pip install html2text natsort ftfy openai tqdm nltk numpy

For Gemini finetuning and generation, also install:

... continue reading