Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models
The paper is now on arXiv; check out our demo!
This repository contains the data preprocessing pipeline, finetuning scripts, memorization evaluation code, and analysis scripts for our paper.
We provide partial example files in data/ containing a small subset of excerpts and generations from The Road by Cormac McCarthy. Full book content and model generations are not included because the books are copyrighted and the generations contain large portions of verbatim text.
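To illustrate the kind of memorization evaluation the repository performs, here is a minimal sketch of a verbatim-overlap check: the length, in words, of the longest run that a model generation copies verbatim from the reference book text. This is a hypothetical stand-in for illustration, not the paper's actual metric, and the function name is our own.

```python
def longest_verbatim_overlap(generation: str, reference: str) -> int:
    """Length (in words) of the longest run of words that `generation`
    copies verbatim from `reference`. Illustrative sketch only."""
    gen = generation.split()
    # Pad with spaces so matches respect word boundaries
    # (avoids "cat" matching inside "concatenate").
    ref_padded = " " + " ".join(reference.split()) + " "
    best = 0
    for i in range(len(gen)):
        # Only try to extend runs longer than the current best.
        j = i + best
        while j < len(gen) and " " + " ".join(gen[i:j + 1]) + " " in ref_padded:
            j += 1
            best = j - i
    return best
```

A run of, say, 50+ overlapping words between a generation and the source book would be strong evidence of verbatim recall rather than paraphrase.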
Setup
We use uv for dependency management. Install uv if you haven't already:
curl -LsSf https://astral.sh/uv/install.sh | sh
Create a virtual environment and install all dependencies:
uv venv --python 3.11
source .venv/bin/activate
uv pip install html2text natsort ftfy openai tqdm nltk numpy
For Gemini finetuning and generation, also install: