Subliminal learning: Models transmit behaviors via hidden signals in data
Alex Cloud*1, Minh Le*1, James Chua2, Jan Betley2, Anna Sztyber-Betley3, Jacob Hilton4, Samuel Marks5, Owain Evans2,6
July 22, 2025
*Equal contribution; author order chosen randomly
1Anthropic Fellows Program; 2Truthful AI; 3Warsaw University of Technology; 4Alignment Research Center; 5Anthropic; 6UC Berkeley
tl;dr
We study subliminal learning, a surprising phenomenon where lan