Summary
Update 2025-12-02:
Amanda Askell has kindly confirmed that the document was used in supervised learning and will share the full version and more details soon.
I would ask that the current extracted version not be taken entirely at face value: it is fuzzy and may not match the ground-truth version, and some parts may only make sense in context.
As far as I understand from what I uncovered, a document used for Claude's character training is compressed in Claude's weights. The full document can be found under the "Anthropic Guidelines" heading at the end. The Gist with code, chats and various documents (including the "soul document") can be found here:
Claude 4.5 Opus Soul Document
I apologize in advance that this is not exactly a regular LW post, but I thought an effort-post might fit best here.
A strange hallucination, or is it?
While extracting Claude 4.5 Opus' system message on its release date, as one does, I noticed an interesting peculiarity.
I'm used to models, starting with Claude 4, hallucinating sections at the beginning of their system message, but Claude 4.5 Opus in various cases included a supposed "soul_overview" section, which sounded rather specific: