Why do LLMs make stuff up? New research peers under the hood.
Published on: 2025-05-23 18:33:51
One of the most frustrating things about using a large language model is dealing with its tendency to confabulate information, hallucinating answers that are not supported by its training data. From a human perspective, it can be hard to understand why these models don't simply say "I don't know" instead of making up some plausible-sounding nonsense.
Now, new research from Anthropic is exposing at least some of the inner neural network "circuitry" that helps an LLM decide when to take a stab at a (perhaps hallucinated) response versus when to decline to answer in the first place. While human understanding of this internal LLM "decision" process is still rough, this kind of research could lead to better overall solutions for the AI confabulation problem.
When a “known entity” isn't
In a groundbreaking paper last May, Anthropic used a system of sparse autoencoders to help illuminate the groups of artificial neurons that are activated when the Claude LLM encounters internal concepts.
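For readers curious what a sparse autoencoder looks like in practice, here is a minimal PyTorch sketch, not Anthropic's actual implementation: it learns a wide, mostly-zero "feature" representation of a model's internal activations, and individual features can then be inspected as candidate concepts. The layer sizes, the `l1_coeff` penalty, and the helper names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Maps model activations into a wider, mostly-zero feature space and back,
    so individual features can be inspected as candidate 'concepts'.
    Dimensions here are placeholders, not the real model's sizes."""

    def __init__(self, d_model: int = 512, d_features: int = 4096):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, activations: torch.Tensor):
        # ReLU keeps feature values non-negative; the L1 term below pushes most of them to zero.
        features = torch.relu(self.encoder(activations))
        reconstruction = self.decoder(features)
        return features, reconstruction


def training_step(sae: SparseAutoencoder, activations: torch.Tensor, l1_coeff: float = 1e-3):
    """One optimization step: reconstruct the activations faithfully while
    penalizing how many features fire at once (sparsity)."""
    features, reconstruction = sae(activations)
    recon_loss = torch.mean((reconstruction - activations) ** 2)
    sparsity_loss = l1_coeff * features.abs().mean()
    return recon_loss + sparsity_loss


if __name__ == "__main__":
    # Stand-in for a batch of activations pulled from one layer of an LLM.
    sae = SparseAutoencoder()
    fake_activations = torch.randn(32, 512)
    loss = training_step(sae, fake_activations)
    loss.backward()
    print(f"loss: {loss.item():.4f}")
```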