every agent memory library uses the same words: episodic, semantic, sometimes procedural. they’re cognitive science’s vocabulary, lifted into the API. the engineering often isn’t lifted with them. a library can have a procedural field that uses the same storage and retrieval as semantic: a label, not a separate system. the deeper slip is the word memory itself: most of what these libraries build is narrower than that, and the narrower term sharpens the problem.
the terminology comes from a 1972 chapter by Endel Tulving.1 he argued that what people had been treating as one thing — memory — was at least two: memory for events (what happened, where, when), and memory for facts (the capital of France, water’s boiling point). he called them episodic and semantic. they behave differently and they fail differently.
most of what these libraries call “memory” is narrower than the word suggests: not a full cognitive memory system, but autobiographical content about the user held on the user’s behalf — where they live, what they’re working on, what they’ve decided.
the anatomy of an agent memory system
an agent memory library is built from a small number of components. you can read any library’s docs by knowing the parts.
the extractor. the thing that reads conversation transcripts and decides what to keep. usually an LLM call, sometimes with a strict prompt or a typed output schema. it produces statements — short, abstracted facts about the user, the world, or the task.
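a minimal sketch of the "typed output schema" idea, in Python. everything named here is an assumption made for illustration: the Statement type, the allowed subjects, the prompt wording, and fake_llm, which stands in for a real model call. it is not any particular library's API. the point is only that the extractor validates what the model returns before anything reaches the store.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

# hypothetical schema: one short, abstracted fact and what it is about
ALLOWED_SUBJECTS = {"user", "world", "task"}


@dataclass(frozen=True)
class Statement:
    subject: str  # one of ALLOWED_SUBJECTS
    text: str     # the abstracted fact


def extract(transcript: List[str], call_llm: Callable[[str], str]) -> List[Statement]:
    """Ask the model for statements as JSON, then keep only well-formed ones."""
    prompt = (
        "Return a JSON list of objects with keys 'subject' and 'text'. "
        "Keep only durable facts.\n\n" + "\n".join(transcript)
    )
    raw = call_llm(prompt)
    statements = []
    for item in json.loads(raw):
        if not isinstance(item, dict):
            continue
        subject, text = item.get("subject"), item.get("text")
        # enforce the schema: drop anything outside it instead of storing it
        if isinstance(subject, str) and subject in ALLOWED_SUBJECTS \
                and isinstance(text, str) and text.strip():
            statements.append(Statement(subject, text.strip()))
    return statements


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned response."""
    return json.dumps([
        {"subject": "user", "text": "prefers TypeScript"},
        {"subject": "weather", "text": "it is sunny"},  # unknown subject, dropped
        {"subject": "task", "text": "  "},              # empty text, dropped
    ])


statements = extract(["user: I mostly write TypeScript these days."], fake_llm)
```

with the canned response above, only the first item survives validation.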
the most consequential choice an extractor makes is timing. extract eagerly, after every message, and you spend tokens on small talk that goes nowhere. extract lazily, at the end of a session, and the context you needed to resolve a pronoun is already gone. neither timing is wrong; each loses something the other keeps. the question worth asking of any library is what gets thrown away: coreference cues (which “he” refers to which person), temporal anchors (“yesterday,” “next week”), and disambiguating local context are common casualties. extraction is, in cognitive terms, a compression from situated event to decontextualized fact: “user mentioned over coffee on Tuesday that they prefer TypeScript” becomes “user prefers TypeScript.” how aggressively a library compresses is one of its central design decisions.
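the timing choice can be sketched as a single switch. this is a toy under stated assumptions: Session, the (context, new_messages) extractor signature, the window size, and toy_extractor (a keyword match standing in for an LLM call) are all hypothetical. it shows only the cost side of the trade, one extractor call per message versus one per session.

```python
from typing import Callable, List

# hypothetical extractor signature: (context, new_messages) -> statements
Extractor = Callable[[List[str], List[str]], List[str]]


def toy_extractor(context: List[str], new_messages: List[str]) -> List[str]:
    """Keyword stand-in for an LLM call: keeps messages that state a preference."""
    return [m for m in new_messages if "prefer" in m]


class Session:
    def __init__(self, extractor: Extractor, eager: bool, window: int = 3):
        self.extractor = extractor
        self.eager = eager
        self.window = window          # how many prior messages eager mode sees
        self.history: List[str] = []
        self.statements: List[str] = []
        self.calls = 0                # number of extractor (LLM) calls made

    def on_message(self, message: str) -> None:
        if self.eager:
            # eager: extract now, with a short window of recent context
            self._run(self.history[-self.window:], [message])
        self.history.append(message)

    def on_session_end(self) -> None:
        if not self.eager:
            # lazy: one call over the whole session, after the fact
            self._run([], self.history)

    def _run(self, context: List[str], new_messages: List[str]) -> None:
        self.calls += 1
        self.statements.extend(self.extractor(context, new_messages))


def run(eager: bool, messages: List[str]) -> Session:
    session = Session(toy_extractor, eager=eager)
    for m in messages:
        session.on_message(m)
    session.on_session_end()
    return session


messages = ["hi", "nice weather today", "I prefer TypeScript", "bye"]
eager, lazy = run(True, messages), run(False, messages)
```

on these four messages the eager session makes four extractor calls and the lazy one makes one, and both end with the same single statement. a real extractor would differ in what context it could still use at each point, which the keyword stand-in cannot show.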
the store. the database. one or more of: a vector index (entries indexed by semantic similarity), a relational table (entries indexed by columns you can filter on), a knowledge graph (entries connected by typed edges). each statement carries metadata — a timestamp, sometimes a confidence score, sometimes a source pointer back to the original conversation.
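a sketch of the relational variant of a store, using Python's built-in sqlite3. the table layout, column names, helper functions, and sample rows (including the dates and conversation ids) are assumptions made for illustration. treating a missing confidence as 1.0 is one possible choice, not a convention taken from any library.

```python
import sqlite3
from typing import List, Optional

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE statements (
        id         INTEGER PRIMARY KEY,
        text       TEXT NOT NULL,
        created_at TEXT NOT NULL,  -- ISO 8601 timestamp, sorts lexicographically
        confidence REAL,           -- optional score
        source_id  TEXT            -- optional pointer back to the conversation
    )
    """
)


def add(text: str, created_at: str,
        confidence: Optional[float] = None,
        source_id: Optional[str] = None) -> None:
    conn.execute(
        "INSERT INTO statements (text, created_at, confidence, source_id) "
        "VALUES (?, ?, ?, ?)",
        (text, created_at, confidence, source_id),
    )


def recent(since: str, min_confidence: float = 0.0) -> List[str]:
    """Filter on metadata columns; a missing confidence is treated as 1.0."""
    rows = conn.execute(
        "SELECT text FROM statements "
        "WHERE created_at >= ? AND COALESCE(confidence, 1.0) >= ? "
        "ORDER BY created_at",
        (since, min_confidence),
    )
    return [row[0] for row in rows]


# hypothetical sample data
add("user lives in Paris", "2024-01-10T09:00:00", 0.9, "conv-12")
add("user prefers TypeScript", "2024-03-02T12:00:00")
add("user lives in Amsterdam", "2024-04-20T18:30:00", 0.8, "conv-47")

since_march = recent("2024-03-01T00:00:00")
confident = recent("2024-01-01T00:00:00", min_confidence=0.85)
```

a vector index or a knowledge graph would answer different queries (similarity, typed relations); the metadata columns are what make time and confidence filters like these possible.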
the hardest question a store answers isn’t where to put things. it’s what to do when a new statement contradicts an old one. the user lived in Paris until April, then moved to Amsterdam — and the store now has both, each presenting as current. the choice is whether to
overwrite (one truth, no history)