Show HN: Mnemo – local-first AI memory layer for any LLM (Rust, SQLite, petgraph)

Why This Matters

Mnemo is a local-first memory layer for large language models: it persists contextual knowledge in a knowledge graph stored in local SQLite and retrieves it for later prompts. This addresses the common problem of LLMs forgetting previous interactions and enables more coherent, context-aware conversations without relying on cloud services. Because it ships as a single static binary and injects context in under 50ms, it suits developers who want privacy and speed.


mnemo

Local-first AI memory layer for any LLM. Persistent knowledge graph, entity extraction, semantic retrieval — no cloud required.

What is mnemo?

Most LLMs forget everything the moment a conversation ends. mnemo fixes that.

mnemo is a sidecar service that watches every conversation you feed it, extracts named entities and relationships using an LLM, builds a persistent knowledge graph in SQLite, and injects relevant context back into future prompts — automatically, in under 50ms. It works with Ollama (fully local, free), OpenAI, Anthropic, or any OpenAI-compatible API. It ships as a single static binary with zero cloud dependency.

How it works

your app
    │
    ▼
POST /ingest ──► entity extraction (LLM) ──► knowledge graph (SQLite + petgraph)
                                                  │
POST /retrieve ◄── scoring + ranking ◄── graph traversal + full-text search
    │
    ▼
context_prompt ──► inject into your LLM prompt
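Seen from the calling application, the flow above comes down to two calls around each LLM request. The sketch below is one way to wire that up, not mnemo's client API: the `Memory` trait, `prepare_turn`, and `FakeMemory` are hypothetical names, and the real request and response schemas for /ingest and /retrieve are not shown in this article. The fake uses naive keyword matching purely so the flow can be exercised without a running server.

```rust
/// Hypothetical abstraction over the two endpoints. A real client would
/// send HTTP requests; the actual JSON schemas are not given in the source.
trait Memory {
    /// Corresponds to POST /ingest: hand raw text to the memory layer.
    fn ingest(&mut self, text: &str);
    /// Corresponds to POST /retrieve: get a context_prompt for a query.
    fn retrieve(&self, query: &str) -> String;
}

/// Build the system prompt for the next LLM call, then record the user turn.
fn prepare_turn<M: Memory>(memory: &mut M, base_system: &str, user_msg: &str) -> String {
    // Ask for context first so the current message is not echoed back.
    let context_prompt = memory.retrieve(user_msg);
    memory.ingest(user_msg);
    if context_prompt.trim().is_empty() {
        base_system.to_string()
    } else {
        format!("{base_system}\n\n{context_prompt}")
    }
}

/// In-memory stand-in (assumption): keyword matching instead of entity
/// extraction and graph traversal.
#[derive(Default)]
struct FakeMemory {
    notes: Vec<String>,
}

impl Memory for FakeMemory {
    fn ingest(&mut self, text: &str) {
        self.notes.push(text.to_string());
    }

    fn retrieve(&self, query: &str) -> String {
        let query = query.to_lowercase();
        let words: Vec<&str> = query.split_whitespace().collect();
        self.notes
            .iter()
            .filter(|note| {
                let note = note.to_lowercase();
                words.iter().any(|w| note.contains(w))
            })
            .cloned()
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Calling `prepare_turn` once per user message keeps the integration to a single function; the returned string is what you would pass as the system prompt to Ollama, OpenAI, or whichever backend you use.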

1. You POST raw text to /ingest (a conversation turn, a document, a note).
2. mnemo sends it to your configured LLM and extracts entities (people, tools, places, concepts) and the relationships between them.
3. Entities are deduplicated by name+type, aliases are merged, and everything is written to SQLite. The in-memory petgraph is updated atomically.
4. On POST /retrieve, mnemo runs a 6-stage pipeline: full-text chunk search → entity name search → graph expansion (BFS over the knowledge graph) → relation filter → score+rank → assemble a context_prompt string.
5. You inject context_prompt into your LLM's system prompt. Done.
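Two of the steps above, deduplication by name+type with alias merging and BFS graph expansion, can be illustrated with a small standard-library sketch. It is not mnemo's implementation: the `Entity` and `Graph` types, the exact-match key, the undirected edges, and the `max_hops` limit are all assumptions made for illustration, and a plain adjacency map stands in for petgraph.

```rust
use std::collections::{BTreeSet, HashMap, HashSet, VecDeque};

/// Hypothetical entity record; mnemo's real schema is not shown in the source.
#[derive(Debug, Clone)]
struct Entity {
    name: String,
    kind: String,
    aliases: BTreeSet<String>,
}

/// Toy graph: entities keyed by (name, type), plus an adjacency map
/// standing in for petgraph.
#[derive(Default)]
struct Graph {
    entities: Vec<Entity>,
    index: HashMap<(String, String), usize>,
    edges: HashMap<usize, Vec<usize>>,
}

impl Graph {
    /// Insert an entity, or merge aliases into the existing one with the same key.
    fn upsert(&mut self, name: &str, kind: &str, aliases: &[&str]) -> usize {
        let key = (name.to_string(), kind.to_string());
        let id = match self.index.get(&key).copied() {
            Some(id) => id,
            None => {
                let id = self.entities.len();
                self.entities.push(Entity {
                    name: name.to_string(),
                    kind: kind.to_string(),
                    aliases: BTreeSet::new(),
                });
                self.index.insert(key, id);
                id
            }
        };
        self.entities[id]
            .aliases
            .extend(aliases.iter().map(|a| a.to_string()));
        id
    }

    /// Record a relationship between two entities (stored in both directions).
    fn relate(&mut self, a: usize, b: usize) {
        self.edges.entry(a).or_default().push(b);
        self.edges.entry(b).or_default().push(a);
    }

    /// Breadth-first expansion from seed entities, at most `max_hops` edges away.
    /// Returns entity ids in the order they were first reached.
    fn expand(&self, seeds: &[usize], max_hops: usize) -> Vec<usize> {
        let mut seen = HashSet::new();
        let mut queue = VecDeque::new();
        for &seed in seeds {
            if seen.insert(seed) {
                queue.push_back((seed, 0usize));
            }
        }
        let mut order = Vec::new();
        while let Some((node, depth)) = queue.pop_front() {
            order.push(node);
            if depth == max_hops {
                continue; // do not walk past the hop limit
            }
            for &next in self.edges.get(&node).into_iter().flatten() {
                if seen.insert(next) {
                    queue.push_back((next, depth + 1));
                }
            }
        }
        order
    }
}
```

In this sketch, upserting the same name and type twice yields one entity carrying both alias sets, and `expand` with a small hop limit returns the seeds plus their near neighbours, which is the kind of candidate set a later scoring stage could rank.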

Quickstart

Path A — Docker + Ollama (fully free, recommended)
