How Lume Works: The Retrieval Primitives

Why This Matters

Lume is a transparent, local-first search engine that combines lexical, semantic, and graph-based retrieval primitives, so developers can build agentic systems whose retrieval is inspectable and customizable. Its design emphasizes openness, modularity, and auditability: each retrieval signal is scored separately and can be examined or replaced on its own. For end users, that means search results whose ranking can be traced and adjusted.

Key Takeaways

Lume is a Rust hybrid search engine that Steve Harris and I have been building in the open at github.com/DeepBlueDynamics/lume. It’s a small CLI plus an MCP server, BSD-3 licensed, and built around a stubborn idea: when an agent asks a question, every step from query to evidence should be inspectable.

Lume indexes Markdown, source code, and PDFs (via a small Python extractor) and ranks over them with three independent primitives: field-aware BM25, dense GTR-T5 vectors via Shivvr, and a significance-scored entity graph. The lexical core and the graph run entirely on your machine; only the dense vectors call out, and that endpoint defaults to localhost. There is no opaque “search box that returns a ranking”; every score has a name, a file, and a knob.

This post walks Lume’s retrieval core end to end, with line-level references to the current tree. If you’re building agentic systems and tired of treating retrieval as a magic step, this is for you.

A few principles up front, because they explain the design:

Local-first. Lexical search and the entity graph run entirely on your machine. Dense vectors are fetched from Shivvr through SHIVVR_BASE_URL, which defaults to a local endpoint.

Layered, not monolithic. BM25, semantic, and graph are independent signals with their own scores. The blend is one line; each input is replaceable.

Auditable. The engine prints what it pruned, what it ranked, and why it rejected the rest.
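To make "the blend is one line" concrete, here is a minimal sketch. It is not Lume's code: the `Signals` and `Weights` types, their field names, and the `blend` and `rank` functions are hypothetical, and a plain weighted sum is an assumption about how the three scores might be combined. The only things taken from the post are that there are three independently scored signals and that each one stays visible next to the final score.

```rust
/// Hypothetical per-section scores from the three independent primitives.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Signals {
    pub bm25: f32,     // lexical score
    pub semantic: f32, // dense-vector similarity
    pub graph: f32,    // entity-graph score
}

/// Hypothetical blend weights (illustrative, not Lume's defaults).
#[derive(Debug, Clone, Copy)]
pub struct Weights {
    pub bm25: f32,
    pub semantic: f32,
    pub graph: f32,
}

/// The blend itself: a single weighted sum, so any input can be
/// re-weighted, zeroed out, or replaced without touching the others.
pub fn blend(s: Signals, w: Weights) -> f32 {
    w.bm25 * s.bm25 + w.semantic * s.semantic + w.graph * s.graph
}

/// Rank candidates by blended score, highest first. The named inputs are
/// returned alongside the final score so a caller can audit each result.
pub fn rank(candidates: Vec<(String, Signals)>, w: Weights) -> Vec<(String, Signals, f32)> {
    let mut scored: Vec<(String, Signals, f32)> = candidates
        .into_iter()
        .map(|(id, s)| (id, s, blend(s, w)))
        .collect();
    // total_cmp gives a total order over f32, so NaN cannot panic the sort.
    scored.sort_by(|a, b| b.2.total_cmp(&a.2));
    scored
}
```

Setting a weight to zero in this sketch removes that signal from the ranking while its raw score still appears in the output, which is one way to keep a replaceable input inspectable.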

0. The unit of retrieval: a Section

Lume indexes Markdown, cut into sections at # headers (parse_markdown in src/bm25.rs:211). A Section (src/bm25.rs:106) is the atom everything ranks over:

pub struct Section {
    pub title: String,
    pub body: String,
    pub line_number: usize,
    pub filename: Option<String>,
    pub entities: Vec<String>, // resolved named entities, for the graph
}
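As a rough illustration of cutting Markdown into sections at # headers, here is a minimal sketch. It is not the parse_markdown referenced above; the `split_sections` and `header_title` names are hypothetical, and several behaviors are assumptions: line numbers are 1-based, any header level from # to ###### starts a new section, text before the first header is dropped, fenced code blocks are not special-cased, and `entities` is left empty for a later resolution step.

```rust
/// Same shape as the Section struct shown in the post.
#[derive(Debug, Clone, PartialEq)]
pub struct Section {
    pub title: String,
    pub body: String,
    pub line_number: usize,
    pub filename: Option<String>,
    pub entities: Vec<String>,
}

/// Hypothetical helper: returns the title if `line` is an ATX header
/// (one to six '#' characters followed by a space), otherwise None.
fn header_title(line: &str) -> Option<String> {
    let rest = line.trim_start_matches('#');
    let hashes = line.len() - rest.len();
    if (1..=6).contains(&hashes) && rest.starts_with(' ') {
        Some(rest.trim().to_string())
    } else {
        None
    }
}

/// Hypothetical splitter: one Section per header, with the lines that
/// follow it (up to the next header) collected as the body.
pub fn split_sections(text: &str, filename: Option<&str>) -> Vec<Section> {
    let mut sections = Vec::new();
    let mut current: Option<Section> = None;

    for (idx, line) in text.lines().enumerate() {
        if let Some(title) = header_title(line) {
            // A new header closes the section in progress, if any.
            if let Some(done) = current.take() {
                sections.push(done);
            }
            current = Some(Section {
                title,
                body: String::new(),
                line_number: idx + 1, // assumption: 1-based line numbers
                filename: filename.map(|f| f.to_string()),
                entities: Vec::new(), // assumption: filled in later
            });
        } else if let Some(sec) = current.as_mut() {
            sec.body.push_str(line);
            sec.body.push('\n');
        }
        // Lines before the first header are dropped in this sketch.
    }

    if let Some(done) = current {
        sections.push(done);
    }
    sections
}
```

Storing the header's line number on each section is what lets a ranked result point back to an exact location in the source file.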
