So you wanna build a local RAG?

When we launched Skald, we wanted it to not only be self-hostable, but also for one to be able to run it without sending any data to third-parties.

With LLMs getting better and better, privacy-sensitive organizations shouldn't have to choose between being left behind by not accessing frontier models and doing away with their committment (or legal requirement) for data privacy.

So here's what we did to support this use case and also some benchmarks comparing performance when using proprietary APIs vs self-hosted open-source tech.

RAG components and their OSS alternatives

A basic RAG usually has the following core components:

A vector database

A vector embeddings model

An LLM

And most times it also has these as well:

A reranker

... continue reading