I made a search engine worse than Elasticsearch (2024)

Published on: 2025-06-07 20:37:20

I want you to share in my shame at daring to make a search library. And in this shame, you too can experience the humility and understanding of what a real, honest-to-goodness, not side-project search engine does to make lexical search fast.

BEIR is a set of Information Retrieval benchmarks, oriented around question-answer use cases.

My side project, SearchArray, adds full text search to Pandas. So naturally, to stand in awe at my amazing developer skills, I wanted to use BEIR to compare SearchArray to Elasticsearch (w/ same query + tokenization). So I spent a Saturday integrating SearchArray into BEIR, and measuring its relevance and performance on the MSMarco Passage Retrieval corpus (8M docs).

… and 🥁

| Library             | Elasticsearch    | SearchArray        |
|---------------------|------------------|--------------------|
| NDCG@10             | 0.2275           | 0.225              |
| Search Throughput   | 90 QPS           | ~18 QPS            |
| Indexing Throughput | 10K Docs Per Sec | ~3.5K Docs Per Sec |

… Sad trombone 🎺

It’s worse in every dimension.

At least NDCG@10 is nearly right, so our BM25 calculation is correct (probably due to n ...
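For readers who haven't met the metric, here is a minimal sketch of how an NDCG@10 score can be computed for a single query. It is not the code BEIR or SearchArray runs; the function names (`dcg_at_k`, `ndcg_at_k`) and the toy document ids are made up for illustration, and it assumes linear gain with a log2 rank discount, which is one common variant.

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k gains (rank 1 is index 0)."""
    # Assumption: linear gain and a log2(rank + 1) discount.
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_doc_ids, qrels, k=10):
    """NDCG@k for one query.

    ranked_doc_ids: doc ids in the order the engine returned them.
    qrels: dict of doc id -> graded relevance judgment (missing means 0).
    """
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_doc_ids]
    # The ideal ranking puts the most relevant judged docs first.
    ideal_gains = sorted(qrels.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant docs judged for this query
    return dcg_at_k(gains, k) / ideal_dcg


# Hypothetical toy query: one relevant passage, returned at rank 2.
score = ndcg_at_k(["d3", "d1", "d2"], {"d1": 1})
```

A benchmark number like the ones in the table would be the mean of this per-query score over every query in the test set, so two engines scoring nearly the same suggests they rank the judged passages in roughly the same positions.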