Short-Circuiting Correlated Subqueries in SQLite
17 Dec, 2025
I recently added domain exclusion lists and paywalled content filtering to Scour. This blog post describes a small but useful SQL(ite) query optimization I came across between the first and final drafts of these features: using an uncorrelated scalar subquery to skip a correlated subquery (if you don't know what that means, I'll explain it below).
Scour searches noisy sources for content related to users' interests. At the time of writing, it ingests between 1 and 3 million pieces of content from over 15,000 sources each month. For better and for worse, Scour does ranking on the fly, so the performance of the ranking database query directly translates to page load time.
The Ranking SQL Query
The main SQL query Scour uses for ranking applies a number of filters and streams the item embeddings through the application code for scoring.
Scour uses brute force search rather than a vector database, which works well enough for now because of three factors:
Scour uses SQLite, so the data is colocated with the application code. It uses binary-quantized vector embeddings with Hamming Distance comparisons, which only take ~5 nanoseconds each. We care most about recent posts so we can significantly narrow the search set by publish date.
A simplified version of the query looks something like:
SELECT * FROM items i WHERE i . lang IN ( SELECT lang FROM user_languages WHERE user_id = ? 1 ) AND i . published BETWEEN ? 2 AND ? 3 AND ...( more filters )...
... continue reading