Tech News
← Back to articles

450× Faster Joins with Index Condition Pushdown

read original related products more articles

Introduction

Readyset is designed to serve queries from cached views with sub-millisecond latency. This post focuses on the cold path—cases where a cache miss forces execution against the base tables. In these scenarios, Readyset must evaluate the query from scratch, including materializing intermediate results. The focus here is on straddled joins, where filtering predicates apply to both sides of the join in addition to the ON clause.

Example:

SELECT u.id, u.name, o.order_id, o.total FROM users u JOIN orders o ON u.id = o.user_id WHERE u.email = '[email protected]' AND o.status = 'SHIPPED';

In a previous optimization, we transitioned from nested loop joins to hash joins (see https://readyset.io/blog/introducing-hash-join-algorithm), eliminating the need to repeatedly scan one side of the join. This brought significant performance improvements for many workloads, however, some queries remained problematic.

Straddled joins are common in production workloads; for example, filtering users by some property on the left table and orders by a status flag on the right table, then joining on user_id. While our hash join algorithm improved over nested loops, it was still inefficient in these cases, especially when one side’s predicate had low cardinality (like a boolean flag or status column).

This post explains why the old execution strategy struggled, and how our new optimization using Index Condition Pushdown (ICP) changes the execution model to make straddled joins much more efficient.

Previous algorithm

During evaluation of straddled joins on cache misses, we observed significant latency in up-queries. To identify the underlying bottleneck, we profiled query execution and analyzed the resulting flamegraphs

Approximately 30% of the query execution time was attributed to data decompression. We initially suspected the RocksDB compression algorithm as the bottleneck and proceeded to benchmark alternative compression methods to validate this hypothesis.

... continue reading