Query Engines: Push vs. Pull (2021)
Published on: 2025-04-21 18:15:59
26 Apr 2021
People talk a lot about “pull” vs. “push” based query engines, and it’s pretty obvious what that means colloquially, but some of the details can be a bit hard to figure out.
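To pin down the colloquial meaning, here is a minimal sketch of my own, assuming a toy scan-then-filter pipeline over in-memory rows. Every name in it (`scan_pull`, `filter_pull`, `scan_push`, `filter_push`) is hypothetical and not taken from any real engine. In the pull version the consumer asks its child for rows; in the push version the producer hands rows to its downstream operator.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, int]

# --- Pull: the consumer drives execution by iterating over its child. ---
def scan_pull(rows: Iterable[Row]) -> Iterator[Row]:
    for row in rows:
        yield row

def filter_pull(child: Iterator[Row], pred: Callable[[Row], bool]) -> Iterator[Row]:
    for row in child:          # ask the child for its next row
        if pred(row):
            yield row

# --- Push: the producer drives execution by calling its downstream operator. ---
def filter_push(pred: Callable[[Row], bool],
                downstream: Callable[[Row], None]) -> Callable[[Row], None]:
    def on_row(row: Row) -> None:
        if pred(row):
            downstream(row)    # hand the row to whoever comes next
    return on_row

def scan_push(rows: Iterable[Row], downstream: Callable[[Row], None]) -> None:
    for row in rows:
        downstream(row)

rows: List[Row] = [{"x": 1}, {"x": 2}, {"x": 3}]

def is_odd(row: Row) -> bool:
    return row["x"] % 2 == 1

# Pull: build the pipeline top-down, then drain it from the root.
pulled = list(filter_pull(scan_pull(rows), is_odd))

# Push: build the pipeline bottom-up, then start it from the source.
pushed: List[Row] = []
scan_push(rows, filter_push(is_odd, pushed.append))
```

Both versions compute the same result; the only difference is which end of the pipeline holds the loop that drives execution.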
Important people clearly have thought hard about this distinction, judging by this paragraph from Snowflake’s SIGMOD paper:
Push-based execution refers to the fact that relational operators push their results to their downstream operators, rather than waiting for these operators to pull data (classic Volcano-style model). Push-based execution improves cache efficiency, because it removes control flow logic from tight loops. It also enables Snowflake to efficiently process DAG-shaped plans, as opposed to just trees, creating additional opportunities for sharing and pipelining of intermediate results.
And…that’s all they really have to say on the matter. It leaves me with two major unanswered questions:
Why does a push-based system “enable Snowflake to efficiently proces