Better JIT for Postgres

A lightweight JIT compilation provider for PostgreSQL that adds three alternative JIT backends - sljit, AsmJit and MIR - delivering faster compilation and competitive query execution across PostgreSQL 14–18.

JIT compilation was introduced in Postgres 11 in 2018. It solves a problem of Postgres having to interpret expressions and use inefficient per-row loops in run-time in order to do internal data conversions (so-called tuple deforming). On expression-heavy workloads or just wide tables, it can give a significant performance boost for those operations. However, standard LLVM-based JIT is notoriously slow at compilation. When it takes tens to hundreds of milliseconds, it may be suitable only for very heavy, OLAP-style queries, in some cases. For typical OLTP queries, LLVM's JIT overhead can easily exceed the execution time of the query itself. pg_jitter provides native code generation with microsecond-level compilation times instead of milliseconds, making JIT worthwhile for a much wider range of queries.

Performance

Typical compilation time:

sljit : tens to low hundreds of microseconds

: tens to low hundreds of microseconds AsmJIT : hundreds of microseconds

: hundreds of microseconds MIR : hundreds of microseconds to single milliseconds

: hundreds of microseconds to single milliseconds LLVM (Postgres default): tens to hundreds of milliseconds (seconds in the worst cases)

In reality, the effect of JIT compilation is broader - execution can slow down for up to ~1ms even for sljit, because of other related things, mostly cold processor cache and effects of increased memory pressure (rapid allocations / deallocations related to code generation and JIT compilation). Therefore, on systems executing a lot of queries per second, it's recommended to avoid JIT compilation for very fast queries such as point lookups or queries processing only a few records. By default, jit_above_cost parameter is set to a very high number (100'000). This makes sense for LLVM, but doesn't make sense for faster providers. It's recommended to set this parameter value to something from ~200 to low thousands for pg_jitter (depending on what specific backend you use and your specific workloads).

sljit is the most consistent: 5–25% faster than the interpreter across all workloads. This, and also its phenomenal compilation speed, make it the best choice for most scenarios.

... continue reading