The basic promise of a query optimizer is that it picks the “optimal” query plan. But there’s a catch - the plan selection relies on cost estimates, calculated from selectivity estimates and the cost of basic resources (I/O, CPU, …). So the question is, how often do we actually pick the “fastest” plan? And the truth is, the planner makes mistakes quite often.
Consider the following chart, showing durations of a simple SELECT query with a range condition. The condition is varied to match different fractions of the table, shown on the x-axis (fraction of pages with matching rows). The plans are forced to use the different scan methods using the enable_ options, and the dark points mark runs where the scan method “won” even without the enable_ parameters.
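For illustration, here’s a minimal sketch of how a scan method can be forced with the enable_ options - the table and column names are hypothetical, the actual data sets are defined in the create.sql script mentioned below:

```sql
-- Hypothetical table "t" with an indexed column "a"; the real
-- definitions live in create.sql. Force an index scan by disabling
-- the alternative scan methods for this session:
SET enable_seqscan = off;
SET enable_bitmapscan = off;

-- The range condition is varied to match different fractions of the table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM t WHERE a BETWEEN 1000 AND 2000;

-- Let the planner choose freely again.
RESET enable_seqscan;
RESET enable_bitmapscan;
```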
It shows that for selectivities ~1-5% (the x-axis is logarithmic), the planner picks an index scan, but this happens to be a poor choice. It takes up to ~10 seconds, and a simple “dumb” sequential scan would complete the query in ~2 seconds.
This is a good demonstration of why it’s pointless to aim for index scans everywhere, as application developers sometimes do.
A 1% selectivity is fairly high, but the picture is not much different for lower selectivities. That’s hard to see on the chart above, so let’s switch the y-axis to a log scale to see the low selectivities better:
The index scans never win for this query! The bitmap scans dominate, even for the most selective conditions, by a factor of ~10x. The planner does correctly pick the bitmap scans, so that’s good.
Note: The reason is fairly simple - bitmap scans perform prefetching, index scans don’t. Disabling prefetching (or using my WIP patch with index prefetching) eliminates the difference.
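For reference, prefetching in bitmap heap scans is driven by the standard effective_io_concurrency parameter, so one way to disable it (a sketch, not necessarily how the test does it) is:

```sql
-- effective_io_concurrency determines how many heap pages a bitmap
-- heap scan prefetches ahead of time; 0 disables prefetching.
SET effective_io_concurrency = 0;

-- It can also be set per tablespace:
ALTER TABLESPACE pg_default SET (effective_io_concurrency = 0);
```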
The above charts are for a uniform data set, which is a bit of an extreme. On the one hand, it’s the simplest case to calculate estimates for - we even assume uniformity / independence in some places. On the other hand, it has the worst data locality.
So what about other data distributions, with different correlation patterns? I ran the test with a number of data sets, and here are a couple of examples. I’ll show the regular and log-scale chart for each.
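To give a rough idea what such data sets may look like, here’s a simplified sketch of a uniform and a perfectly correlated table - the table names and sizes are made up, the actual definitions are in the create.sql script mentioned below:

```sql
-- Uniformly random values: matching rows end up scattered across
-- all pages of the table (worst data locality).
CREATE TABLE uniform AS
SELECT (random() * 1000000)::int AS a
FROM generate_series(1, 10000000);

-- Correlated values: row order follows the value, so matching rows
-- sit on consecutive pages (best data locality).
CREATE TABLE correlated AS
SELECT (i / 10) AS a
FROM generate_series(1, 10000000) s(i);

CREATE INDEX ON uniform (a);
CREATE INDEX ON correlated (a);

VACUUM ANALYZE uniform;
VACUUM ANALYZE correlated;
```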
Note: For details about the test, see create.sql SQL script (where all the data sets are created), and the main test script run-test.sh.