Today, I would like to discuss additional techniques to speed up query execution. Specifically, I will focus on rearranging conditions in filter expressions, JOIN clauses, HAVING clauses, and similar constructs. The main idea is that if one condition in a series of expressions connected by AND evaluates to false, or one condition in a series connected by OR evaluates to true, the remaining conditions do not need to be evaluated at all. This can save computing resources. Below, I will explain how much execution effort this approach saves and how to implement it effectively.
Occasionally, you may come across queries featuring complex filters similar to the following:
SELECT * FROM table
WHERE
date > min_date AND
date < now() - interval '1 day' AND
value IN Subplan AND
id = 42;
And in practice, it turns out that simply rearranging the order of conditions in such an expression can speed up query execution, sometimes quite noticeably. Why? Each individual operation costs little. However, when it is performed repeatedly on each of the table's millions of rows, its price becomes palpable, especially once other problems, such as getting the table's blocks into shared buffers, have already been solved.
This effect is particularly evident on wide tables that contain many variable-length columns. For instance, I often encounter IndexScans that become slow because the field used for additional filtering sits somewhere around the 20th (!) position in a table full of variable-width columns. Accessing this field requires calculating its offset from the beginning of the row, which takes up processor time and slows down execution.
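As a side note (not something the later experiment measures directly), the system catalogue makes it easy to see which columns are variable-width; every such column located before the filtered attribute prevents the executor from using a fixed, cached offset. A minimal sketch, assuming a hypothetical table wide_table:
-- Variable-width columns report attlen = -1; each one preceding the filtered
-- column forces a per-row offset calculation while deforming the tuple.
SELECT attnum, attname, attlen
FROM pg_attribute
WHERE attrelid = 'wide_table'::regclass AND attnum > 0
ORDER BY attnum;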
The PostgreSQL community has already addressed this issue, as can be seen in the code. In 2002, commit 3779f7f by T. Lane reorganised the clauses by moving all clauses containing subplans to the end of the clause list (see order_qual_clauses). This change was logical because the cost of evaluating a subplan can depend on the parameters passed to it, introducing an additional source of error.
In 2007, this approach evolved with commit 5a7471c, which established that clauses would be sorted strictly in ascending order of their cost. This logic has remained in place to the present day, except for a minor modification in commit 215b43c, which had to control the order of expression evaluation in each query plan node because of changes in the Row-Level Security (RLS) code.
Now, let’s take a look at what we have in the upstream as of today:
CREATE TABLE test (
x integer, y numeric,
w timestamp DEFAULT CURRENT_TIMESTAMP, z integer);
INSERT INTO test (x,y)
SELECT gs,gs FROM generate_series(1,1E3) AS gs;
VACUUM ANALYZE test;
EXPLAIN (COSTS ON)
SELECT * FROM test
WHERE
z > 0 AND
w > now() AND
x < (SELECT avg(y)
FROM generate_series(1,1E2) y WHERE y%2 = x%3) AND
x NOT IN (SELECT avg(y)
FROM generate_series(1,1E2) y OFFSET 0) AND
w IS NOT NULL AND
x = 42;
Looking into the filter of this SELECT, we see the following sequence of conditions:
Filter: ((w IS NOT NULL) AND (z > 0) AND
(x = 42) AND (w > now()) AND
((x)::numeric = (InitPlan 2).col1) AND
((x)::numeric < (SubPlan 1)))
During the execution of the query, they will be calculated in strict sequence from left to right. The operator costs are as follows for reference:
"
z > 0
" - 0.0025"
w > now()
" - 0.005"
x < SubPlan 1
" - 2.0225"x NOT IN SubPlan 2
" - 0.005“
w IS NOT NULL
" - 0.0“
x = 42
“ - 0.0025
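For reference, these figures are multiples of the planner constant cpu_operator_cost (0.0025 by default): a plain comparison is charged one unit, w > now() two, while the correlated SubPlan carries the estimated cost of its inner query for every outer row. You can check the base constant directly:
-- The per-operator cost unit behind the figures above.
SHOW cpu_operator_cost;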
This order appears quite logical. However, you may be wondering what can be improved here.
There are at least two straightforward opportunities for enhancement. First, you can assign a small cost to the ordinal position of each column involved in the expression. In simple terms, the further a column is to the right in a table row, the more expensive it is to evaluate. This cost should not be excessively high; it merely needs to signal to the optimiser that the expression x=42 is cheaper to evaluate than z>0, all other factors being equal.
You may argue that this is tied to Postgres's current row-based storage. That is true, but row-based storage is still what we use most often, isn't it? Moreover, it would make sense for each storage method to provide its own cost model.
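Incidentally, the ordinal position needed for such a tie-breaking cost is already at the optimiser's fingertips; as a small illustration, it is the attnum exposed by the catalogue for the test table created above:
-- Column positions of the test table: the higher the attnum, the more work
-- it may take to locate the column's value inside a tuple.
SELECT attnum, attname
FROM pg_attribute
WHERE attrelid = 'test'::regclass AND attnum > 0
ORDER BY attnum;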
The second common pattern concerns pairs of expressions with approximately the same cost. For instance, consider x=42 and z<50. Clearly, the second expression is less selective and should be placed second: since the expression x=42 will be true in fewer cases, the conditions further down the list will need to be evaluated less often.
Now, let's assess the potential impact of these optimisations. Is it worth the effort? To illustrate, we can create a table where a pair of columns has the same selectivity but is positioned far apart, while another pair is located next to each other but has different selectivity.
CREATE TEMP TABLE test_2 (x1 numeric, x2 numeric,
x3 numeric, x4 numeric);
INSERT INTO test_2 (x1,x2,x3,x4)
SELECT x,(x::integer)%2,(x::integer)%100,x FROM
(SELECT random()*1E7 FROM generate_series(1,1E7) AS x) AS q(x);
ANALYZE;
Let's examine the performance impact of searching for a value in a relatively "wide" row. Columns x1 and x4 are identical in every way, except that the position of the value in column x1 is known in advance, whereas the position of the value in column x4 needs to be calculated for each row.
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF)
SELECT * FROM test_2 WHERE x1 = 42 AND x4 = 42;
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF)
SELECT * FROM test_2 WHERE x4 = 42 AND x1 = 42;
/*
Seq Scan on test_2 (actual rows=0.00 loops=1)
Filter: ((x1 = '42'::numeric) AND (x4 = '42'::numeric))
Buffers: local read=94357
Execution Time: 2372.032 ms
Seq Scan on test_2 (actual rows=0.00 loops=1)
Filter: ((x4 = '42'::numeric) AND (x1 = '42'::numeric))
Buffers: local read=94357
Execution Time: 2413.633 ms
*/
It turns out that, all other factors being equal, even a relatively short tuple shows an effect of about 2-3%. That is quite comparable to the typical benefit gained from Just-In-Time (JIT) compilation. Now, let's consider the influence of selectivity. The columns x1 and x2 are positioned next to each other; the key difference is that the values in x1 are almost unique, whereas x2 contains mostly duplicated values.
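Before running the queries, we can double-check that difference in the statistics gathered by ANALYZE (a quick sanity check, not part of the original experiment):
-- x1 is almost unique, while x2 holds only two distinct values.
SELECT attname, n_distinct
FROM pg_stats
WHERE tablename = 'test_2' AND attname IN ('x1', 'x2');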
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF)
SELECT * FROM test_2 WHERE x2 = 1 AND x1 = 42;
EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF)
SELECT * FROM test_2 WHERE x1 = 42 AND x2 = 1;
/*
Seq Scan on test_2 (actual rows=0.00 loops=1)
Filter: ((x2 = '1'::numeric) AND (x1 = '42'::numeric))
Buffers: local read=74596
Execution Time: 2363.903 ms
Seq Scan on test_2 (actual rows=0.00 loops=1)
Filter: ((x1 = '42'::numeric) AND (x2 = '1'::numeric))
Buffers: local read=74596
Execution Time: 2034.873 ms
*/
It seems we have achieved a speedup of approximately 10%.
If we accept that this effect can accumulate throughout the plan tree, which may contain multiple scan operators as well as joins, each contributing its own percentage, then the technique seems worthwhile to implement, especially considering the minimal overhead it adds in the planning phase.
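As a rough illustration of that accumulation (a sketch, not a measured result), consider a join in which each input scan applies its own multi-clause filter to the table from the previous example; every node stands to gain from a well-ordered clause list:
-- Each side of the join filters millions of rows with its own clause list,
-- so per-row savings in clause evaluation add up across the plan.
EXPLAIN (COSTS OFF)
SELECT *
FROM test_2 a JOIN test_2 b ON a.x3 = b.x3
WHERE a.x1 = 42 AND a.x4 = 42
  AND b.x2 = 1 AND b.x1 = 42;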
Let's proceed with the implementation and observe its effects. Creating it as an extension doesn't seem practical, as there is currently no hook that allows intervening while the plan is being created. To me, the need for a create_plan_hook inside the create_plan() routine is becoming increasingly evident: it would let extensions transfer some data from the optimisation stage into the plan, as well as apply additional plan enhancements (tailored to a specific workload), such as the one proposed here. However, this topic has yet to be discussed within the PostgreSQL community.
If this feature is implemented as a patch, modifications are needed in two areas of the code: cost_qual_eval(), where the cost of expressions is evaluated, and order_qual_clauses(), which defines the sorting rules for expressions. As usual, the code can be found on GitHub in the designated branch.
Running the aforementioned examples on this branch will demonstrate that the expressions are ordered more optimally, taking column position and selectivity into account. Additionally, no significant overhead is observed.
Do you think it makes sense to pursue such micro-optimisations, or should we aim for broader improvements? Have you encountered similar issues? Please share your thoughts in the comments.
THE END
April 19, 2025. Madrid, Spain.