Avoid crash in partitionwise join planning under GEQO.

While trying to plan a partitionwise join, we may be faced with cases where one or both input partitions for a particular segment of the join have been pruned away. In HEAD and v11, this is problematic because earlier processing didn't bother to make a pruned RelOptInfo fully valid. With an upcoming patch to make partition pruning more efficient, this'll be even more problematic because said RelOptInfo won't exist at all.

The existing code attempts to deal with this by retroactively making the RelOptInfo fully valid, but that causes crashes under GEQO because join planning is done in a short-lived memory context. In v11 we could probably have fixed this by switching to the planner's main context while fixing up the RelOptInfo, but that idea doesn't scale well to the upcoming patch. It would be better not to mess with the base-relation data structures during join planning, anyway --- that's just a recipe for order-of-operations bugs.

In many cases, though, we don't actually need the child RelOptInfo, because if the input is certainly empty then the join segment's result is certainly empty, so we can skip making a join plan altogether. (The existing code ultimately arrives at the same conclusion, but only after doing a lot more work.)

This approach works except when the pruned-away partition is on the nullable side of a LEFT, ANTI, or FULL join, and the other side isn't pruned. But in those cases the existing code leaves a lot to be desired anyway --- the correct output is just the result of the unpruned side of the join, but we were emitting a useless outer join against a dummy Result. Pending somebody writing code to handle that more nicely, let's just abandon the partitionwise-join optimization in such cases.

When the modified code skips making a join plan, it doesn't make a join RelOptInfo either; this requires some upper-level code to cope with nulls in part_rels[] arrays. We would have had to have that anyway after the upcoming patch.

Back-patch to v11 since the crash is demonstrable there.

Discussion: https://postgr.es/m/[email protected]
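
The per-segment decision described in the message above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the enum names, `decide_segment`, and its boolean parameters are all hypothetical, and it assumes the first input is the outer (non-nullable) side of a LEFT or ANTI join.

```c
#include <stdbool.h>

/* Hypothetical join kinds for this sketch; not PostgreSQL's JoinType. */
typedef enum { SK_INNER, SK_SEMI, SK_LEFT, SK_ANTI, SK_FULL } SketchJoinKind;

/* What to do with one segment (one pair of child partitions) of the join. */
typedef enum {
    SEG_PLAN,     /* both inputs present: plan the child join normally */
    SEG_SKIP,     /* result is certainly empty: make no join plan or joinrel */
    SEG_GIVE_UP   /* pruned input on a nullable side: abandon partitionwise join */
} SegmentAction;

/*
 * Decide how to treat one join segment, given whether each input
 * partition was pruned away (or is otherwise known to be empty).
 * Assumption: "outer" is the non-nullable side for LEFT and ANTI joins.
 */
SegmentAction
decide_segment(SketchJoinKind kind, bool outer_empty, bool inner_empty)
{
    switch (kind)
    {
        case SK_INNER:
        case SK_SEMI:
            /* Either input empty means the segment produces no rows. */
            if (outer_empty || inner_empty)
                return SEG_SKIP;
            break;
        case SK_LEFT:
        case SK_ANTI:
            /* Only an empty outer side guarantees an empty result. */
            if (outer_empty)
                return SEG_SKIP;
            break;
        case SK_FULL:
            /* Both sides are nullable, so both must be empty to skip. */
            if (outer_empty && inner_empty)
                return SEG_SKIP;
            break;
    }

    /* A remaining empty input sits on a nullable side with a live partner. */
    if (outer_empty || inner_empty)
        return SEG_GIVE_UP;

    return SEG_PLAN;
}
```

Under these assumptions, `decide_segment(SK_LEFT, false, true)` yields `SEG_GIVE_UP`, which corresponds to the LEFT JOIN regression test below whose expected plan changes from a partitionwise join to a plain join over Append nodes.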
author: Tom Lane 2019-03-30 16:48:19 +0000
committer: Tom Lane 2019-03-30 16:48:32 +0000
commit: 7ad6498fd5a654de6e743814c36cf619a3b5ddb6 (patch)
tree: 48f51e4afe4f6bae66b9a7993e6bafce724a1fde /src/test/regress/expected/partition_aggregate.out
parent: ef6576f5379edfa29bb4f99880b0f76dd315dd14 (diff)
1 files changed, 39 insertions, 59 deletions
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index 6bc106831ee..e1549cbb5c6 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -716,37 +716,33 @@ SELECT a.x, sum(b.x) FROM pagg_tab1 a FULL OUTER JOIN pagg_tab2 b ON a.x = b.y G
     |  500
 (16 rows)
 
--- LEFT JOIN, with dummy relation on right side,
+-- LEFT JOIN, with dummy relation on right side, ideally
 -- should produce full partitionwise aggregation plan as GROUP BY is on
--- non-nullable columns
+-- non-nullable columns.
+-- But right now we are unable to do partitionwise join in this case.
 EXPLAIN (COSTS OFF)
 SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20  GROUP BY a.x, b.y ORDER BY 1, 2;
-                                 QUERY PLAN                                  
------------------------------------------------------------------------------
+                              QUERY PLAN                               
+-----------------------------------------------------------------------
  Sort
-   Sort Key: pagg_tab1_p1.x, y
-   ->  Append
-         ->  HashAggregate
-               Group Key: pagg_tab1_p1.x, y
-               ->  Hash Left Join
-                     Hash Cond: (pagg_tab1_p1.x = y)
-                     Filter: ((pagg_tab1_p1.x > 5) OR (y < 20))
+   Sort Key: pagg_tab1_p1.x, pagg_tab2_p2.y
+   ->  HashAggregate
+         Group Key: pagg_tab1_p1.x, pagg_tab2_p2.y
+         ->  Hash Left Join
+               Hash Cond: (pagg_tab1_p1.x = pagg_tab2_p2.y)
+               Filter: ((pagg_tab1_p1.x > 5) OR (pagg_tab2_p2.y < 20))
+               ->  Append
                      ->  Seq Scan on pagg_tab1_p1
                            Filter: (x < 20)
-                     ->  Hash
-                           ->  Result
-                                 One-Time Filter: false
-         ->  HashAggregate
-               Group Key: pagg_tab1_p2.x, pagg_tab2_p2.y
-               ->  Hash Left Join
-                     Hash Cond: (pagg_tab1_p2.x = pagg_tab2_p2.y)
-                     Filter: ((pagg_tab1_p2.x > 5) OR (pagg_tab2_p2.y < 20))
                      ->  Seq Scan on pagg_tab1_p2
                            Filter: (x < 20)
-                     ->  Hash
+               ->  Hash
+                     ->  Append
                            ->  Seq Scan on pagg_tab2_p2
                                  Filter: (y > 10)
-(23 rows)
+                           ->  Seq Scan on pagg_tab2_p3
+                                 Filter: (y > 10)
+(18 rows)
 
 SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20  GROUP BY a.x, b.y ORDER BY 1, 2;
  x  | y  | count 
@@ -760,49 +756,33 @@ SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOI
  18 | 18 |   100
 (7 rows)
 
--- FULL JOIN, with dummy relations on both sides,
+-- FULL JOIN, with dummy relations on both sides, ideally
 -- should produce partial partitionwise aggregation plan as GROUP BY is on
--- nullable columns
+-- nullable columns.
+-- But right now we are unable to do partitionwise join in this case.
 EXPLAIN (COSTS OFF)
 SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a FULL JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20  GROUP BY a.x, b.y ORDER BY 1, 2;
-                                    QUERY PLAN                                     
------------------------------------------------------------------------------------
- Finalize GroupAggregate
-   Group Key: pagg_tab1_p1.x, y
-   ->  Sort
-         Sort Key: pagg_tab1_p1.x, y
-         ->  Append
-               ->  Partial HashAggregate
-                     Group Key: pagg_tab1_p1.x, y
-                     ->  Hash Full Join
-                           Hash Cond: (pagg_tab1_p1.x = y)
-                           Filter: ((pagg_tab1_p1.x > 5) OR (y < 20))
-                           ->  Seq Scan on pagg_tab1_p1
-                                 Filter: (x < 20)
-                           ->  Hash
-                                 ->  Result
-                                       One-Time Filter: false
-               ->  Partial HashAggregate
-                     Group Key: pagg_tab1_p2.x, pagg_tab2_p2.y
-                     ->  Hash Full Join
-                           Hash Cond: (pagg_tab1_p2.x = pagg_tab2_p2.y)
-                           Filter: ((pagg_tab1_p2.x > 5) OR (pagg_tab2_p2.y < 20))
-                           ->  Seq Scan on pagg_tab1_p2
-                                 Filter: (x < 20)
-                           ->  Hash
-                                 ->  Seq Scan on pagg_tab2_p2
-                                       Filter: (y > 10)
-               ->  Partial HashAggregate
-                     Group Key: x, pagg_tab2_p3.y
-                     ->  Hash Full Join
-                           Hash Cond: (pagg_tab2_p3.y = x)
-                           Filter: ((x > 5) OR (pagg_tab2_p3.y < 20))
+                              QUERY PLAN                               
+-----------------------------------------------------------------------
+ Sort
+   Sort Key: pagg_tab1_p1.x, pagg_tab2_p2.y
+   ->  HashAggregate
+         Group Key: pagg_tab1_p1.x, pagg_tab2_p2.y
+         ->  Hash Full Join
+               Hash Cond: (pagg_tab1_p1.x = pagg_tab2_p2.y)
+               Filter: ((pagg_tab1_p1.x > 5) OR (pagg_tab2_p2.y < 20))
+               ->  Append
+                     ->  Seq Scan on pagg_tab1_p1
+                           Filter: (x < 20)
+                     ->  Seq Scan on pagg_tab1_p2
+                           Filter: (x < 20)
+               ->  Hash
+                     ->  Append
+                           ->  Seq Scan on pagg_tab2_p2
+                                 Filter: (y > 10)
                            ->  Seq Scan on pagg_tab2_p3
                                  Filter: (y > 10)
-                           ->  Hash
-                                 ->  Result
-                                       One-Time Filter: false
-(35 rows)
+(18 rows)
 
 SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a FULL JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
  x  | y  | count