author | Tom Lane | 2019-03-30 16:48:19 +0000 |
---|---|---|
committer | Tom Lane | 2019-03-30 16:48:32 +0000 |
commit | 7ad6498fd5a654de6e743814c36cf619a3b5ddb6 (patch) | |
tree | 48f51e4afe4f6bae66b9a7993e6bafce724a1fde /src/test/regress/expected/partition_aggregate.out | |
parent | ef6576f5379edfa29bb4f99880b0f76dd315dd14 (diff) |
Avoid crash in partitionwise join planning under GEQO.

While trying to plan a partitionwise join, we may be faced with cases
where one or both input partitions for a particular segment of the join
have been pruned away. In HEAD and v11, this is problematic because
earlier processing didn't bother to make a pruned RelOptInfo fully
valid. With an upcoming patch to make partition pruning more efficient,
this'll be even more problematic because said RelOptInfo won't exist at
all.

The existing code attempts to deal with this by retroactively making the
RelOptInfo fully valid, but that causes crashes under GEQO because join
planning is done in a short-lived memory context. In v11 we could
probably have fixed this by switching to the planner's main context
while fixing up the RelOptInfo, but that idea doesn't scale well to the
upcoming patch. It would be better not to mess with the base-relation
data structures during join planning, anyway --- that's just a recipe
for order-of-operations bugs.

In many cases, though, we don't actually need the child RelOptInfo,
because if the input is certainly empty then the join segment's result
is certainly empty, so we can skip making a join plan altogether. (The
existing code ultimately arrives at the same conclusion, but only after
doing a lot more work.) This approach works except when the pruned-away
partition is on the nullable side of a LEFT, ANTI, or FULL join, and the
other side isn't pruned. But in those cases the existing code leaves a
lot to be desired anyway --- the correct output is just the result of
the unpruned side of the join, but we were emitting a useless outer join
against a dummy Result. Pending somebody writing code to handle that
more nicely, let's just abandon the partitionwise-join optimization in
such cases.
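
The rule in the preceding paragraph can be sketched roughly as below. This is only an illustration of the decision as the message words it, not the planner's code: the enum names and `classify_segment` are hypothetical, the sketch assumes the nullable side of a LEFT or ANTI join is always the inner input, and the real implementation may be more conservative (for example, when both sides of a FULL join are pruned).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified join kinds; NOT PostgreSQL's JoinType enum. */
typedef enum
{
    SK_JOIN_INNER,
    SK_JOIN_SEMI,
    SK_JOIN_LEFT,
    SK_JOIN_ANTI,
    SK_JOIN_FULL
} SketchJoinType;

/* What to do with one join segment (one pair of matching partitions). */
typedef enum
{
    SEG_PLAN,       /* both inputs survive pruning: plan the child join */
    SEG_SKIP,       /* result is certainly empty: make no join plan */
    SEG_ABANDON     /* give up on partitionwise join for this join */
} SegmentAction;

/*
 * Assumption: for LEFT and ANTI joins the inner input is the nullable side;
 * for FULL joins both inputs are nullable.
 */
SegmentAction
classify_segment(SketchJoinType jointype, bool outer_pruned, bool inner_pruned)
{
    if (!outer_pruned && !inner_pruned)
        return SEG_PLAN;
    if (outer_pruned && inner_pruned)
        return SEG_SKIP;        /* nothing on either side, nothing to emit */

    /* Exactly one input has been pruned away. */
    switch (jointype)
    {
        case SK_JOIN_INNER:
        case SK_JOIN_SEMI:
            return SEG_SKIP;    /* an empty input makes the result empty */
        case SK_JOIN_LEFT:
        case SK_JOIN_ANTI:
            /* Empty outer side: empty result. Empty nullable side: bail. */
            return outer_pruned ? SEG_SKIP : SEG_ABANDON;
        case SK_JOIN_FULL:
            return SEG_ABANDON; /* the pruned side is nullable either way */
    }
    assert(!"unrecognized join type");
    return SEG_ABANDON;
}
```

For instance, `classify_segment(SK_JOIN_LEFT, false, true)` yields `SEG_ABANDON`, which corresponds to the LEFT JOIN regression test in this commit falling back to a non-partitionwise plan.
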
When the modified code skips making a join plan, it doesn't make a
join RelOptInfo either; this requires some upper-level code to
cope with nulls in part_rels[] arrays. We would have had to have
that anyway after the upcoming patch.
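
As a rough illustration of what "coping with nulls in part_rels[]" means for a caller, the sketch below walks an array of child pointers and skips the empty slots. `SketchChildRel` and `sum_live_child_rows` are hypothetical stand-ins invented for this example; they are not PostgreSQL structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-partition join relation. */
typedef struct SketchChildRel
{
    double      rows;           /* estimated row count for this child */
} SketchChildRel;

/*
 * Sum the row estimates of all children that actually exist.  A NULL slot
 * stands for a join segment that was skipped, so it is ignored rather than
 * dereferenced.  If nlive is not NULL, it receives the number of live slots.
 */
double
sum_live_child_rows(SketchChildRel *const *part_rels, int nparts, int *nlive)
{
    double      total = 0.0;
    int         live = 0;

    assert(nparts >= 0);

    for (int i = 0; i < nparts; i++)
    {
        const SketchChildRel *child = part_rels[i];

        if (child == NULL)
            continue;           /* skipped segment: no join rel was built */

        total += child->rows;
        live++;
    }

    if (nlive != NULL)
        *nlive = live;
    return total;
}
```

The point of the sketch is only the `child == NULL` check: any loop over such an array has to tolerate missing entries instead of assuming one relation per partition.
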
Back-patch to v11 since the crash is demonstrable there.

Discussion: https://postgr.es/m/[email protected]
Diffstat (limited to 'src/test/regress/expected/partition_aggregate.out')
-rw-r--r-- | src/test/regress/expected/partition_aggregate.out | 98 |
1 files changed, 39 insertions, 59 deletions
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index 6bc106831ee..e1549cbb5c6 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -716,37 +716,33 @@ SELECT a.x, sum(b.x) FROM pagg_tab1 a FULL OUTER JOIN pagg_tab2 b ON a.x = b.y G
     | 500
 (16 rows)
 
--- LEFT JOIN, with dummy relation on right side,
+-- LEFT JOIN, with dummy relation on right side, ideally
 -- should produce full partitionwise aggregation plan as GROUP BY is on
--- non-nullable columns
+-- non-nullable columns.
+-- But right now we are unable to do partitionwise join in this case.
 EXPLAIN (COSTS OFF)
 SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
-                                 QUERY PLAN
------------------------------------------------------------------------------
+                              QUERY PLAN
+-----------------------------------------------------------------------
  Sort
-   Sort Key: pagg_tab1_p1.x, y
-   ->  Append
-         ->  HashAggregate
-               Group Key: pagg_tab1_p1.x, y
-               ->  Hash Left Join
-                     Hash Cond: (pagg_tab1_p1.x = y)
-                     Filter: ((pagg_tab1_p1.x > 5) OR (y < 20))
+   Sort Key: pagg_tab1_p1.x, pagg_tab2_p2.y
+   ->  HashAggregate
+         Group Key: pagg_tab1_p1.x, pagg_tab2_p2.y
+         ->  Hash Left Join
+               Hash Cond: (pagg_tab1_p1.x = pagg_tab2_p2.y)
+               Filter: ((pagg_tab1_p1.x > 5) OR (pagg_tab2_p2.y < 20))
+               ->  Append
                      ->  Seq Scan on pagg_tab1_p1
                            Filter: (x < 20)
-                     ->  Hash
-                           ->  Result
-                                 One-Time Filter: false
-         ->  HashAggregate
-               Group Key: pagg_tab1_p2.x, pagg_tab2_p2.y
-               ->  Hash Left Join
-                     Hash Cond: (pagg_tab1_p2.x = pagg_tab2_p2.y)
-                     Filter: ((pagg_tab1_p2.x > 5) OR (pagg_tab2_p2.y < 20))
                      ->  Seq Scan on pagg_tab1_p2
                            Filter: (x < 20)
-                     ->  Hash
+               ->  Hash
+                     ->  Append
                            ->  Seq Scan on pagg_tab2_p2
                                  Filter: (y > 10)
-(23 rows)
+                           ->  Seq Scan on pagg_tab2_p3
+                                 Filter: (y > 10)
+(18 rows)
 
 SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
  x | y | count
@@ -760,49 +756,33 @@ SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a LEFT JOI
  18 | 18 | 100
 (7 rows)
 
--- FULL JOIN, with dummy relations on both sides,
+-- FULL JOIN, with dummy relations on both sides, ideally
 -- should produce partial partitionwise aggregation plan as GROUP BY is on
--- nullable columns
+-- nullable columns.
+-- But right now we are unable to do partitionwise join in this case.
 EXPLAIN (COSTS OFF)
 SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a FULL JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
-                                    QUERY PLAN
------------------------------------------------------------------------------------
- Finalize GroupAggregate
-   Group Key: pagg_tab1_p1.x, y
-   ->  Sort
-         Sort Key: pagg_tab1_p1.x, y
-         ->  Append
-               ->  Partial HashAggregate
-                     Group Key: pagg_tab1_p1.x, y
-                     ->  Hash Full Join
-                           Hash Cond: (pagg_tab1_p1.x = y)
-                           Filter: ((pagg_tab1_p1.x > 5) OR (y < 20))
-                           ->  Seq Scan on pagg_tab1_p1
-                                 Filter: (x < 20)
-                           ->  Hash
-                                 ->  Result
-                                       One-Time Filter: false
-               ->  Partial HashAggregate
-                     Group Key: pagg_tab1_p2.x, pagg_tab2_p2.y
-                     ->  Hash Full Join
-                           Hash Cond: (pagg_tab1_p2.x = pagg_tab2_p2.y)
-                           Filter: ((pagg_tab1_p2.x > 5) OR (pagg_tab2_p2.y < 20))
-                           ->  Seq Scan on pagg_tab1_p2
-                                 Filter: (x < 20)
-                           ->  Hash
-                                 ->  Seq Scan on pagg_tab2_p2
-                                       Filter: (y > 10)
-               ->  Partial HashAggregate
-                     Group Key: x, pagg_tab2_p3.y
-                     ->  Hash Full Join
-                           Hash Cond: (pagg_tab2_p3.y = x)
-                           Filter: ((x > 5) OR (pagg_tab2_p3.y < 20))
+                              QUERY PLAN
+-----------------------------------------------------------------------
+ Sort
+   Sort Key: pagg_tab1_p1.x, pagg_tab2_p2.y
+   ->  HashAggregate
+         Group Key: pagg_tab1_p1.x, pagg_tab2_p2.y
+         ->  Hash Full Join
+               Hash Cond: (pagg_tab1_p1.x = pagg_tab2_p2.y)
+               Filter: ((pagg_tab1_p1.x > 5) OR (pagg_tab2_p2.y < 20))
+               ->  Append
+                     ->  Seq Scan on pagg_tab1_p1
+                           Filter: (x < 20)
+                     ->  Seq Scan on pagg_tab1_p2
+                           Filter: (x < 20)
+               ->  Hash
+                     ->  Append
+                           ->  Seq Scan on pagg_tab2_p2
+                                 Filter: (y > 10)
                            ->  Seq Scan on pagg_tab2_p3
                                  Filter: (y > 10)
-                           ->  Hash
-                                 ->  Result
-                                       One-Time Filter: false
-(35 rows)
+(18 rows)
 
 SELECT a.x, b.y, count(*) FROM (SELECT * FROM pagg_tab1 WHERE x < 20) a FULL JOIN (SELECT * FROM pagg_tab2 WHERE y > 10) b ON a.x = b.y WHERE a.x > 5 or b.y < 20 GROUP BY a.x, b.y ORDER BY 1, 2;
  x | y | count