From: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> |
---|---|
To: | Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Asynchronous MergeAppend |
Date: | 2024-08-10 20:24:43 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi! Thank you for your work on this subject! I think this is a very
useful optimization)
While looking through your code, I noticed some points that I think
should be taken into account. Firstly, I noticed only two tests to
verify the functionality of this function and I think that this is not
enough.
Are you thinking about adding some tests with queries involving, for
example, join connections with different tables and unusual operators?
In addition, I have a question about testing your feature on a
benchmark. Are you going to do this?
On 17.07.2024 16:24, Alexander Pyhalov wrote:
> Hello.
>
> I'd like to make MergeAppend node Async-capable like Append node.
> Nowadays when planner chooses MergeAppend plan, asynchronous execution
> is not possible. With attached patches you can see plans like
>
> EXPLAIN (VERBOSE, COSTS OFF)
> SELECT * FROM async_pt WHERE b % 100 = 0 ORDER BY b, a;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------------
>
> Merge Append
> Sort Key: async_pt.b, async_pt.a
> -> Async Foreign Scan on public.async_p1 async_pt_1
> Output: async_pt_1.a, async_pt_1.b, async_pt_1.c
> Remote SQL: SELECT a, b, c FROM public.base_tbl1 WHERE (((b %
> 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST
> -> Async Foreign Scan on public.async_p2 async_pt_2
> Output: async_pt_2.a, async_pt_2.b, async_pt_2.c
> Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE (((b %
> 100) = 0)) ORDER BY b ASC NULLS LAST, a ASC NULLS LAST
>
> This can be quite profitable (in our test cases you can gain up to two
> times better speed with MergeAppend async execution on remote servers).
>
> Code for asynchronous execution in Merge Append was mostly borrowed
> from Append node.
>
> What significantly differs - in ExecMergeAppendAsyncGetNext() you must
> return tuple from the specified slot.
> Subplan number determines tuple slot where data should be retrieved
> to. When subplan is ready to provide some data,
> it's cached in ms_asyncresults. When we get tuple for subplan,
> specified in ExecMergeAppendAsyncGetNext(),
> ExecMergeAppendAsyncRequest() returns true and loop in
> ExecMergeAppendAsyncGetNext() ends. We can fetch data for
> subplans which either don't have cached result ready or have already
> returned them to the upper node. This
> flag is stored in ms_has_asyncresults. As we can get data for some
> subplan either earlier or after loop in ExecMergeAppendAsyncRequest(),
> we check this flag twice in this function.
> Unlike ExecAppendAsyncEventWait(), it seems
> ExecMergeAppendAsyncEventWait() doesn't need a timeout - as there's no
> need to get result
> from synchronous subplan if a tuple form async one was explicitly
> requested.
>
> Also we had to fix postgres_fdw to avoid directly looking at Append
> fields. Perhaps, accesors to Append fields look strange, but allows
> to avoid some code duplication. I suppose, duplication could be even
> less if we reworked async Append implementation, but so far I haven't
> tried to do this to avoid big diff from master.
>
> Also mark_async_capable() believes that path corresponds to plan. This
> can be not true when create_[merge_]append_plan() inserts sort node.
> In this case mark_async_capable() can treat Sort plan node as some
> other and crash, so there's a small fix for this.
I think you should add this explanation to the commit message because
without it it's hard to understand the full picture of how your code works.
--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From | Date | Subject | |
---|---|---|---|
Next Message | Kirill Reshke | 2024-08-10 20:57:35 | Re: Vacuum statistics |
Previous Message | Andrei Zubkov | 2024-08-10 19:37:25 | Re: Vacuum statistics |