Lists: | pgsql-hackers |
---|
From: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-08-31 16:05:47 |
Message-ID: | CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
It seems like this patch is nearly finished. I fixed all the remaining
issues. I'm also asking
a confirmation of the test scenarios you want to see in the next
version of the patch.
> Hi,
>
> On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote:
> > Totally unasked for, here's a rebase of this patch series. I didn't do
> > anything other than rebasing to current master, solving a couple of very
> > trivial conflicts, fixing some whitespace complaints by git apply, and
> > running tests to verify everthing works.
> >
> > I don't foresee working on this at all, so if anyone is interested in
> > seeing this feature in, I encourage them to read and address
> > Horiguchi-san's feedback.
>
> Nor am I planning to do so, but I do think its a pretty important
> improvement.
>
>
Fixed
>
>
> > +/*
> > + * PQrecyclePipelinedCommand
> > + * Push a command queue entry onto the freelist. It must be a dangling entry
> > + * with null next pointer and not referenced by any other entry's next pointer.
> > + */
>
> Dangling sounds a bit like it's already freed.
>
>
Fixed
>
>
> > +/*
> > + * PQbatchSendQueue
> > + * End a batch submission by sending a protocol sync. The connection will
> > + * remain in batch mode and unavailable for new synchronous command execution
> > + * functions until all results from the batch are processed by the client.
>
> I feel like the reference to the protocol sync is a bit too low level
> for an external API. It should first document what the function does
> from a user's POV.
>
> I think it'd also be good to document whether / whether not queries can
> already have been sent before PQbatchSendQueue is called or not.
>
Fixed
>
>
>
> > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass != PGQUERY_SYNC)
> > + {
> > + /*
> > + * In an aborted batch we don't get anything from the server for each
> > + * result; we're just discarding input until we get to the next sync
> > + * from the server. The client needs to know its queries got aborted
> > + * so we create a fake PGresult to return immediately from
> > + * PQgetResult.
> > + */
> > + conn->result = PQmakeEmptyPGresult(conn,
> > + PGRES_BATCH_ABORTED);
> > + if (!conn->result)
> > + {
> > + printfPQExpBuffer(&conn->errorMessage,
> > + libpq_gettext("out of memory"));
> > + pqSaveErrorResult(conn);
> > + return 0;
>
> Is there any way an application can recover at this point? ISTM we'd be
> stuck in the previous asyncStatus, no?
>
conn->result is null when malloc(sizeof(PGresult)) returns null. It's
very unlikely unless
the server machine is out of memory, so the server will probably be
unresponsive anyway.
I'm leaving this as it is but if anyone has a solution simple to
implement I'll fix it.
>
>
> > +/* pqBatchFlush
> > + * In batch mode, data will be flushed only when the out buffer reaches the threshold value.
> > + * In non-batch mode, data will be flushed all the time.
> > + */
> > +static int
> > +pqBatchFlush(PGconn *conn)
> > +{
> > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >= OUTBUFFER_THRESHOLD))
> > + return(pqFlush(conn));
> > + return 0; /* Just to keep compiler quiet */
> > +}
>
> unnecessarily long line.
>
Fixed
>
> > +/*
> > + * Connection's outbuffer threshold is set to 64k as it is safe
> > + * in Windows as per comments in pqSendSome() API.
> > + */
> > +#define OUTBUFFER_THRESHOLD 65536
>
> I don't think the comment explains much. It's fine to send more than 64k
> with pqSendSome(), they'll just be send with separate pgsecure_write()
> invocations. And only on windows.
>
> It clearly makes sense to start sending out data at a certain
> granularity to avoid needing unnecessary amounts of memory, and to make
> more efficient use of latency / serer side compute.
>
> It's not implausible that 64k is the right amount for that, I just don't
> think the explanation above is good.
>
Fixed
> > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c b/src/test/modules/test_libpq/testlibpqbatch.c
> > new file mode 100644
> > index 0000000000..4d6ba266e5
> > --- /dev/null
> > +++ b/src/test/modules/test_libpq/testlibpqbatch.c
> > @@ -0,0 +1,1456 @@
> > +/*
> > + * src/test/modules/test_libpq/testlibpqbatch.c
> > + *
> > + *
> > + * testlibpqbatch.c
> > + * Test of batch execution functionality
> > + */
> > +
> > +#ifdef WIN32
> > +#include <windows.h>
> > +#endif
>
> ISTM that this shouldn't be needed in a test program like this?
> Shouldn't libpq abstract all of this away?
>
Fixed.
>
> > +static void
> > +simple_batch(PGconn *conn)
> > +{
>
> ISTM that all or at least several of these should include tests of
> transactional behaviour with pipelining (e.g. using both implicit and
> explicit transactions inside a single batch, using transactions across
> batches, multiple explicit transactions inside a batch).
>
@Andres, just to make sure I understood, here is the test scenarios I'll add:
Implicit and explicit multiple transactions:
start batch:
create_table
insert X
begin transaction
insert X
commit transaction
begin transaction
insert Y
commit transaction
end batch
Transaction across batches:
start batch:
create_table
begin transaction
insert X
end batch
start batch:
insert Y
commit transaction
end batch
>
>
> > + PGresult *res = NULL;
> > + const char *dummy_params[1] = {"1"};
> > + Oid dummy_param_oids[1] = {INT4OID};
> > +
> > + fprintf(stderr, "simple batch... ");
> > + fflush(stderr);
>
> Why do we need fflush()? IMO that shouldn't be needed in a use like
> this? Especially not on stderr, which ought to be unbuffered?
>
Removed.
>
> > + /*
> > + * Enter batch mode and dispatch a set of operations, which we'll then
> > + * process the results of as they come in.
> > + *
> > + * For a simple case we should be able to do this without interim
> > + * processing of results since our out buffer will give us enough slush to
> > + * work with and we won't block on sending. So blocking mode is fine.
> > + */
> > + if (PQisnonblocking(conn))
> > + {
> > + fprintf(stderr, "Expected blocking connection mode\n");
> > + goto fail;
> > + }
>
> Perhaps worth adding a helper for this?
>
> #define EXPECT(condition, explanation) \
>
Fixed (and saved 600 lines :).
Once I get your confirmation of the test scenarios, I'll implement
them and share another patch.
Attachment | Content-Type | Size |
---|---|---|
libpq-batch-v19.patch | application/x-patch | 86.3 KB |
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-09-04 21:26:03 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2020-Aug-31, Matthieu Garrigues wrote:
> It seems like this patch is nearly finished. I fixed all the remaining
> issues. I'm also asking a confirmation of the test scenarios you want
> to see in the next version of the patch.
Hmm, apparently nobody noticed that this patch is not registered in the
current commitfest.
Since it was submitted ahead of the deadline, I'm going to add it there.
(The patch has also undergone a lot of review already; it's certainly
not a newcomer.)
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
---|---|
To: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-09-21 12:38:07 |
Message-ID: | CADK3HHKA+m1_A_Da7z=qXwNdLzPsrJSSievwMCSzkhdShXfJcg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Alvaro,
On Fri, 4 Sep 2020 at 17:26, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
wrote:
> On 2020-Aug-31, Matthieu Garrigues wrote:
>
> > It seems like this patch is nearly finished. I fixed all the remaining
> > issues. I'm also asking a confirmation of the test scenarios you want
> > to see in the next version of the patch.
>
> Hmm, apparently nobody noticed that this patch is not registered in the
> current commitfest.
>
> Since it was submitted ahead of the deadline, I'm going to add it there.
> (The patch has also undergone a lot of review already; it's certainly
> not a newcomer.)
>
I am looking for this in the commitfest and can't find it. However there is
an old commitfest entry
https://commitfest.postgresql.org/13/1024/
Do you have the link for the new one ?
Dave Cramer
www.postgres.rocks
>
>
>
From: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
---|---|
To: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-09-21 13:08:49 |
Message-ID: | CADK3HHLN5ASE+fOThJtGGJr_86kgOM54-H2npLZqHEQbO7fLSQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Dave Cramer
www.postgres.rocks
On Mon, 31 Aug 2020 at 11:46, Matthieu Garrigues <
matthieu(dot)garrigues(at)gmail(dot)com> wrote:
> Hi,
>
> It seems like this patch is nearly finished. I fixed all the remaining
> issues. I'm also asking
> a confirmation of the test scenarios you want to see in the next
> version of the patch.
>
> > Hi,
> >
> > On 2020-07-10 19:01:49 -0400, Alvaro Herrera wrote:
> > > Totally unasked for, here's a rebase of this patch series. I didn't do
> > > anything other than rebasing to current master, solving a couple of
> very
> > > trivial conflicts, fixing some whitespace complaints by git apply, and
> > > running tests to verify everthing works.
> > >
> > > I don't foresee working on this at all, so if anyone is interested in
> > > seeing this feature in, I encourage them to read and address
> > > Horiguchi-san's feedback.
> >
> > Nor am I planning to do so, but I do think its a pretty important
> > improvement.
> >
> >
> Fixed
>
> >
> >
> > > +/*
> > > + * PQrecyclePipelinedCommand
> > > + * Push a command queue entry onto the freelist. It must be a
> dangling entry
> > > + * with null next pointer and not referenced by any other entry's
> next pointer.
> > > + */
> >
> > Dangling sounds a bit like it's already freed.
> >
> >
> Fixed
>
> >
> >
> > > +/*
> > > + * PQbatchSendQueue
> > > + * End a batch submission by sending a protocol sync. The connection
> will
> > > + * remain in batch mode and unavailable for new synchronous command
> execution
> > > + * functions until all results from the batch are processed by the
> client.
> >
> > I feel like the reference to the protocol sync is a bit too low level
> > for an external API. It should first document what the function does
> > from a user's POV.
> >
> > I think it'd also be good to document whether / whether not queries can
> > already have been sent before PQbatchSendQueue is called or not.
> >
> Fixed
>
> >
> >
> >
> > > + if (conn->batch_status == PQBATCH_MODE_ABORTED && conn->queryclass
> != PGQUERY_SYNC)
> > > + {
> > > + /*
> > > + * In an aborted batch we don't get anything from the server for each
> > > + * result; we're just discarding input until we get to the next sync
> > > + * from the server. The client needs to know its queries got aborted
> > > + * so we create a fake PGresult to return immediately from
> > > + * PQgetResult.
> > > + */
> > > + conn->result = PQmakeEmptyPGresult(conn,
> > > + PGRES_BATCH_ABORTED);
> > > + if (!conn->result)
> > > + {
> > > + printfPQExpBuffer(&conn->errorMessage,
> > > + libpq_gettext("out of memory"));
> > > + pqSaveErrorResult(conn);
> > > + return 0;
> >
> > Is there any way an application can recover at this point? ISTM we'd be
> > stuck in the previous asyncStatus, no?
> >
>
> conn->result is null when malloc(sizeof(PGresult)) returns null. It's
> very unlikely unless
> the server machine is out of memory, so the server will probably be
> unresponsive anyway.
>
> I'm leaving this as it is but if anyone has a solution simple to
> implement I'll fix it.
>
> >
> >
> > > +/* pqBatchFlush
> > > + * In batch mode, data will be flushed only when the out buffer
> reaches the threshold value.
> > > + * In non-batch mode, data will be flushed all the time.
> > > + */
> > > +static int
> > > +pqBatchFlush(PGconn *conn)
> > > +{
> > > + if ((conn->batch_status == PQBATCH_MODE_OFF)||(conn->outCount >=
> OUTBUFFER_THRESHOLD))
> > > + return(pqFlush(conn));
> > > + return 0; /* Just to keep compiler quiet */
> > > +}
> >
> > unnecessarily long line.
> >
> Fixed
>
> >
> > > +/*
> > > + * Connection's outbuffer threshold is set to 64k as it is safe
> > > + * in Windows as per comments in pqSendSome() API.
> > > + */
> > > +#define OUTBUFFER_THRESHOLD 65536
> >
> > I don't think the comment explains much. It's fine to send more than 64k
> > with pqSendSome(), they'll just be send with separate pgsecure_write()
> > invocations. And only on windows.
> >
> > It clearly makes sense to start sending out data at a certain
> > granularity to avoid needing unnecessary amounts of memory, and to make
> > more efficient use of latency / serer side compute.
> >
> > It's not implausible that 64k is the right amount for that, I just don't
> > think the explanation above is good.
> >
>
> Fixed
>
> > > diff --git a/src/test/modules/test_libpq/testlibpqbatch.c
> b/src/test/modules/test_libpq/testlibpqbatch.c
> > > new file mode 100644
> > > index 0000000000..4d6ba266e5
> > > --- /dev/null
> > > +++ b/src/test/modules/test_libpq/testlibpqbatch.c
> > > @@ -0,0 +1,1456 @@
> > > +/*
> > > + * src/test/modules/test_libpq/testlibpqbatch.c
> > > + *
> > > + *
> > > + * testlibpqbatch.c
> > > + * Test of batch execution functionality
> > > + */
> > > +
> > > +#ifdef WIN32
> > > +#include <windows.h>
> > > +#endif
> >
> > ISTM that this shouldn't be needed in a test program like this?
> > Shouldn't libpq abstract all of this away?
> >
>
> Fixed.
>
> >
> > > +static void
> > > +simple_batch(PGconn *conn)
> > > +{
> >
> > ISTM that all or at least several of these should include tests of
> > transactional behaviour with pipelining (e.g. using both implicit and
> > explicit transactions inside a single batch, using transactions across
> > batches, multiple explicit transactions inside a batch).
> >
>
> @Andres, just to make sure I understood, here is the test scenarios I'll
> add:
>
> Implicit and explicit multiple transactions:
> start batch:
> create_table
> insert X
> begin transaction
> insert X
> commit transaction
> begin transaction
> insert Y
> commit transaction
> end batch
>
> Transaction across batches:
> start batch:
> create_table
> begin transaction
> insert X
> end batch
> start batch:
> insert Y
> commit transaction
> end batch
>
> >
> >
> > > + PGresult *res = NULL;
> > > + const char *dummy_params[1] = {"1"};
> > > + Oid dummy_param_oids[1] = {INT4OID};
> > > +
> > > + fprintf(stderr, "simple batch... ");
> > > + fflush(stderr);
> >
> > Why do we need fflush()? IMO that shouldn't be needed in a use like
> > this? Especially not on stderr, which ought to be unbuffered?
> >
> Removed.
>
> >
> > > + /*
> > > + * Enter batch mode and dispatch a set of operations, which we'll then
> > > + * process the results of as they come in.
> > > + *
> > > + * For a simple case we should be able to do this without interim
> > > + * processing of results since our out buffer will give us enough
> slush to
> > > + * work with and we won't block on sending. So blocking mode is fine.
> > > + */
> > > + if (PQisnonblocking(conn))
> > > + {
> > > + fprintf(stderr, "Expected blocking connection mode\n");
> > > + goto fail;
> > > + }
> >
> > Perhaps worth adding a helper for this?
> >
> > #define EXPECT(condition, explanation) \
> >
>
> Fixed (and saved 600 lines :).
>
>
>
> Once I get your confirmation of the test scenarios, I'll implement
> them and share another patch.
>
There was a comment upthread a while back that people should look at the
comments made in
https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp
by Horiguchi-San.
From what I can tell this has not been addressed. The one big thing is the
use of PQbatchProcessQueue vs just putting it in PQgetResult.
The argument is that adding PQbatchProcessQueue is unnecessary and just
adds another step. Looking at this, it seems like putting this inside
PQgetResult would get my vote as it leaves the interface unchanged.
Dave
From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-09-21 13:21:11 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2020-Sep-21, Dave Cramer wrote:
Hello Dave,
> I am looking for this in the commitfest and can't find it. However there is
> an old commitfest entry
>
> https://commitfest.postgresql.org/13/1024/
>
> Do you have the link for the new one ?
Here you go:
https://commitfest.postgresql.org/29/2724/
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
---|---|
To: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-09-21 13:39:23 |
Message-ID: | CADK3HH+3K3q=pOWjShDPSP8vgKBZ=wP2tSCi1fSk7nXQvvCg1A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, 21 Sep 2020 at 09:21, Matthieu Garrigues <
matthieu(dot)garrigues(at)gmail(dot)com> wrote:
> Matthieu Garrigues
>
> On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer <davecramer(at)postgres(dot)rocks>
> wrote:
> >>
> > There was a comment upthread a while back that people should look at the
> comments made in
> https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp
> by Horiguchi-San.
> >
> > From what I can tell this has not been addressed. The one big thing is
> the use of PQbatchProcessQueue vs just putting it in PQgetResult.
> >
> > The argument is that adding PQbatchProcessQueue is unnecessary and just
> adds another step. Looking at this, it seems like putting this inside
> PQgetResult would get my vote as it leaves the interface unchanged.
> >
>
> Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one
> thing: I'll keep PQgetResult returning null between the result of two
> batched query so the user
> can know which result comes from which query.
>
Fair enough.
There may be other things in his comments that need to be addressed. That
was the big one that stuck out for me.
Thanks for working on this!
Dave Cramer
www.postgres.rocks
From: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
---|---|
To: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-09-21 13:41:13 |
Message-ID: | CAJkzx4S+hwDpJm95HT8e-m3RXfNECgBKrdiR1pN6wu8B2Xy8cA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Matthieu Garrigues
On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer <davecramer(at)postgres(dot)rocks> wrote:
>>
> There was a comment upthread a while back that people should look at the comments made in https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp by Horiguchi-San.
>
> From what I can tell this has not been addressed. The one big thing is the use of PQbatchProcessQueue vs just putting it in PQgetResult.
>
> The argument is that adding PQbatchProcessQueue is unnecessary and just adds another step. Looking at this, it seems like putting this inside PQgetResult would get my vote as it leaves the interface unchanged.
>
Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one
thing: I'll keep PQgetResult returning null between the result of two
batched query so the user
can know which result comes from which query.
From: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
---|---|
To: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-09-21 14:22:51 |
Message-ID: | CAJkzx4R=j9wTA2SDh0MfnsH8Q8erpd2goq-07dPxWR5mnLo3BA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, Sep 21, 2020 at 3:39 PM Dave Cramer <davecramer(at)postgres(dot)rocks> wrote:
>
>
>
> On Mon, 21 Sep 2020 at 09:21, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> wrote:
>>
>> Matthieu Garrigues
>>
>> On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer <davecramer(at)postgres(dot)rocks> wrote:
>> >>
>> > There was a comment upthread a while back that people should look at the comments made in https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp by Horiguchi-San.
>> >
>> > From what I can tell this has not been addressed. The one big thing is the use of PQbatchProcessQueue vs just putting it in PQgetResult.
>> >
>> > The argument is that adding PQbatchProcessQueue is unnecessary and just adds another step. Looking at this, it seems like putting this inside PQgetResult would get my vote as it leaves the interface unchanged.
>> >
>>
>> Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one
>> thing: I'll keep PQgetResult returning null between the result of two
>> batched query so the user
>> can know which result comes from which query.
>
>
> Fair enough.
>
> There may be other things in his comments that need to be addressed. That was the big one that stuck out for me.
>
> Thanks for working on this!
>
Yes I already addressed the other things in the v19 patch:
https://www.postgresql.org/message-id/flat/CAJkzx4T5E-2cQe3dtv2R78dYFvz+in8PY7A8MArvLhs_pg75gg(at)mail(dot)gmail(dot)com
From: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
---|---|
To: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-09-21 17:55:03 |
Message-ID: | CAJkzx4TWfkXBkh=KTonhWSXvmpBQV5EdL64j_KY9FMP_BmBwLA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi Dave,
I merged PQbatchProcessQueue into PQgetResult.
One first init call to PQbatchProcessQueue was also required in
PQsendQueue to have
PQgetResult ready to read the first batch query.
Tests and documentation are updated accordingly.
Matthieu Garrigues
On Mon, Sep 21, 2020 at 3:39 PM Dave Cramer <davecramer(at)postgres(dot)rocks> wrote:
>
>
>
> On Mon, 21 Sep 2020 at 09:21, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> wrote:
>>
>> Matthieu Garrigues
>>
>> On Mon, Sep 21, 2020 at 3:09 PM Dave Cramer <davecramer(at)postgres(dot)rocks> wrote:
>> >>
>> > There was a comment upthread a while back that people should look at the comments made in https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp by Horiguchi-San.
>> >
>> > From what I can tell this has not been addressed. The one big thing is the use of PQbatchProcessQueue vs just putting it in PQgetResult.
>> >
>> > The argument is that adding PQbatchProcessQueue is unnecessary and just adds another step. Looking at this, it seems like putting this inside PQgetResult would get my vote as it leaves the interface unchanged.
>> >
>>
>> Ok. I'll merge PQbatchProcessQueue into PQgetResult. But just one
>> thing: I'll keep PQgetResult returning null between the result of two
>> batched query so the user
>> can know which result comes from which query.
>
>
> Fair enough.
>
> There may be other things in his comments that need to be addressed. That was the big one that stuck out for me.
>
> Thanks for working on this!
>
>
> Dave Cramer
> www.postgres.rocks
Attachment | Content-Type | Size |
---|---|---|
libpq-batch-v20.patch | text/x-patch | 84.6 KB |
From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
Cc: | Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-10-01 04:35:21 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, Sep 21, 2020 at 07:55:03PM +0200, Matthieu Garrigues wrote:
> I merged PQbatchProcessQueue into PQgetResult.
> One first init call to PQbatchProcessQueue was also required in
> PQsendQueue to have
> PQgetResult ready to read the first batch query.
>
> Tests and documentation are updated accordingly.
The documentation is failing to build, and the patch does not build
correctly on Windows. Could you address that?
--
Michael
From: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-10-01 06:41:27 |
Message-ID: | CAJkzx4S_5xXYqGV43BC_xPs=xV3kmtAzT+yncZ+C5KgATkeTUA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, Oct 1, 2020 at 6:35 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> The documentation is failing to build, and the patch does not build
> correctly on Windows. Could you address that?
> --
> Michael
Yes I'm on it.
--
Matthieu
From: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
---|---|
To: | Michael Paquier <michael(at)paquier(dot)xyz> |
Cc: | Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-10-01 08:13:44 |
Message-ID: | CAJkzx4THDPGJ8RmNs0tA0y8g-ET5UTxbD2PJ=Wsc8-LN5zLadQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
This patch fixes compilation on windows and compilation of the documentation.
Matthieu Garrigues
On Thu, Oct 1, 2020 at 8:41 AM Matthieu Garrigues
<matthieu(dot)garrigues(at)gmail(dot)com> wrote:
>
> On Thu, Oct 1, 2020 at 6:35 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> > The documentation is failing to build, and the patch does not build
> > correctly on Windows. Could you address that?
> > --
> > Michael
>
> Yes I'm on it.
>
> --
> Matthieu
Attachment | Content-Type | Size |
---|---|---|
libpq-batch-v21.patch | text/x-patch | 87.6 KB |
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-10-26 19:09:36 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
I started reading this patch. As a starting point I decided to post an
updated copy (v22), wherein I reverted the overwriting of
src/include/port/linux.h with the win32.h contents (?) and the inclusion
of <windows.h> in libpq-fe.h. If those are needed, some different
solution will have to be found.
Trivial other changes (pgindent etc); nothing of substance. With the
PQtrace() patch I posted at [1] the added test program crashes -- I
don't know if the fault lies in this patch or the other one.
[1] https://postgr.es/m/[email protected]
Attachment | Content-Type | Size |
---|---|---|
v22-0001-libpq-batch.patch | text/x-diff | 86.7 KB |
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-02 15:45:25 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
In v23 I've gone over docs; discovered that PQgetResults docs were
missing the new values. Added those. No significant other changes yet.
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-02 15:57:52 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2020-Nov-02, Alvaro Herrera wrote:
> In v23 I've gone over docs; discovered that PQgetResults docs were
> missing the new values. Added those. No significant other changes yet.
Attachment | Content-Type | Size |
---|---|---|
v23-0001-libpq-batch.patch | text/x-diff | 87.5 KB |
From: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-03 13:20:03 |
Message-ID: | CADK3HHKZ5uagbFs2-VFyQU30+VOVZ30VdU-4UQkrSmFkT-tFbQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Alvaro,
On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> On 2020-Nov-02, Alvaro Herrera wrote:
>
> > In v23 I've gone over docs; discovered that PQgetResults docs were
> > missing the new values. Added those. No significant other changes yet.
>
>
>
Thanks for looking at this.
What else does it need to get it in shape to apply?
Dave Cramer
www.postgres.rocks
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-03 13:42:34 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi Dave,
On 2020-Nov-03, Dave Cramer wrote:
> On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
> > On 2020-Nov-02, Alvaro Herrera wrote:
> >
> > > In v23 I've gone over docs; discovered that PQgetResults docs were
> > > missing the new values. Added those. No significant other changes yet.
>
> Thanks for looking at this.
>
> What else does it need to get it in shape to apply?
I want to go over the code in depth to grok the design more fully.
It would definitely help if you (and others) could think about the API
being added: Does it fulfill the promises being made? Does it offer the
guarantees that real-world apps want to have? I'm not much of an
application writer myself -- particularly high-traffic apps that would
want to use this. As a driver author I would welcome your insight in
these questions.
From: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-03 15:56:31 |
Message-ID: | CADK3HHKHdNX188Ey5FX0GednszqH77mNa9ihX0ed-QKX8SEnUg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Tue, 3 Nov 2020 at 08:42, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> Hi Dave,
>
> On 2020-Nov-03, Dave Cramer wrote:
>
> > On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
> wrote:
> >
> > > On 2020-Nov-02, Alvaro Herrera wrote:
> > >
> > > > In v23 I've gone over docs; discovered that PQgetResults docs were
> > > > missing the new values. Added those. No significant other changes
> yet.
> >
> > Thanks for looking at this.
> >
> > What else does it need to get it in shape to apply?
>
> I want to go over the code in depth to grok the design more fully.
>
> It would definitely help if you (and others) could think about the API
> being added: Does it fulfill the promises being made? Does it offer the
> guarantees that real-world apps want to have? I'm not much of an
> application writer myself -- particularly high-traffic apps that would
> want to use this. As a driver author I would welcome your insight in
> these questions.
>
>
I'm sort of in the same boat as you. While I'm closer to the client. I
don't personally write that much client code.
I'd really like to hear from the users here.
Dave Cramer
www.postgres.rocks
From: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
---|---|
To: | Dave Cramer <davecramer(at)postgres(dot)rocks> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-03 16:44:59 |
Message-ID: | CAJkzx4RGBwQ34kbKSEi0yTYxvYpWRZA0mMvz3JB-dwG3XwZBYg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
I implemented a C++ async HTTP server using this new batch mode and it
provides everything I needed to transparently batch sql requests.
It gives a performance boost between x2 and x3 on this benchmark:
https://www.techempower.com/benchmarks/#section=test&runid=3097dbae-5228-454c-ba2e-2055d3982790&hw=ph&test=query&a=2&f=zik0zj-zik0zj-zik0zj-zik0zj-zieepr-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj-zik0zj
I'll ask other users interested in this to review the API.
Matthieu Garrigues
On Tue, Nov 3, 2020 at 4:56 PM Dave Cramer <davecramer(at)postgres(dot)rocks> wrote:
>
>
>
> On Tue, 3 Nov 2020 at 08:42, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>>
>> Hi Dave,
>>
>> On 2020-Nov-03, Dave Cramer wrote:
>>
>> > On Mon, 2 Nov 2020 at 10:57, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>> >
>> > > On 2020-Nov-02, Alvaro Herrera wrote:
>> > >
>> > > > In v23 I've gone over docs; discovered that PQgetResults docs were
>> > > > missing the new values. Added those. No significant other changes yet.
>> >
>> > Thanks for looking at this.
>> >
>> > What else does it need to get it in shape to apply?
>>
>> I want to go over the code in depth to grok the design more fully.
>>
>> It would definitely help if you (and others) could think about the API
>> being added: Does it fulfill the promises being made? Does it offer the
>> guarantees that real-world apps want to have? I'm not much of an
>> application writer myself -- particularly high-traffic apps that would
>> want to use this. As a driver author I would welcome your insight in
>> these questions.
>>
>
> I'm sort of in the same boat as you. While I'm closer to the client. I don't personally write that much client code.
>
> I'd really like to hear from the users here.
>
>
> Dave Cramer
> www.postgres.rocks
From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-03 17:20:53 |
Message-ID: | CAKFQuwZTNgNwnHdxQnQnOpKfyX5KW6twgj23ZNUZByDVgmdi7w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Mon, Nov 2, 2020 at 8:58 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:
> On 2020-Nov-02, Alvaro Herrera wrote:
>
> > In v23 I've gone over docs; discovered that PQgetResults docs were
> > missing the new values. Added those. No significant other changes yet.
>
>
Just reading the documentation of this patch, haven't been following the
longer thread:
Given the caveats around blocking mode connections why not just require
non-blocking mode, in a similar fashion to how synchronous functions are
disallowed?
"Batched operations will be executed by the server in the order the client
sends them. The server will send the results in the order the statements
executed."
Maybe:
"The server executes statements, and returns results, in the order the
client sends them."
Using two sentences and relying on the user to mentally link the two "in
the order" descriptions together seems to add unnecessary cognitive load.
+ The client <link linkend="libpq-batch-interleave">interleaves result
+ processing</link> with sending batch queries, or for small batches may
+ process all results after sending the whole batch.
Suggest: "The client may choose to interleave result processing with
sending batch queries, or wait until the complete batch has been sent."
I would expect to process the results of a batch only after sending the
entire batch to the server. That I don't have to is informative but
knowing when I should avoid doing so, and why, is informative as well. To
the extreme while you can use batch mode and interleave if you just poll
getResult after every command you will make the whole batch thing
pointless. Directing the reader from here to the section "Interleaving
Result Processing and Query Dispatch" seems worth considering. The
dynamics of small sizes and sockets remains a bit unclear as to what will
break (if anything, or is it just process memory on the server) if
interleaving it not performed and sizes are large.
I would suggest placing commentary about "all transactions subsequent to a
failed transaction in a batch are ignored while previous completed
transactions are retained" in the "When to Use Batching". Something like
"Batching is less useful, and more complex, when a single batch contains
multiple transactions (see Error Handling)."
My imagined use case would be to open a batch, start a transaction, send
all of its components, end the transaction, end the batch, check for batch
failure and if it doesn't fail have the option to easily continue without
processing individual pgResults (or if it does fail, have the option to
extract the first error pgResult and continue, ignoring the rest, knowing
that the transaction as a whole was reverted and the batch unapplied).
I've never interfaced with libpq directly. Though given how the existing C
API works what is implemented here seems consistent.
The "queueing up queries into a pipeline to be executed as a batch on the
server" can be read as a client-side behavior where nothing is sent to the
server until the batch has been completed. Reading further it becomes
clear that all it basically is is a sever-side toggle that instructs the
server to continue processing incoming commands even while prior commands
have their results waiting to be ingested by the client.
Batch seems like the user-visible term to describe this feature. Pipeline
seems like an implementation detail that doesn't need to be mentioned in
the documentation - especially given that pipeline doesn't get a mentioned
beyond the first two paragraphs of the chapter and never without being
linked directly to "batch". I would probably leave the indexterm and have
a paragraph describing that batching is implemented using a query pipeline
so that people with the implementation detail on their mind can find this
chapter, but the prose for the user should just stick to batching.
Sorry, that all is a bit unfocused, but the documentation for the user of
the API could be cleaned up a bit and some more words spent on
what trade-offs are being made when using batching versus normal
command-response processing. That said, while I don't see all of this
purely a matter of style I'm also not seeing anything demonstrably wrong
with the documentation at the moment. Hopefully my perspective helps
though, and depending on what happens next I may try and make my thoughts
more concrete with an actual patch.
David J.
From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Dave Cramer <davecramer(at)postgres(dot)rocks>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-11 22:46:30 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
On 2020-11-03 10:42:34 -0300, Alvaro Herrera wrote:
> It would definitely help if you (and others) could think about the API
> being added: Does it fulfill the promises being made? Does it offer the
> guarantees that real-world apps want to have? I'm not much of an
> application writer myself -- particularly high-traffic apps that would
> want to use this.
Somewhere earlier in this thread there was a patch with support for
batching in pgbench. I think it'd be good to refresh that. Both because
it shows at least some real-world-lite usage of the feature and because
we need a way to stress it to see whether it has unnecessary
bottlenecks.
Greetings,
Andres Freund
From: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-12 09:46:54 |
Message-ID: | CAJkzx4S6YTKAJrasb4hSaM8tNHNrgUk-bxB4Da7sT++D9Zq7cg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi David,
Thanks for the feedback. I did rework a bit the doc based on your
remarks. Here is the v24 patch.
Matthieu Garrigues
On Tue, Nov 3, 2020 at 6:21 PM David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> On Mon, Nov 2, 2020 at 8:58 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>>
>> On 2020-Nov-02, Alvaro Herrera wrote:
>>
>> > In v23 I've gone over docs; discovered that PQgetResults docs were
>> > missing the new values. Added those. No significant other changes yet.
>>
>
> Just reading the documentation of this patch, haven't been following the longer thread:
>
> Given the caveats around blocking mode connections why not just require non-blocking mode, in a similar fashion to how synchronous functions are disallowed?
>
> "Batched operations will be executed by the server in the order the client
> sends them. The server will send the results in the order the statements
> executed."
>
> Maybe:
>
> "The server executes statements, and returns results, in the order the client sends them."
>
> Using two sentences and relying on the user to mentally link the two "in the order" descriptions together seems to add unnecessary cognitive load.
>
> + The client <link linkend="libpq-batch-interleave">interleaves result
> + processing</link> with sending batch queries, or for small batches may
> + process all results after sending the whole batch.
>
> Suggest: "The client may choose to interleave result processing with sending batch queries, or wait until the complete batch has been sent."
>
> I would expect to process the results of a batch only after sending the entire batch to the server. That I don't have to is informative but knowing when I should avoid doing so, and why, is informative as well. To the extreme while you can use batch mode and interleave if you just poll getResult after every command you will make the whole batch thing pointless. Directing the reader from here to the section "Interleaving Result Processing and Query Dispatch" seems worth considering. The dynamics of small sizes and sockets remains a bit unclear as to what will break (if anything, or is it just process memory on the server) if interleaving it not performed and sizes are large.
>
> I would suggest placing commentary about "all transactions subsequent to a failed transaction in a batch are ignored while previous completed transactions are retained" in the "When to Use Batching". Something like "Batching is less useful, and more complex, when a single batch contains multiple transactions (see Error Handling)."
>
> My imagined use case would be to open a batch, start a transaction, send all of its components, end the transaction, end the batch, check for batch failure and if it doesn't fail have the option to easily continue without processing individual pgResults (or if it does fail, have the option to extract the first error pgResult and continue, ignoring the rest, knowing that the transaction as a whole was reverted and the batch unapplied). I've never interfaced with libpq directly. Though given how the existing C API works what is implemented here seems consistent.
>
> The "queueing up queries into a pipeline to be executed as a batch on the server" can be read as a client-side behavior where nothing is sent to the server until the batch has been completed. Reading further it becomes clear that all it basically is is a sever-side toggle that instructs the server to continue processing incoming commands even while prior commands have their results waiting to be ingested by the client.
>
> Batch seems like the user-visible term to describe this feature. Pipeline seems like an implementation detail that doesn't need to be mentioned in the documentation - especially given that pipeline doesn't get a mentioned beyond the first two paragraphs of the chapter and never without being linked directly to "batch". I would probably leave the indexterm and have a paragraph describing that batching is implemented using a query pipeline so that people with the implementation detail on their mind can find this chapter, but the prose for the user should just stick to batching.
>
> Sorry, that all is a bit unfocused, but the documentation for the user of the API could be cleaned up a bit and some more words spent on what trade-offs are being made when using batching versus normal command-response processing. That said, while I don't see all of this purely a matter of style I'm also not seeing anything demonstrably wrong with the documentation at the moment. Hopefully my perspective helps though, and depending on what happens next I may try and make my thoughts more concrete with an actual patch.
>
> David J.
>
Attachment | Content-Type | Size |
---|---|---|
v24-libpq-batch.patch | text/x-patch | 86.1 KB |
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-12 13:40:43 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
(Adding previous reviewers to CC)
On 2020-Nov-03, David G. Johnston wrote:
> Given the caveats around blocking mode connections why not just require
> non-blocking mode, in a similar fashion to how synchronous functions are
> disallowed?
This is a very good question. Why indeed? Does anybody have a good
answer to this? If not, I propose we just require that non-blocking
mode is in use in order for batch mode to be used.
I've been doing a review pass over this patch and have an updated
version, which I intend to share later today (after I fix what appears
to be a misunderstanding in the "singlerow" test in testlibpqbatch.c)
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-14 00:37:53 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Here's a v25.
I made a few more changes to the docs per David's suggestions; I also
reordered the sections quite a bit. It's now like this:
* Batch Mode
* Using Batch Mode
* Issuing Queries
* Processing Results
* Error Handling
* Interleaving Result Processing and Query Dispatch
* Ending Batch Mode
* Functions Associated with Batch Mode
* When to Use Batching
To me as a reader, this makes more sense, but if you disagree, I think
we should discuss further changes. (For example, maybe we should move
the "Functions" section at the end?) The original had "When to Use
Batching" at the start, but it seemed to me that the points it's making
are not as critical as understanding what it is.
I reworked the test program to better fit the TAP model; I found that if
one test mecha failed in whatever way, the connection would be in a
weird state and cause the next test to also fail. I changed so that it
runs one test and exits; then the t/001_libpq_async.pl file (hmm, need
to rename to 001_batch.pl I guess) calls it once for each test.
I adapted the test code to our code style. I also removed the "timings"
stuff; I think that's something better left to pgbench.
(I haven't looked at Daniel's pgbench stuff yet, but I will do that
next.)
While looking at how the tests worked, I gave a hard stare at the new
libpq code and cleaned it up also. There's a lot of minor changes, but
nothing truly substantial. I moved the code around a lot, to keep
things where grouped together they belong.
I'm not 100% clear on things like PGconn->batch_status and how
PGconn->asyncStatus works. Currently everything seems to work
correctly, but I'm worried that because we add new status values to
asyncStatus, some existing code might not be doing everything correctly
(for example when changing to/from ASYNC_BUSY in some cases, are we 100%
we shouldn't be changing to ASYNC_QUEUED?)
While looking this over I noticed a thread from 2014[1] where Matt
Newell tried to implement this stuff and apparently the main review
comment he got before abandoning the patch was that the user would like
a way to access the query that corresponded to each result. The current
patch set does not address that need; the approach to this problem is
that it's on the application's head to keep track of this. Honestly I
don't understand why it would be otherwise ... I'm not sure that it
makes sense to expect that the application is stupid enough that it
doesn't keep track in which order it sent things, but bright enough to
keep pointers to the queries it sent (??). So this seems okay to me.
But added Heikki and Claudio to CC because of that old thread.
[1] https://postgr.es/m/2059418.jZQvL3lH90@obsidian
Attachment | Content-Type | Size |
---|---|---|
v25-libpq-batch.patch | text/x-diff | 85.7 KB |
From: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
---|---|
To: | "Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>,"Matthieu Garrigues" <matthieu(dot)garrigues(at)gmail(dot)com>,"Michael Paquier" <michael(at)paquier(dot)xyz>,"Dave Cramer" <davecramer(at)postgres(dot)rocks>,"PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>,"Craig Ringer" <craig(at)2ndquadrant(dot)com>,"Vaishnavi Prabakaran" <vaishnaviprabakaran(at)gmail(dot)com>,"Andres Freund" <andres(at)anarazel(dot)de>,"Michael Paquier" <michael(dot)paquier(at)gmail(dot)com>,"Claudio Freire" <klaussfreire(at)gmail(dot)com>,"Heikki Linnakangas" <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-14 14:42:50 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Alvaro Herrera wrote:
> I adapted the test code to our code style. I also removed the "timings"
> stuff; I think that's something better left to pgbench.
>
> (I haven't looked at Daniel's pgbench stuff yet, but I will do that
> next.)
The patch I posted in [1] was pretty simple, but at the time, query
results were always discarded. Now that pgbench can instantiate
variables from query results, a script can do:
select 1 as var \gset
select :var;
This kind of sequence wouldn't work in batch mode since it
sends queries before getting results of previous queries.
So maybe \gset should be rejected when inside a batch section.
Or alternatively pgbench should collect results before a variable is
reinjected into a query, thereby stalling the pipeline.
To do this only when necessary, it would have to track read-write
dependencies among variables, which seems overly complicated though.
[1]
https://www.postgresql.org/message-id/[email protected]
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-16 23:29:19 |
Message-ID: | CAKFQuwb-CZ_uz-CcxOQskKor_b8J+uHYDgWqYbzaL-YHYAGVDQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Fri, Nov 13, 2020 at 5:38 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:
> Here's a v25.
>
> I made a few more changes to the docs per David's suggestions; I also
> reordered the sections quite a bit. It's now like this:
> * Batch Mode
> * Using Batch Mode
> * Issuing Queries
> * Processing Results
> * Error Handling
> * Interleaving Result Processing and Query Dispatch
> * Ending Batch Mode
> * Functions Associated with Batch Mode
> * When to Use Batching
>
Thanks!
I like the new flow and changes. I've attached a patch that covers some
missing commas and typos along with a few parts that made me pause.
The impact of memory came out of nowhere under the non-blocking mode
commentary. I took a stab at incorporating it more broadly.
The "are error conditions" might be a well-known phrasing to those using
libpq but that sentence reads odd to me. I did try to make the example
listing flow a bit better and added a needed comma.
Tried to clean up a few phrasings after that. The error handling part I'm
not especially happy with but I think it's closer and more useful than just
"emitted during error handling" - it gets emitted upon error, handling is a
client concern.
Seems odd to say the new feature was introduced in v14.0, the .0 seems ok
to imply. I didn't actually fix it in the attached but "the PostgreSQL 14
version of libpq" is going to become outdated quickly, better just to drop
it.
"The batch API was introduced in PostgreSQL 14, but clients can use batches
on servers supporting v3 of the extended query protocol, potentially going
as far back as version 7.4."
David J.
Attachment | Content-Type | Size |
---|---|---|
v25-01-libpq-batch-dgj-doc-suggestions.patch | application/octet-stream | 7.3 KB |
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-18 17:51:34 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2020-Nov-14, Daniel Verite wrote:
> The patch I posted in [1] was pretty simple, but at the time, query
> results were always discarded. Now that pgbench can instantiate
> variables from query results, a script can do:
> select 1 as var \gset
> select :var;
> This kind of sequence wouldn't work in batch mode since it
> sends queries before getting results of previous queries.
>
> So maybe \gset should be rejected when inside a batch section.
Hah.
Hacking pgbench extensively is beyond what I'm willing to do for this
feature at this time. Making \gset rejected in a batch section sounds
simple enough and supports \beginbatch et al sufficiently to compare
performance, so I'm OK with a patch that does that. That'd be a small
extension to your previous patch, if I understand correctly.
If you or others want to send patches to extend batch support with
read-write tracking for variables, feel free, but I hereby declare that
I'm not taking immediate responsibility for getting them committed.
From: | "Andres Freund" <andres(at)anarazel(dot)de> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-18 23:17:36 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi
On Wed, Nov 18, 2020, at 09:51, Alvaro Herrera wrote:
> On 2020-Nov-14, Daniel Verite wrote:
>
> > The patch I posted in [1] was pretty simple, but at the time, query
> > results were always discarded. Now that pgbench can instantiate
> > variables from query results, a script can do:
> > select 1 as var \gset
> > select :var;
> > This kind of sequence wouldn't work in batch mode since it
> > sends queries before getting results of previous queries.
> >
> > So maybe \gset should be rejected when inside a batch section.
>
> Hah.
>
> Hacking pgbench extensively is beyond what I'm willing to do for this
> feature at this time. Making \gset rejected in a batch section sounds
> simple enough and supports \beginbatch et al sufficiently to compare
> performance, so I'm OK with a patch that does that. That'd be a small
> extension to your previous patch, if I understand correctly.
>
> If you or others want to send patches to extend batch support with
> read-write tracking for variables, feel free, but I hereby declare that
> I'm not taking immediate responsibility for getting them committed.
I think minimal support is entirely sufficient initially.
Andres
From: | "Daniel Verite" <daniel(at)manitou-mail(dot)org> |
---|---|
To: | "Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>,"Matthieu Garrigues" <matthieu(dot)garrigues(at)gmail(dot)com>,"Michael Paquier" <michael(at)paquier(dot)xyz>,"Dave Cramer" <davecramer(at)postgres(dot)rocks>,"PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>,"Craig Ringer" <craig(at)2ndquadrant(dot)com>,"Vaishnavi Prabakaran" <vaishnaviprabakaran(at)gmail(dot)com>,"Andres Freund" <andres(at)anarazel(dot)de>,"Michael Paquier" <michael(dot)paquier(at)gmail(dot)com>,"Claudio Freire" <klaussfreire(at)gmail(dot)com>,"Heikki Linnakangas" <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-23 21:58:34 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
Here's a new version with the pgbench support included.
Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: https://www.manitou-mail.org
Twitter: @DanielVerite
Attachment | Content-Type | Size |
---|---|---|
v26-libpq-batch.patch | text/plain | 93.4 KB |
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2020-11-23 22:15:39 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2020-Nov-23, Daniel Verite wrote:
> Hi,
>
> Here's a new version with the pgbench support included.
Thanks, incorporated into my local copy.
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-01-21 23:39:30 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Thanks David Johnston and Daniel Vérité, I have incorporated your
changes into this patch, which is now v26. Also, it's been rebased on
current sources.
I've been using the new PQtrace() stuff to verify the behavior of the
new feature. It's not perfect, but at least it doesn't crash
immediately as it did when I tried a few weeks ago. There are
imperfections that I think are due to bugs in the PQtrace
implementation, not in this patch.
--
Álvaro Herrera 39°49'30"S 73°17'W
"El conflicto es el camino real hacia la unión"
Attachment | Content-Type | Size |
---|---|---|
v26-libpq-batch.patch | text/x-diff | 92.7 KB |
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-01-21 23:43:26 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
As you can see in an XXX comment in the libpq test program, the current
implementation has the behavior that PQgetResult() returns NULL after a
batch is finished and has reported PGRES_BATCH_END. I don't know if
there's a hard reason to do that, but I'd like to supress it because it
seems weird and out of place.
--
Álvaro Herrera 39°49'30"S 73°17'W
From: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-01-22 00:14:55 |
Message-ID: | CALNJ-vRrGUW=Uj0b5ZtdV_6=pV6Ge0pxBvfLrYxjvdhPPdKZmg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
+ commandFailed(st, "SQL", "\\gset and \\aset are not
allowed in a batch section");
It seems '\\gset or \\aset is not ' would correspond to the check more
closely.
+ if (my_command->argc != 1)
+ syntax_error(source, lineno, my_command->first_line,
my_command->argv[0],
It is possible that my_command->argc == 0 (where my_command->argv[0]
shouldn't be accessed) ?
+ appendPQExpBufferStr(&conn->errorMessage,
+ libpq_gettext("cannot queue commands
during COPY\n"));
+ return false;
+ break;
Is the break necessary ? Similar comment for pqBatchProcessQueue().
+int
+PQexitBatchMode(PGconn *conn)
Since either 0 or 1 is returned, maybe change the return type to bool.
Also, the function operates on PGconn - should the function be static
(pqBatchProcessQueue is) ?
Cheers
On Thu, Jan 21, 2021 at 3:39 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:
> Thanks David Johnston and Daniel Vérité, I have incorporated your
> changes into this patch, which is now v26. Also, it's been rebased on
> current sources.
>
> I've been using the new PQtrace() stuff to verify the behavior of the
> new feature. It's not perfect, but at least it doesn't crash
> immediately as it did when I tried a few weeks ago. There are
> imperfections that I think are due to bugs in the PQtrace
> implementation, not in this patch.
>
> --
> Álvaro Herrera 39°49'30"S 73°17'W
> "El conflicto es el camino real hacia la unión"
>
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-10 23:51:47 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Jan-21, Alvaro Herrera wrote:
> As you can see in an XXX comment in the libpq test program, the current
> implementation has the behavior that PQgetResult() returns NULL after a
> batch is finished and has reported PGRES_BATCH_END. I don't know if
> there's a hard reason to do that, but I'd like to supress it because it
> seems weird and out of place.
Hello Craig, a question for you since this API is your devising. I've
been looking at changing the way this works to prevent those NULL
returns from PQgetResult. That is, instead of having what seems like a
"NULL separator" of query results, you'd just get the PGRES_BATCH_END to
signify a batch end (not a NULL result after the BATCH_END); and the
normal PGRES_COMMAND_OK or PGRES_TUPLES_OK etc when the result of a
command has been sent. It's not working yet so I'm not sending an
updated patch, but I wanted to know if you had a rationale for including
this NULL return "separator" or was it just a convenience because of how
the code grew together.
Such a decision has obvious consequences for the test program (which
right now expects that PQgetResult() returns NULL at each step); and
naturally for libpq's internal state machine that controls how it all
works.
Thanks,
--
Álvaro Herrera 39°49'30"S 73°17'W
From: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-16 01:19:27 |
Message-ID: | CAGRY4nzf20aWm_VkAOK2QYa-H7A4wbLQHzzt9VjzTwo=4CU5_Q@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Thu, 11 Feb 2021 at 07:51, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:
> On 2021-Jan-21, Alvaro Herrera wrote:
>
> > As you can see in an XXX comment in the libpq test program, the current
> > implementation has the behavior that PQgetResult() returns NULL after a
> > batch is finished and has reported PGRES_BATCH_END. I don't know if
> > there's a hard reason to do that, but I'd like to supress it because it
> > seems weird and out of place.
>
> Hello Craig, a question for you since this API is your devising. I've
> been looking at changing the way this works to prevent those NULL
> returns from PQgetResult. That is, instead of having what seems like a
> "NULL separator" of query results, you'd just get the PGRES_BATCH_END to
> signify a batch end (not a NULL result after the BATCH_END); and the
> normal PGRES_COMMAND_OK or PGRES_TUPLES_OK etc when the result of a
> command has been sent. It's not working yet so I'm not sending an
> updated patch, but I wanted to know if you had a rationale for including
> this NULL return "separator" or was it just a convenience because of how
> the code grew together.
>
The existing API for libpq actually specifies[1] that you should call
PQgetResult() until it returns NULL:
> After successfully calling PQsendQuery, call PQgetResult one or more
times to obtain the results. PQsendQuery cannot be called again (on the
same connection) until PQgetResult has returned a null pointer, indicating
that the command is done.
Similarly, in single-row mode, the existing API specifies that you should
call PQgetResult() until it returns NULL.
Also, IIRC the protocol already permits multiple result sets to be
returned, and the caller cannot safely assume that a single PQsendQuery()
will produce only a single result set. (I really should write a test
extension that exercises that and how libpq reacts to it.)
That's why I went for the NULL return. I think. It's been a while now so
it's a bit fuzzy.
I would definitely like an API that does not rely on testing for a NULL
return. Especially since NULL return can be ambiguous in the context of
row-at-a-time mode. New explicit enumerations for PGresult would make a lot
more sense.
So +1 from me for the general idea. I need to look at the patch as it has
evolved soon too.
Remember that the original patch I submitted for this was a 1-day weekend
hack and proof of concept to show that libpq could be modified to support
query pipelining (and thus batching too), so I could illustrate the
performance benefits that could be attained by doing so. I'd been aware of
the benefits and the protocol's ability to support it for some time thanks
to working on PgJDBC, but couldn't get anyone interested without some C
code to demonstrate it, so I wrote some. I am not going to argue that the
API I added for it is ideal in any way, and happy to see improvements.
The only change I would very strongly object to would be anything that
turned this into a *batch* mode without query-pipelining support. If you
have to queue all the queries up in advance then send them as a batch and
wait for all the results, you miss out on a lot of the potential
round-trip-time optimisations and you add initial latency. So long as there
is a way to "send A", "send B", "send C", "read results from A", "send D",
and there's a way for the application to associate some kind of state (an
application specific id or index, a pointer to an application query-queue
struct, whatever) so it can match queries to results ... then I'm happy.
From: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-16 01:38:56 |
Message-ID: | CAGRY4nxO2JafmwJMzwzHuQC_3sXNw+nKs5XQAHPZQd8Gy3SXmQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Tue, 16 Feb 2021 at 09:19, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
wrote:
>
> So long as there is a way to "send A", "send B", "send C", "read results
> from A", "send D", and there's a way for the application to associate some
> kind of state (an application specific id or index, a pointer to an
> application query-queue struct, whatever) so it can match queries to
> results ... then I'm happy.
>
FWIW I'm also thinking of revising the docs to mostly use the term
"pipeline" instead of "batch". Use "pipelining and batching" in the chapter
topic, and mention "batch" in the index, and add a para that explains how
to run batches on top of pipelining, but otherwise use the term "pipeline"
not "batch".
That's because pipelining isn't actually batching, and using it as a naïve
batch interface will get you in trouble. If you PQsendQuery(...) a long
list of queries without consuming results, you'll block unless you're in
libpq's nonblocking-send mode. You'll then be deadlocked because the server
can't send results until you consume some (tx buffer full) and can't
consume queries until it can send some results, but you can't consume
results because you're blocked on a send that'll never end.
An actual batch interface where you can bind and send a long list of
queries might be worthwhile, but should be addressed separately, and it'd
be confusing if this pipelining interface claimed the term "batch". I'm not
convinced enough application developers actually code directly against
libpq for it to be worth creating a pretty batch interface where you can
submit (say) an array of struct PQbatchEntry { const char * query, int
nparams, ... } to a PQsendBatch(...) and let libpq handle the socket I/O
for you. But if someone wants to add one later it'll be easier if we don't
use the term "batch" for the pipelining feature.
I'll submit a reworded patch in a bit.
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-16 16:27:50 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Feb-16, Craig Ringer wrote:
> FWIW I'm also thinking of revising the docs to mostly use the term
> "pipeline" instead of "batch". Use "pipelining and batching" in the chapter
> topic, and mention "batch" in the index, and add a para that explains how
> to run batches on top of pipelining, but otherwise use the term "pipeline"
> not "batch".
Hmm, this is a good point. It means I have a lot of API renaming to do.
I'll get on it now and come back with a proposal.
--
Álvaro Herrera Valdivia, Chile
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-16 23:13:50 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Here's a new version, where I've renamed everything to "pipeline". I
think the docs could use some additional tweaks now in order to make a
coherent story on pipeline mode, how it can be used in a batched
fashion, etc.
Here's the renames I applied. It's mostly mechanical, except
PQbatchSendQueue is now PQsendPipeline:
PQBatchStatus -> PGpipelineStatus (enum)
PQBATCH_MODE_OFF -> PQ_PIPELINE_OFF
PQBATCH_MODE_ON -> PQ_PIPELINE_ON
PQBATCH_MODE_ABORTED -> PQ_PIPELINE_ABORTED
PQbatchStatus -> PQpipelineStatus (function)
PQenterBatchMode -> PQenterPipelineMode
PQexitBatchMode -> PQexitPipelineMode
PQbatchSendQueue -> PQsendPipeline
PGRES_BATCH_END -> PGRES_PIPELINE_END
PGRES_BATCH_ABORTED -> PGRES_PIPELINE_ABORTED
Also, PQbatchStatus(conn) returns enum PGpipelineStatus (it previously
returned int).
I'm tempted to rename PGASYNC_QUEUED to PGASYNC_PIPELINE_IDLE (not sure
if PGASYNC_PIPELINE_READY fits better with the existing one).
In pgbench, I changed the metacommands to be \startpipeline and
\endpipeline. There's a failing Assert() there which I commented out;
needs fixed.
--
Álvaro Herrera 39°49'30"S 73°17'W
Attachment | Content-Type | Size |
---|---|---|
v27-libpq-pipeline.patch | text/x-diff | 95.4 KB |
From: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-16 23:45:08 |
Message-ID: | CALNJ-vTk+Bhtn0gjSS3KO7U+E34uDUfFbXZ6VRexXE-KKrnRNw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
+ if (querymode == QUERY_SIMPLE)
+ {
+ commandFailed(st, "startpipeline", "cannot use pipeline mode
with the simple query protocol");
+ st->state = CSTATE_ABORTED;
+ return CSTATE_ABORTED;
I wonder why the st->state is only assigned for this if block. The state is
not set for other cases where CSTATE_ABORTED is returned.
Should PQ_PIPELINE_OFF be returned for the following case ?
+PQpipelineStatus(const PGconn *conn)
+{
+ if (!conn)
+ return false;
Cheers
On Tue, Feb 16, 2021 at 3:14 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:
> Here's a new version, where I've renamed everything to "pipeline". I
> think the docs could use some additional tweaks now in order to make a
> coherent story on pipeline mode, how it can be used in a batched
> fashion, etc.
>
> Here's the renames I applied. It's mostly mechanical, except
> PQbatchSendQueue is now PQsendPipeline:
>
> PQBatchStatus -> PGpipelineStatus (enum)
> PQBATCH_MODE_OFF -> PQ_PIPELINE_OFF
> PQBATCH_MODE_ON -> PQ_PIPELINE_ON
> PQBATCH_MODE_ABORTED -> PQ_PIPELINE_ABORTED
> PQbatchStatus -> PQpipelineStatus (function)
> PQenterBatchMode -> PQenterPipelineMode
> PQexitBatchMode -> PQexitPipelineMode
> PQbatchSendQueue -> PQsendPipeline
> PGRES_BATCH_END -> PGRES_PIPELINE_END
> PGRES_BATCH_ABORTED -> PGRES_PIPELINE_ABORTED
>
> Also, PQbatchStatus(conn) returns enum PGpipelineStatus (it previously
> returned int).
>
> I'm tempted to rename PGASYNC_QUEUED to PGASYNC_PIPELINE_IDLE (not sure
> if PGASYNC_PIPELINE_READY fits better with the existing one).
>
>
> In pgbench, I changed the metacommands to be \startpipeline and
> \endpipeline. There's a failing Assert() there which I commented out;
> needs fixed.
>
> --
> Álvaro Herrera 39°49'30"S 73°17'W
>
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
Cc: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-19 18:35:09 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hello, thanks for looking at this patch.
On 2021-Feb-16, Zhihong Yu wrote:
> + if (querymode == QUERY_SIMPLE)
> + {
> + commandFailed(st, "startpipeline", "cannot use pipeline mode
> with the simple query protocol");
> + st->state = CSTATE_ABORTED;
> + return CSTATE_ABORTED;
>
> I wonder why the st->state is only assigned for this if block. The state is
> not set for other cases where CSTATE_ABORTED is returned.
Yeah, that's a simple oversight. We don't need to set st->state,
because the caller sets it to the value we return.
> Should PQ_PIPELINE_OFF be returned for the following case ?
>
> +PQpipelineStatus(const PGconn *conn)
> +{
> + if (!conn)
> + return false;
Yep.
Thanks
--
Álvaro Herrera 39°49'30"S 73°17'W
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-19 18:43:23 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Jan-21, Zhihong Yu wrote:
> It seems '\\gset or \\aset is not ' would correspond to the check more
> closely.
>
> + if (my_command->argc != 1)
> + syntax_error(source, lineno, my_command->first_line,
> my_command->argv[0],
>
> It is possible that my_command->argc == 0 (where my_command->argv[0]
> shouldn't be accessed) ?
No -- process_backslash_command reads the word and sets it as argv[0].
> + appendPQExpBufferStr(&conn->errorMessage,
> + libpq_gettext("cannot queue commands
> during COPY\n"));
> + return false;
> + break;
>
> Is the break necessary ? Similar comment for pqBatchProcessQueue().
Not necessary. I removed them.
> +int
> +PQexitBatchMode(PGconn *conn)
>
> Since either 0 or 1 is returned, maybe change the return type to bool.
I was kinda tempted to do that, but in reality a lot of libpq's API is
defined like that -- to return 0 (failure)/1 (success) as ints, not
bools. For example see PQsendQuery(). I'm not inclined to do different
for these new functions. (One curious case is PQsetvalue, which returns
int, and is documented as "returns zero for failure" and then uses
"return false").
> Also, the function operates on PGconn - should the function be static
> (pqBatchProcessQueue is) ?
I don't understand this suggestion. How would an application call it,
if we make it static?
Thanks
--
Álvaro Herrera Valdivia, Chile
"Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad; jugar
al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en
La Feria de las Tinieblas, R. Bradbury)
From: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-19 20:08:15 |
Message-ID: | CALNJ-vQMeFYb==v5ySLy_oNqoHJ_pt2+9cKyj1pzy_NJKU7H9w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
+static int pqBatchProcessQueue(PGconn *conn);
I was suggesting changing:
+int
+PQexitBatchMode(PGconn *conn)
to:
+static int
+PQexitBatchMode(PGconn *conn)
Cheers
On Fri, Feb 19, 2021 at 10:43 AM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:
> On 2021-Jan-21, Zhihong Yu wrote:
>
> > It seems '\\gset or \\aset is not ' would correspond to the check more
> > closely.
> >
> > + if (my_command->argc != 1)
> > + syntax_error(source, lineno, my_command->first_line,
> > my_command->argv[0],
> >
> > It is possible that my_command->argc == 0 (where my_command->argv[0]
> > shouldn't be accessed) ?
>
> No -- process_backslash_command reads the word and sets it as argv[0].
>
> > + appendPQExpBufferStr(&conn->errorMessage,
> > + libpq_gettext("cannot queue commands
> > during COPY\n"));
> > + return false;
> > + break;
> >
> > Is the break necessary ? Similar comment for pqBatchProcessQueue().
>
> Not necessary. I removed them.
>
> > +int
> > +PQexitBatchMode(PGconn *conn)
> >
> > Since either 0 or 1 is returned, maybe change the return type to bool.
>
> I was kinda tempted to do that, but in reality a lot of libpq's API is
> defined like that -- to return 0 (failure)/1 (success) as ints, not
> bools. For example see PQsendQuery(). I'm not inclined to do different
> for these new functions. (One curious case is PQsetvalue, which returns
> int, and is documented as "returns zero for failure" and then uses
> "return false").
>
> > Also, the function operates on PGconn - should the function be static
> > (pqBatchProcessQueue is) ?
>
> I don't understand this suggestion. How would an application call it,
> if we make it static?
>
> Thanks
>
> --
> Álvaro Herrera Valdivia, Chile
> "Cómo ponemos nuestros dedos en la arcilla del otro. Eso es la amistad;
> jugar
> al alfarero y ver qué formas se pueden sacar del otro" (C. Halloway en
> La Feria de las Tinieblas, R. Bradbury)
>
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-19 20:46:51 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
On 2021-Feb-19, Zhihong Yu wrote:
> Hi,
> +static int pqBatchProcessQueue(PGconn *conn);
>
> I was suggesting changing:
>
> +int
> +PQexitBatchMode(PGconn *conn)
>
> to:
>
> +static int
> +PQexitBatchMode(PGconn *conn)
I understand that, but what I'm saying is that it doesn't work.
pqBatchProcessQueue can be static because it's only used internally in
libpq, but PQexitBatchMode is supposed to be called by the client
application.
--
Álvaro Herrera Valdivia, Chile
"No necesitamos banderas
No reconocemos fronteras" (Jorge González)
From: | Zhihong Yu <zyu(at)yugabyte(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Dave Cramer <davecramer(at)postgres(dot)rocks>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Vaishnavi Prabakaran <vaishnaviprabakaran(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-19 21:08:42 |
Message-ID: | CALNJ-vQR7c0bO9yCKTg4VL3ktfvpjoVsDR10sayDsjy9bsjEZg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
Hi,
Thanks for the response.
On Fri, Feb 19, 2021 at 12:46 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:
> Hi,
>
> On 2021-Feb-19, Zhihong Yu wrote:
>
> > Hi,
> > +static int pqBatchProcessQueue(PGconn *conn);
> >
> > I was suggesting changing:
> >
> > +int
> > +PQexitBatchMode(PGconn *conn)
> >
> > to:
> >
> > +static int
> > +PQexitBatchMode(PGconn *conn)
>
> I understand that, but what I'm saying is that it doesn't work.
> pqBatchProcessQueue can be static because it's only used internally in
> libpq, but PQexitBatchMode is supposed to be called by the client
> application.
>
> --
> Álvaro Herrera Valdivia, Chile
> "No necesitamos banderas
> No reconocemos fronteras" (Jorge González)
>
From: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-20 00:23:28 |
Message-ID: | CAGRY4nxT5v3Q+BD9F9=6dGNz6EnQL-BTP-AYJTxg+wqOXd9PgA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Wed, 17 Feb 2021, 07:13 Alvaro Herrera, <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> Here's a new version, where I've renamed everything to "pipeline".
Wow. Thanks.
I
> think the docs could use some additional tweaks now in order to make a
> coherent story on pipeline mode, how it can be used in a batched
> fashion, etc.
>
I'll do that soon and send an update.
>
>
From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-02-20 01:36:55 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Feb-20, Craig Ringer wrote:
> On Wed, 17 Feb 2021, 07:13 Alvaro Herrera, <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
> > I
> > think the docs could use some additional tweaks now in order to make a
> > coherent story on pipeline mode, how it can be used in a batched
> > fashion, etc.
>
> I'll do that soon and send an update.
I started to do that, but was sidetracked having a look at error
handling while checking one of the claims.
So now it starts with this:
<sect3 id="libpq-pipeline-sending">
<title>Issuing Queries</title>
<para>
After entering pipeline mode, the application dispatches requests using
<xref linkend="libpq-PQsendQueryParams"/>,
or its prepared-query sibling
<xref linkend="libpq-PQsendQueryPrepared"/>.
These requests are queued on the client-side until flushed to the server;
this occurs when <xref linkend="libpq-PQsendPipeline"/> is used to
establish a synchronization point in the pipeline,
or when <xref linkend="libpq-PQflush"/> is called.
The functions <xref linkend="libpq-PQsendPrepare"/>,
<xref linkend="libpq-PQsendDescribePrepared"/>, and
<xref linkend="libpq-PQsendDescribePortal"/> also work in pipeline mode.
Result processing is described below.
</para>
<para>
The server executes statements, and returns results, in the order the
client sends them. The server will begin executing the commands in the
pipeline immediately, not waiting for the end of the pipeline.
If any statement encounters an error, the server aborts the current
transaction and skips processing commands in the pipeline until the
next synchronization point established by <function>PQsendPipeline</function>.
(This remains true even if the commands in the pipeline would rollback
the transaction.)
Query processing resumes after the synchronization point.
</para>
(Note I changed the wording that "the pipeline is ended by
PQsendPipeline" to "a synchronization point is established". Is this
easily understandable? On re-reading it, I'm not sure it's really an
improvement.)
BTW we don't explain why doesn't PQsendQuery work (to wit: because it
uses "simple" query protocol and thus doesn't require the Sync message).
I think we should either document that, or change things so that
PQsendQuery no longer uses a 'Q' message when in pipeline mode; instead
use extended query protocol. I don't see why we'd force people to use
PQsendQueryParams when not necessary.
BTW, in my experimentation, the sequence of PGresult that you get when
handling a pipeline with errors don't make a lot of sense. I'll spend
some more time on it.
While at it, as for the PGresult sequence and NULL returns: I think for
PQexec and maybe PQsendQuery, it makes sense to loop until PQgetResult
returns NULL, because you never know how many results are you going to
get. But for PQsendQueryParams, this is no longer true, because you
can't send multiple queries in one query string. So the only way to get
multiple results, is by using single-row mode. But that already has its
own protocol for multiple results, namely to get a stream of
PGRES_SINGLE_TUPLE terminated by a zero-rows PGRES_TUPLES_OK. So I'm
not sure there is a strong need for the mandatory NULL result.
--
Álvaro Herrera 39°49'30"S 73°17'W
From: | "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com> |
---|---|
To: | 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org>, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
Cc: | Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: PATCH: Batch/pipelining support for libpq |
Date: | 2021-03-03 07:07:59 |
Message-ID: | OSBPR01MB23415F455B2532A4E83975B3EF989@OSBPR01MB2341.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Wed, Feb 17, 2021 8:14 AM (JST), Alvaro Herrera wrote:
Hi Alvaro,
> Here's a new version, where I've renamed everything to "pipeline". I think the
> docs could use some additional tweaks now in order to make a coherent
> story on pipeline mode, how it can be used in a batched fashion, etc.
I tried applying this patch to test it on top of Iwata-san's libpq trace log [1].
In my environment, the compiler complains.
It seems that in libpqwalreceiver.c: libpqrcv_exec()
the switch for PQresultStatus needs to handle the
cases for PGRES_PIPELINE_END and PGRES_PIPELINE_ABORTED too.
switch (PQresultStatus(pgres))
{
...
Case PGRES_PIPELINE_ABORTED:
...
Case PGRES_PIPELINE_ END:
Regards,
Kirk Jamison
From: | 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com> |
Cc: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-03-03 16:34:30 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Mar-03, k(dot)jamison(at)fujitsu(dot)com wrote:
> I tried applying this patch to test it on top of Iwata-san's libpq trace log [1].
> In my environment, the compiler complains.
> It seems that in libpqwalreceiver.c: libpqrcv_exec()
> the switch for PQresultStatus needs to handle the
> cases for PGRES_PIPELINE_END and PGRES_PIPELINE_ABORTED too.
Yeah, I noticed this too and decided to handle these cases as errors.
Here's v28, which includes that fix, and also gets rid of the
PGASYNC_QUEUED state; I instead made the rest of the code work using the
existing PGASYNC_IDLE state. This allows us get rid of a few cases
where we had to report "internal error: IDLE state unexpected" in a few
cases, and was rather pointless. With the current formulation, both
pipeline mode and normal mode share pretty much all state.
Also, I've renamed PGRES_PIPELINE_END to PGRES_PIPELINE_SYNC, because
that's closer to what it means: that result state by no means that
pipeline mode has ended. It only means that you called PQsendPipeline()
so the server gives you a Sync message.
I'm much more comfortable with this version, so I'm marking the patch as
Ready for Committer in case anybody else wants to look at this before I
push it.
However: on errors, I think it's a bit odd that you get a NULL from
PQgetResult after getting PGRES_PIPELINE_ABORTED. Maybe I should
suppress that. I'm no longer wrestling with the NULL after
PGRES_PIPELINE_END however, because that's not critical and you can not
ask for it and things work fine.
(Also, the test pipeline.c program in src/test/modules/test_libpq has a
new "transaction" mode that is not exercised by the TAP program; I'll
plug that in before commit.)
--
Álvaro Herrera Valdivia, Chile
Attachment | Content-Type | Size |
---|---|---|
v28-libpq-pipeline.patch | text/x-diff | 100.9 KB |
From: | 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com> |
Cc: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-03-03 23:42:41 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Mar-03, 'Alvaro Herrera' wrote:
> I'm much more comfortable with this version, so I'm marking the patch as
> Ready for Committer in case anybody else wants to look at this before I
> push it.
Actually, I just noticed a pretty serious problem, which is that when we
report an error, we don't set "conn->errQuery" to the current query but
to the *previous* query, leading to non-sensical error reports like
$ ./pipeline pipeline_abort
aborted pipeline... ERROR: no existe la función no_such_function(integer)
LINE 1: INSERT INTO pq_pipeline_demo(itemno) VALUES ($1);
^
HINT: Ninguna función coincide en el nombre y tipos de argumentos. Puede ser necesario agregar conversión explícita de tipos.
ok
(Sorry about the Spanish there, but the gist of the problem is that
we're positioning the error cursor on a query that succeeded prior to
the query that raised the error.)
This should obviously not occur. I'm trying to figure out how to repair
it and not break everything again ...
--
Álvaro Herrera Valdivia, Chile
From: | 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com> |
Cc: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-03-04 00:24:29 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Mar-03, 'Alvaro Herrera' wrote:
> This should obviously not occur. I'm trying to figure out how to repair
> it and not break everything again ...
I think trying to set up the connection state so that the next query
appears in conn->last_query prior to PQgetResult being called again
leads to worse breakage. The simplest fix seems to make fe-protocol3.c
aware that in this case, the query is in conn->cmd_queue_head instead,
as in the attached patch.
--
Álvaro Herrera 39°49'30"S 73°17'W
Attachment | Content-Type | Size |
---|---|---|
error-fix.patch | text/x-diff | 1.1 KB |
From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | k(dot)jamison(at)fujitsu(dot)com, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-03-04 00:45:02 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
I'm proposing some minor changes.
--
Justin
Attachment | Content-Type | Size |
---|---|---|
0002-doc-review.notapxtch | text/x-diff | 12.2 KB |
From: | 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com> |
Cc: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-03-04 01:51:08 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Mar-03, 'Alvaro Herrera' wrote:
> On 2021-Mar-03, 'Alvaro Herrera' wrote:
>
> > This should obviously not occur. I'm trying to figure out how to repair
> > it and not break everything again ...
>
> I think trying to set up the connection state so that the next query
> appears in conn->last_query prior to PQgetResult being called again
> leads to worse breakage. The simplest fix seems to make fe-protocol3.c
> aware that in this case, the query is in conn->cmd_queue_head instead,
> as in the attached patch.
I wonder if it would make sense to get rid of conn->last_query
completely and just rely always on conn->cmd_queue_head, where the
normal (non-pipeline) would use the first entry of the command queue.
That might end up being simpler than pipeline mode "pretending" to take
over ->last_query.
--
Álvaro Herrera 39°49'30"S 73°17'W
"La verdad no siempre es bonita, pero el hambre de ella sí"
From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
---|---|
To: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, k(dot)jamison(at)fujitsu(dot)com, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-03-08 22:37:16 |
Message-ID: | CAKFQuwasQOY2C50cvi2bMSH4U-m8Xp+Dngjz4qjK3042tdeLew@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On Wed, Mar 3, 2021 at 5:45 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> I'm proposing some minor changes.
>
>
Some additional tweaks/comments for the documentation with the edit
proposed edits:
(edit) + <function>PQresultStatus</function>, will report a
Remove the comma
(orig) + the failed operation are skipped entirely. The same behaviour
holds if the
We seem to use "behavior" more frequently
(edit) + From the client perspective, after
<function>PQresultStatus</function>
Possessive "client's perspective"
(edit) + its expected results queue. Based on available memory,
results from the
"its corresponding results queue" - to go along with this change:
- of the order in which it sent queries and the expected results.
+ of the order in which it sent queries, to associate them with their
+ corresponding results.
(orig)
+ pipeline mode. If the current statement isn't finished processing
+ or there are results pending for collection with
Needs a comma after processing.
"results pending for collection" reads oddly to me...not sure what would be
better though...
(edit)
+ <note>
+ <para>
+ The pipeline API was introduced in
<productname>PostgreSQL</productname> 14.
+ Pipeline mode is a client-side feature which doesn't require server
+ support, and works on any server that supports the v3 extended query
+ protocol.
+ </para>
+ </note>
This note seems like it should be placed either near the very beginning of
the feature or incorporated into the feature introduction.
(orig)
+ If any statement encounters an error, the server aborts the current
+(-) transaction and skips processing commands in the pipeline until the
+ next synchronization point established by
<function>PQsendPipeline</function>.
I dislike "skips" here - even if it doesn't execute a command it still will
place a result on the socket so that the client can have something to match
up with the queries it sent, correct?
+ transaction and creates a PGRES_PIPELINE_ABORTED result for all commands
in the pipeline until the
David J.
From: | 'Alvaro Herrera' <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | "k(dot)jamison(at)fujitsu(dot)com" <k(dot)jamison(at)fujitsu(dot)com> |
Cc: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Matthieu Garrigues <matthieu(dot)garrigues(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: PATCH: Batch/pipelining support for libpq |
Date: | 2021-03-08 23:02:39 |
Message-ID: | [email protected] |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Lists: | pgsql-hackers |
On 2021-Mar-03, 'Alvaro Herrera' wrote:
> I wonder if it would make sense to get rid of conn->last_query
> completely and just rely always on conn->cmd_queue_head, where the
> normal (non-pipeline) would use the first entry of the command queue.
> That might end up being simpler than pipeline mode "pretending" to take
> over ->last_query.
I experimented with this today and it appears to be a good change,
though I haven't been able to make everything work correctly yet.
--
Álvaro Herrera Valdivia, Chile
"Entristecido, Wutra (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"