| Lists: | pgsql-hackers |
|---|
| From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
|---|---|
| To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Logical Replication - revisit `is_table_publication` function implementation |
| Date: | 2026-04-07 07:02:23 |
| Message-ID: | CAHut+Pti83yGaV5-DZU=AvJHxFDuoKW8_pjSedRham8SgZxLYA@mail.gmail.com |
Hi, after confirming my understanding of pg_publication_rel [1], I
revisited some logical replication internal functions.
Specifically.
* The `is_table_publication` function is for checking if the
publication has a clause like "FOR TABLE t1".
* The `is_schema_publication` function is for checking if the
publication has a clause like "FOR TABLES IN SCHEMA s1".
Notice that neither of these clauses ("FOR TABLE", "FOR TABLES IN SCHEMA")
can be combined with "FOR ALL TABLES".
And we can readily discover if "FOR ALL TABLES" (aka `puballtables`)
is present from the pubform.
We can use this to optimise and simplify the implementations of the
`is_schema_publication` and `is_table_publication` functions.
PSA patch v1.
AFAICT, the result is:
- less code + simpler logic. e.g. is_table_publication does not check
'prexcept' anymore
- more efficient. e.g. skips unnecessary scanning when puballtables is true.
- more consistent. e.g., both functions are now almost identical.
Thoughts?
Kind Regards,
Peter Smith.
Fujitsu Australia
| Attachment | Content-Type | Size |
|---|---|---|
| v1-0001-rewrite-is_table_publication.patch | application/octet-stream | 5.1 KB |
| From: | vignesh C <vignesh21(at)gmail(dot)com> |
|---|---|
| To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Logical Replication - revisit `is_table_publication` function implementation |
| Date: | 2026-04-08 03:45:22 |
| Message-ID: | CALDaNm0nLdBKJVHVvvOnY_5mkVg20=OL18fdjA5+KZ3GhPB=TQ@mail.gmail.com |
On Tue, 7 Apr 2026 at 12:32, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> Hi, after confirming my understanding of pg_publication_rel [1], I
> revisited some logical replication internal functions.
>
> Specifically.
> * The `is_table_publication` function is for checking if the
> publication has a clause like "FOR TABLE t1".
> * The `is_schema_publication` function is for checking if the
> publication has a clause like "FOR TABLES IN SCHEMA s1".
>
> Notice that neither of these ("FOR TABLE", "FOR TABLES IN SCHEMA")
> clauses are possible simultaneously with "FOR ALL TABLES".
>
> And we can readily discover if "FOR ALL TABLES" (aka `puballtables`)
> is present from the pubform.
>
> We can use this to optimise and simplify the implementations of the
> `is_schema_publication` and `is_table_publication` functions.
>
> PSA patch v1.
>
> AFAICT, the result is:
> - less code + simpler logic. e.g. is_table_publication does not check
> 'prexcept' anymore
> - more efficient. e.g. skips unnecessary scanning when puballtables is true.
> - more consistent. e.g., both functions are now almost identical.
>
> Thoughts?
I'm not sure if this additional check is sufficient in case of
is_schema_publication. Checking only puballtables can exclude FOR ALL
TABLES, but it still cannot distinguish regular table publications,
empty publications, or sequence publications. In all of those cases,
we still need to check pg_publication_namespace. Also, why check only
puballtables? Why not check puballsequences as well?
+is_schema_publication(Form_pg_publication pubform)
 {
     Relation    pubschsrel;
     ScanKeyData scankey;
     SysScanDesc scan;
     HeapTuple   tup;
-    bool        result = false;
+    bool        result;
+
+    /* FOR TABLES IN SCHEMA cannot coexist with FOR ALL TABLES. */
+    if (pubform->puballtables)
+        return false;
Regards,
Vignesh
| From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
|---|---|
| To: | vignesh C <vignesh21(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Logical Replication - revisit `is_table_publication` function implementation |
| Date: | 2026-04-08 04:53:33 |
| Message-ID: | CAHut+Pv+a-7-NRrZv4v6RfaaUo5b21RXe0tGOu8CfKrxPjE=tw@mail.gmail.com |
On Wed, Apr 8, 2026 at 1:45 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Tue, 7 Apr 2026 at 12:32, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi, after confirming my understanding of pg_publication_rel [1], I
> > revisited some logical replication internal functions.
> >
> > Specifically.
> > * The `is_table_publication` function is for checking if the
> > publication has a clause like "FOR TABLE t1".
> > * The `is_schema_publication` function is for checking if the
> > publication has a clause like "FOR TABLES IN SCHEMA s1".
> >
> > Notice that neither of these ("FOR TABLE", "FOR TABLES IN SCHEMA")
> > clauses are possible simultaneously with "FOR ALL TABLES".
> >
> > And we can readily discover if "FOR ALL TABLES" (aka `puballtables`)
> > is present from the pubform.
> >
> > We can use this to optimise and simplify the implementations of the
> > `is_schema_publication` and `is_table_publication` functions.
> >
> > PSA patch v1.
> >
> > AFAICT, the result is:
> > - less code + simpler logic. e.g. is_table_publication does not check
> > 'prexcept' anymore
> > - more efficient. e.g. skips unnecessary scanning when puballtables is true.
> > - more consistent. e.g., both functions are now almost identical.
> >
> > Thoughts?
>
Hi Vignesh. Thanks for reviewing!
> I'm not sure if this additional check is sufficient in case of
> is_schema_publication. Checking only puballtables can exclude FOR ALL
> TABLES, but it still cannot distinguish regular table publications,
> empty publications, or sequence publications. In all of those cases,
> we still need to check pg_publication_namespace.
Yes, this condition is only an optimisation for FOR ALL TABLES, as the
comment says.
IMO, the overhead of 1 additional boolean check for cases where it
doesn't help is an insignificant trade-off for the savings when it can
return false.
> And also why just check for puballtables why not to check for puballsequences
I think the function is_schema_publication() is unrelated to 'puballsequences'.
e.g. all the following will still need to check
pg_publication_namespace, regardless of the 'puballsequences' value.
ex1. CREATE PUBLICATION ... FOR ALL SEQUENCES;
ex2. CREATE PUBLICATION ... FOR ALL SEQUENCES, FOR TABLES IN SCHEMA s1;
ex3. CREATE PUBLICATION ... FOR TABLES IN SCHEMA s1;
======
Kind Regards,
Peter Smith.
Fujitsu Australia
| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
| Cc: | vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: Logical Replication - revisit `is_table_publication` function implementation |
| Date: | 2026-04-08 05:24:51 |
| Message-ID: | CAJpy0uAHe88hL5MX3q9tyGyx_gCKKHcWnmETXvXM2CqnE8jrmA@mail.gmail.com |
On Wed, Apr 8, 2026 at 10:24 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Wed, Apr 8, 2026 at 1:45 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> >
> > On Tue, 7 Apr 2026 at 12:32, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > >
> > > Hi, after confirming my understanding of pg_publication_rel [1], I
> > > revisited some logical replication internal functions.
> > >
> > > Specifically.
> > > * The `is_table_publication` function is for checking if the
> > > publication has a clause like "FOR TABLE t1".
> > > * The `is_schema_publication` function is for checking if the
> > > publication has a clause like "FOR TABLES IN SCHEMA s1".
> > >
> > > Notice that neither of these ("FOR TABLE", "FOR TABLES IN SCHEMA")
> > > clauses are possible simultaneously with "FOR ALL TABLES".
> > >
> > > And we can readily discover if "FOR ALL TABLES" (aka `puballtables`)
> > > is present from the pubform.
> > >
> > > We can use this to optimise and simplify the implementations of the
> > > `is_schema_publication` and `is_table_publication` functions.
> > >
> > > PSA patch v1.
> > >
> > > AFAICT, the result is:
> > > - less code + simpler logic. e.g. is_table_publication does not check
> > > 'prexcept' anymore
> > > - more efficient. e.g. skips unnecessary scanning when puballtables is true.
> > > - more consistent. e.g., both functions are now almost identical.
> > >
> > > Thoughts?
> >
>
> Hi Vignesh. Thanks for reviewing!
>
> > I'm not sure if this additional check is sufficient in case of
> > is_schema_publication. Checking only puballtables can exclude FOR ALL
> > TABLES, but it still cannot distinguish regular table publications,
> > empty publications, or sequence publications. In all of those cases,
> > we still need to check pg_publication_namespace.
>
> Yes, this condition is only an optimisation for FOR ALL TABLES, as the
> comment says.
>
> IMO, the overhead of 1 additional boolean check for cases where it
> doesn't help is an insignificant trade-off for the savings when it can
> return false.
>
> > And also why just check for puballtables why not to check for puballsequences
>
> I think function is_schema_publication() is unrelated to 'puballsequences'.
>
> e.g. all the following will still need to check
> pg_publication_namespace, regardless of the 'puballsequences' value.
>
> ex1. CREATE PUBLICATION ... FOR ALL SEQUENCES;
> ex2. CREATE PUBLICATION ... FOR ALL SEQUENCES, FOR TABLES IN SCHEMA s1;
> ex3. CREATE PUBLICATION ... FOR TABLES IN SCHEMA s1;
>
IIUC, we don't support a mix of ALL SEQUENCES and TABLES IN SCHEMA s1.
So I could not understand your point: why would FOR ALL SEQUENCES still
need to check pg_publication_namespace?
thanks
Shveta
| From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Logical Replication - revisit `is_table_publication` function implementation |
| Date: | 2026-04-08 06:04:45 |
| Message-ID: | CAHut+PtfHzxHFkHJWYoxOFpgpSH5HNAGmP5sSrwh1d+R0Ab-BQ@mail.gmail.com |
On Wed, Apr 8, 2026 at 3:25 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Wed, Apr 8, 2026 at 10:24 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Wed, Apr 8, 2026 at 1:45 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > >
> > > On Tue, 7 Apr 2026 at 12:32, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > >
> > > > Hi, after confirming my understanding of pg_publication_rel [1], I
> > > > revisited some logical replication internal functions.
> > > >
> > > > Specifically.
> > > > * The `is_table_publication` function is for checking if the
> > > > publication has a clause like "FOR TABLE t1".
> > > > * The `is_schema_publication` function is for checking if the
> > > > publication has a clause like "FOR TABLES IN SCHEMA s1".
> > > >
> > > > Notice that neither of these ("FOR TABLE", "FOR TABLES IN SCHEMA")
> > > > clauses are possible simultaneously with "FOR ALL TABLES".
> > > >
> > > > And we can readily discover if "FOR ALL TABLES" (aka `puballtables`)
> > > > is present from the pubform.
> > > >
> > > > We can use this to optimise and simplify the implementations of the
> > > > `is_schema_publication` and `is_table_publication` functions.
> > > >
> > > > PSA patch v1.
> > > >
> > > > AFAICT, the result is:
> > > > - less code + simpler logic. e.g. is_table_publication does not check
> > > > 'prexcept' anymore
> > > > - more efficient. e.g. skips unnecessary scanning when puballtables is true.
> > > > - more consistent. e.g., both functions are now almost identical.
> > > >
> > > > Thoughts?
> > >
> >
> > Hi Vignesh. Thanks for reviewing!
> >
> > > I'm not sure if this additional check is sufficient in case of
> > > is_schema_publication. Checking only puballtables can exclude FOR ALL
> > > TABLES, but it still cannot distinguish regular table publications,
> > > empty publications, or sequence publications. In all of those cases,
> > > we still need to check pg_publication_namespace.
> >
> > Yes, this condition is only an optimisation for FOR ALL TABLES, as the
> > comment says.
> >
> > IMO, the overhead of 1 additional boolean check for cases where it
> > doesn't help is an insignificant trade-off for the savings when it can
> > return false.
> >
> > > And also why just check for puballtables why not to check for puballsequences
> >
> > I think function is_schema_publication() is unrelated to 'puballsequences'.
> >
> > e.g. all the following will still need to check
> > pg_publication_namespace, regardless of the 'puballsequences' value.
> >
> > ex1. CREATE PUBLICATION ... FOR ALL SEQUENCES;
> > ex2. CREATE PUBLICATION ... FOR ALL SEQUENCES, FOR TABLES IN SCHEMA s1;
> > ex3. CREATE PUBLICATION ... FOR TABLES IN SCHEMA s1;
> >
>
> IIUC, we don't support mix of ALL SEQUENCES and TABLES IN SCHEMA s1.
> So I could not understand your point, why FOR ALL SEQ still need to
> check pg_publication_namespace?
>
Oh! You are right.
(Sorry, Vignesh, I did not recognise that combination as unsupported).
I'll post a patch update to handle it.
======
Kind Regards,
Peter Smith.
Fujitsu Australia
| From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
|---|---|
| To: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Cc: | vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Logical Replication - revisit `is_table_publication` function implementation |
| Date: | 2026-04-08 06:27:49 |
| Message-ID: | CAHut+PsjrPOcs=ePWm+N-q=rdmhDeM-FE05gPgDoN647Jb3RaQ@mail.gmail.com |
On Wed, Apr 8, 2026 at 4:04 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Wed, Apr 8, 2026 at 3:25 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> >
> > On Wed, Apr 8, 2026 at 10:24 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Apr 8, 2026 at 1:45 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > >
...
> > > > And also why just check for puballtables why not to check for puballsequences
> > >
> > > I think function is_schema_publication() is unrelated to 'puballsequences'.
> > >
> > > e.g. all the following will still need to check
> > > pg_publication_namespace, regardless of the 'puballsequences' value.
> > >
> > > ex1. CREATE PUBLICATION ... FOR ALL SEQUENCES;
> > > ex2. CREATE PUBLICATION ... FOR ALL SEQUENCES, FOR TABLES IN SCHEMA s1;
> > > ex3. CREATE PUBLICATION ... FOR TABLES IN SCHEMA s1;
> > >
> >
> > IIUC, we don't support mix of ALL SEQUENCES and TABLES IN SCHEMA s1.
> > So I could not understand your point, why FOR ALL SEQ still need to
> > check pg_publication_namespace?
> >
>
> Oh! You are right.
>
> (Sorry, Vignesh, I did not recognise that combination as unsupported).
>
> I'll post a patch update to handle it.
>
PSA patch v2.
Same as before, but now also doing a quick return false from both
functions if `puballsequences` is true.
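To illustrate the early-return idea only (this is not the attached patch), here is a minimal self-contained sketch. `MockPubForm`, `mock_scan_publication_namespace`, `mock_is_schema_publication`, `mock_check_examples`, and `mock_scan_count` are all hypothetical stand-ins invented for this example; the real function takes a `Form_pg_publication` and scans the catalog, which is replaced here by a counter so the skipped scan is visible.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pg_publication fields discussed above. */
typedef struct MockPubForm
{
	bool		puballtables;		/* FOR ALL TABLES */
	bool		puballsequences;	/* FOR ALL SEQUENCES */
	bool		has_schema_rows;	/* pretend pg_publication_namespace has rows */
} MockPubForm;

/* Counts how often the (pretend) catalog scan is performed. */
static int	mock_scan_count = 0;

/* Hypothetical stand-in for scanning pg_publication_namespace. */
static bool
mock_scan_publication_namespace(const MockPubForm *pubform)
{
	mock_scan_count++;
	return pubform->has_schema_rows;
}

/*
 * Sketch of the v2 logic: FOR TABLES IN SCHEMA cannot coexist with
 * FOR ALL TABLES or FOR ALL SEQUENCES, so return false without scanning.
 */
static bool
mock_is_schema_publication(const MockPubForm *pubform)
{
	if (pubform->puballtables || pubform->puballsequences)
		return false;

	return mock_scan_publication_namespace(pubform);
}

/* Exercise the three kinds of publication mentioned in this thread. */
static void
mock_check_examples(void)
{
	MockPubForm all_tables = {.puballtables = true};
	MockPubForm all_sequences = {.puballsequences = true};
	MockPubForm schema_pub = {.has_schema_rows = true};

	mock_scan_count = 0;
	assert(!mock_is_schema_publication(&all_tables));
	assert(!mock_is_schema_publication(&all_sequences));
	assert(mock_scan_count == 0);	/* both returned before any scan */

	assert(mock_is_schema_publication(&schema_pub));
	assert(mock_scan_count == 1);	/* only this case needed the scan */
}
```

Calling `mock_check_examples()` from a `main()` exercises all three cases. Under these assumptions, a publication with neither flag set still pays for the scan, so the shortcut only helps the ALL TABLES and ALL SEQUENCES cases.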
======
Kind Regards,
Peter Smith.
Fujitsu Australia
| Attachment | Content-Type | Size |
|---|---|---|
| v2-0001-rewrite-is_table_publication.patch | application/octet-stream | 5.2 KB |
| From: | shveta malik <shveta(dot)malik(at)gmail(dot)com> |
|---|---|
| To: | Peter Smith <smithpb2250(at)gmail(dot)com> |
| Cc: | vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, shveta malik <shveta(dot)malik(at)gmail(dot)com> |
| Subject: | Re: Logical Replication - revisit `is_table_publication` function implementation |
| Date: | 2026-04-09 04:09:35 |
| Message-ID: | CAJpy0uA9rj1xDDot3ydn_DZERH4Mc8D70syE3Xg8LWfS8zanZg@mail.gmail.com |
On Wed, Apr 8, 2026 at 11:58 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
>
> On Wed, Apr 8, 2026 at 4:04 PM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Wed, Apr 8, 2026 at 3:25 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Apr 8, 2026 at 10:24 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > >
> > > > On Wed, Apr 8, 2026 at 1:45 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:
> > > > >
> ...
> > > > > And also why just check for puballtables why not to check for puballsequences
> > > >
> > > > I think function is_schema_publication() is unrelated to 'puballsequences'.
> > > >
> > > > e.g. all the following will still need to check
> > > > pg_publication_namespace, regardless of the 'puballsequences' value.
> > > >
> > > > ex1. CREATE PUBLICATION ... FOR ALL SEQUENCES;
> > > > ex2. CREATE PUBLICATION ... FOR ALL SEQUENCES, FOR TABLES IN SCHEMA s1;
> > > > ex3. CREATE PUBLICATION ... FOR TABLES IN SCHEMA s1;
> > > >
> > >
> > > IIUC, we don't support mix of ALL SEQUENCES and TABLES IN SCHEMA s1.
> > > So I could not understand your point, why FOR ALL SEQ still need to
> > > check pg_publication_namespace?
> > >
> >
> > Oh! You are right.
> >
> > (Sorry, Vignesh, I did not recognise that combination as unsupported).
> >
> > I'll post a patch update to handle it.
> >
>
> PSA patch v2.
>
> Same as before, but now also doing a quick return false from both
> functions if `puballsequences` is true.
>
Okay. I was trying to determine where this optimization would be beneficial.
In cases where we attempt to add tables or schemas to an ALL TABLES
or ALL SEQUENCES publication, the operation will error out in
CheckAlterPublication() before is_table_publication() or
is_schema_publication() is even called. And in cases where we are
trying to add a table or schema to a publication that is neither
ALL TABLES nor ALL SEQUENCES, and we end up invoking these functions,
we still need to traverse pg_publication_rel.
The only scenario (as I understand it) that benefits from this change
is when we try to add EXCEPT to an ALL TABLES publication. In that
case, neither of the functions concerned would need to access
pg_publication_rel if the publication is already an ALL TABLES
publication. So this optimization helps in a positive (non-erroneous) case.
In cases where we need to throw an error (for example, adding EXCEPT
to a FOR TABLE publication), these checks would not provide any
benefit, as we still need to traverse pg_publication_rel to see whether
it has any valid tables or is an empty publication (an empty one is fine).
But since the optimization improves a valid, non-erroneous scenario,
IMO, it is good to include it. Let's see what others have to say on
this.
thanks
Shveta