git.postgresql.org Git - postgresql.git/log

docs: Improve the description of num_timed column in pg_stat_checkpointer.

The previous documentation stated that num_timed reflects the number of
scheduled checkpoints performed. However, checkpoints may be skipped
if the server has been idle, and num_timed counts both skipped and completed
checkpoints. This commit clarifies the description to make it clear that
the counter includes both skipped and completed checkpoints.

Back-patch to v17 where pg_stat_checkpointer was added.

Author: Fujii Masao
Reviewed-by: Alexander Korotkov
Discussion: https://postgr.es/m/9ea77f40-818d-4841-9dee-158ac8f6e690@oss.nttdata.com

Add some sanity checks in executor for query ID reporting

This commit adds three sanity checks in code paths of the executor where
it is possible to use hooks, checking that a query ID is reported in
pg_stat_activity if compute_query_id is enabled:
- ExecutorRun()
- ExecutorFinish()
- ExecutorEnd()

This causes the test in pg_stat_statements added in 933848d16dc9 to
complain immediately in ExecutorRun(). The idea behind this commit is
to help extensions to detect if they are missing query ID reports when a
query goes through the executor. Perhaps this will prove to be a bad
idea, but let's see where this experience goes in v18 and newer
versions.

Reviewed-by: Sami Imseih
Discussion: https://postgr.es/m/[email protected]

postgres_fdw: Extend postgres_fdw_get_connections to return user name.

This commit adds a "user_name" output column to
the postgres_fdw_get_connections function, returning the name
of the local user mapped to the foreign server for each connection.
If a public mapping is used, it returns "public."

This helps identify postgres_fdw connections more easily,
such as determining which connections are invalid, closed,
or used within the current transaction.

No extension version bump is needed, as commit c297a47c5f
already handled it for v18~.

Author: Hayato Kuroda
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/b492a935-6c7e-8c08-e485-3c1d64d7d10f@oss.nttdata.com

Extend PgStat_HashKey.objid from 4 to 8 bytes

This opens the possibility to define keys for more types of statistics
kinds in PgStat_HashKey, the first case being 8-byte query IDs for
statistics like pg_stat_statements.

This increases the size of PgStat_HashKey from 12 to 16 bytes, while
PgStatShared_HashEntry, entry stored in the dshash for pgstats, keeps
the same size due to alignment.

xl_xact_stats_item, that tracks the stats items to drop in commit WAL
records, is increased from 12 to 16 bytes. Note that individual chunks
in commit WAL records should be multiples of sizeof(int), hence 8-byte
object IDs are stored as two uint32, based on a suggestion from Heikki
Linnakangas.

While on it, the field of PgStat_HashKey is renamed from "objoid" to
"objid", as for some stats kinds this field does not refer to OIDs but
just IDs, like for replication slot stats.

This commit bumps the following format variables:
- PGSTAT_FILE_FORMAT_ID, as PgStat_HashKey is written to the stats file
for non-serialized stats kinds in the dshash table.
- XLOG_PAGE_MAGIC for the changes in xl_xact_stats_item.
- Catalog version, for the SQL function pg_stat_have_stats().

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/[email protected]

Don't enter parallel mode when holding interrupts.

Doing so caused the leader to hang in wait_event=ParallelFinish, which
required an immediate shutdown to resolve. Back-patch to v12 (all
supported versions).

Francesco Degrassi

Discussion: https://postgr.es/m/CAC-SaSzHUKT=vZJ8MPxYdC_URPfax+yoA1hKTcF4ROz_Q6z0_Q@mail.gmail.com

Add missing query ID reporting in extended query protocol

This commit adds query ID reports for two code paths when processing
extended query protocol messages:
- When receiving a bind message, setting it to the first Query retrieved
from a cached cache.
- When receiving an execute message, setting it to the first PlannedStmt
stored in a portal.

An advantage of this method is that this is able to cover all the types
of portals handled in the extended query protocol, particularly these
two when the report done in ExecutorStart() is not enough (neither is an
addition in ExecutorRun(), actually, for the second point):
- Multiple execute messages, with multiple ExecutorRun().
- Portal with execute/fetch messages, like a query with a RETURNING
clause and a fetch size that stores the tuples in a first execute
message going though ExecutorStart() and ExecuteRun(), followed by one
or more execute messages doing only fetches from the tuplestore created
in the first message.  This corresponds to the case where
execute_is_fetch is set, for example.

Note that the query ID reporting done in ExecutorStart() is still
necessary, as an EXECUTE requires it.  Query ID reporting is optimistic
and more calls to pgstat_report_query_id() don't matter as the first
report takes priority except if the report is forced.  The comment in
ExecutorStart() is adjusted to reflect better the reality with the
extended query protocol.

The test added in pg_stat_statements is a courtesy of Robert Haas.  This
uses psql's \bind metacommand, hence this part is backpatched down to
v16.

Reported-by: Kaido Vaikla, Erik Wienhold
Author: Sami Imseih
Reviewed-by: Jian He, Andrei Lepikhov, Michael Paquier
Discussion: https://postgr.es/m/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com
Discussion: https://postgr.es/m/1391613709.939460.1684777418070@office.mailbox.org
Backpatch-through: 14

Allow ReadStream to be consumed as raw block numbers.

Commits 041b9680 and 6377e12a changed the interface of
scan_analyze_next_block() to take a ReadStream instead of a BlockNumber
and a BufferAccessStrategy, and to return a value to indicate when the
stream has run out of blocks.

This caused integration problems for at least one known extension that
uses specially encoded BlockNumber values that map to different
underlying storage, because acquire_sample_rows() sets up the stream so
that read_stream_next_buffer() reads blocks from the main fork of the
relation's SMgrRelation.

Provide read_stream_next_block(), as a way for such an extension to
access the stream of raw BlockNumbers directly and forward them to its
own ReadBuffer() calls after decoding, as it could in earlier releases.
The new function returns the BlockNumber and BufferAccessStrategy that
were previously passed directly to scan_analyze_next_block().
Alternatively, an extension could wrap the stream of BlockNumbers in
another ReadStream with a callback that performs any decoding required
to arrive at real storage manager BlockNumber values, so that it could
benefit from the I/O combining and concurrency provided by
read_stream.c.

Another class of table access method that does nothing in
scan_analyze_next_block() because it is not block-oriented could use
this function to control the number of block sampling loops. It could
match the previous behavior with "return read_stream_next_block(stream,
&bas) != InvalidBlockNumber".

Ongoing work is expected to provide better ANALYZE support for table
access methods that don't behave like heapam with respect to storage
blocks, but that will be for future releases.

Back-patch to 17.

Reported-by: Mats Kindahl <[email protected]>
Reviewed-by: Mats Kindahl <[email protected]>
Discussion: https://postgr.es/m/CA%2B14425%2BCcm07ocG97Fp%2BFrD9xUXqmBKFvecp0p%2BgV2YYR258Q%40mail.gmail.com

Repair pg_upgrade for identity sequences with non-default persistence.

Since we introduced unlogged sequences in v15, identity sequences
have defaulted to having the same persistence as their owning table.
However, it is possible to change that with ALTER SEQUENCE, and
pg_dump tries to preserve the logged-ness of sequences when it doesn't
match (as indeed it wouldn't for an unlogged table from before v15).

The fly in the ointment is that ALTER SEQUENCE SET [UN]LOGGED fails
in binary-upgrade mode, because it needs to assign a new relfilenode
which we cannot permit in that mode.  Thus, trying to pg_upgrade a
database containing a mismatching identity sequence failed.

To fix, add syntax to ADD/ALTER COLUMN GENERATED AS IDENTITY to allow
the sequence's persistence to be set correctly at creation, and use
that instead of ALTER SEQUENCE SET [UN]LOGGED in pg_dump.  (I tried to
make SET [UN]LOGGED work without any pg_dump modifications, but that
seems too fragile to be a desirable answer.  This way should be
markedly faster anyhow.)

In passing, document the previously-undocumented SEQUENCE NAME option
that pg_dump also relies on for identity sequences; I see no value
in trying to pretend it doesn't exist.

Per bug #18618 from Anthony Hsu.
Back-patch to v15 where we invented this stuff.

Discussion: https://postgr.es/m/18618-d4eb26d669ed110a@postgresql.org

Ensure standby promotion point in 043_wal_replay_wait.pl

This commit ensures standby will be promoted at least at the primary insert
LSN we have just observed. We use pg_switch_wal() to force the insert LSN
to be written then wait for standby to catchup.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com

Minor cleanup related to pg_wal_replay_wait() procedure

* Rename $node_standby1 to $node_standby in 043_wal_replay_wait.pl as there
is only one standby.
* Remove useless debug printing in 043_wal_replay_wait.pl.
* Fix typo in one check description in 043_wal_replay_wait.pl.
* Fix some wording in comments and documentation.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com
Reviewed-by: Alexander Lakhin

Avoid parallel nbtree index scan hangs with SAOPs.

Commit 5bf748b8, which enhanced nbtree ScalarArrayOp execution, made
parallel index scans work with the new design for arrays via explicit
scheduling of primitive index scans.  A backend that successfully
scheduled the scan's next primitive index scan saved its backend local
array keys in shared memory.  Any backend could pick up the scheduled
primitive scan within _bt_first.  This scheme decouples scheduling a
primitive scan from starting the scan (by performing another descent of
the index via a _bt_search call from _bt_first) to make things robust.

The scheme had a deadlock hazard, at least when the leader process
participated in the scan.  _bt_parallel_seize had a code path that made
backends that were not in an immediate position to start a scheduled
primitive index scan wait for some other backend to do so instead.
Under the right circumstances, the leader process could wait here
forever: the leader would wait for any other backend to start the
primitive scan, while every worker was busy waiting on the leader to
consume tuples from the scan's tuple queue.

To fix, don't wait for a scheduled primitive index scan to be started by
some other eligible backend from within _bt_parallel_seize (when the
calling backend isn't in a position to do so itself).  Return false
instead, while recording that the scan has a scheduled primitive index
scan in backend local state.  This leaves the backend in the same state
as the existing case where a backend schedules (or tries to schedule)
another primitive index scan from within _bt_advance_array_keys, before
calling _bt_parallel_seize.  _bt_parallel_seize already handles that
case by returning false without waiting, and without unsetting the
backend local state.  Leaving the backend in this state enables it to
start a previously scheduled primitive index scan once it gets back to
_bt_first.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.

Matthias van de Meent, with tweaks by me.

Author: Matthias van de Meent <[email protected]>
Reported-By: Tomas Vondra <[email protected]>
Reviewed-By: Peter Geoghegan <[email protected]>
Discussion: https://postgr.es/m/CAH2-WzmMGaPa32u9x_FvEbPTUkP5e95i=QxR8054nvCRydP-sw@mail.gmail.com
Backpatch: 17-, where nbtree SAOP execution was enhanced.

Add temporal FOREIGN KEY contraints

Add PERIOD clause to foreign key constraint definitions. This is
supported for range and multirange types. Temporal foreign keys check
for range containment instead of equality.

This feature matches the behavior of the SQL standard temporal foreign
keys, but it works on PostgreSQL's native ranges instead of SQL's
"periods", which don't exist in PostgreSQL (yet).

Reference actions ON {UPDATE,DELETE} {CASCADE,SET NULL,SET DEFAULT}
are not supported yet.

(previously committed as 34768ee3616, reverted by 8aee330af55; this is
essentially unchanged from those)

Author: Paul A. Jungwirth <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Reviewed-by: jian he <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com

Add temporal PRIMARY KEY and UNIQUE constraints

Add WITHOUT OVERLAPS clause to PRIMARY KEY and UNIQUE constraints.
These are backed by GiST indexes instead of B-tree indexes, since they
are essentially exclusion constraints with = for the scalar parts of
the key and && for the temporal part.

(previously committed as 46a0cd4cefb, reverted by 46a0cd4cefb; the new
part is this:)

Because 'empty' && 'empty' is false, the temporal PK/UQ constraint
allowed duplicates, which is confusing to users and breaks internal
expectations.  For instance, when GROUP BY checks functional
dependencies on the PK, it allows selecting other columns from the
table, but in the presence of duplicate keys you could get the value
from any of their rows.  So we need to forbid empties.

This all means that at the moment we can only support ranges and
multiranges for temporal PK/UQs, unlike the original patch (above).
Documentation and tests for this are added.  But this could
conceivably be extended by introducing some more general support for
the notion of "empty" for other types.

Author: Paul A. Jungwirth <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Reviewed-by: jian he <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com

Add stratnum GiST support function

This is support function 12 for the GiST AM and translates
"well-known" RT*StrategyNumber values into whatever strategy number is
used by the opclass (since no particular numbers are actually
required). We will use this to support temporal PRIMARY
KEY/UNIQUE/FOREIGN KEY/FOR PORTION OF functionality.

This commit adds two implementations, one for internal GiST opclasses
(just an identity function) and another for btree_gist opclasses. It
updates btree_gist from 1.7 to 1.8, adding the support function for
all its opclasses.

(previously committed as 6db4598fcb8, reverted by 8aee330af55; this is
essentially unchanged from those)

Author: Paul A. Jungwirth <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Reviewed-by: jian he <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com

Add memory/disk usage for Window aggregate nodes in EXPLAIN.

This commit is similar to 1eff8279d and expands the idea to Window
aggregate nodes so that users can know how much memory or disk the
tuplestore used.

This commit uses newly introduced tuplestore_get_stats() to inquire this
information and add some additional output in EXPLAIN ANALYZE to
display the information for the Window aggregate node.

Reviewed-by: David Rowley, Ashutosh Bapat, Maxim Orlov, Jian He
Discussion: https://postgr.es/m/20240706.202254.89740021795421286.ishii%40postgresql.org

Fix redefinition of typedef.

Per buildfarm members sifaka and longfin, clang with
-Wtypedef-redefinition warns of a duplicate typedef unless building
with C11.

Oversight in commit 40e2e5e92b.

pg_upgrade: Parallelize encoding conversion check.

This commit makes use of the new task framework in pg_upgrade to
parallelize the check for incompatible user-defined encoding
conversions, i.e., those defined on servers older than v14. This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize WITH OIDS check.

This commit makes use of the new task framework in pg_upgrade to
parallelize the check for tables declared WITH OIDS. This step
will now process multiple databases concurrently when pg_upgrade's
--jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize incompatible polymorphics check.

This commit makes use of the new task framework in pg_upgrade to
parallelize the check for usage of incompatible polymorphic
functions, i.e., those with arguments of type anyarray/anyelement
rather than the newer anycompatible variants. This step will now
process multiple databases concurrently when pg_upgrade's --jobs
option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize postfix operator check.

This commit makes use of the new task framework in pg_upgrade to
parallelize the check for user-defined postfix operators. This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize contrib/isn check.

This commit makes use of the new task framework in pg_upgrade to
parallelize the check for contrib/isn functions that rely on the
bigint data type. This step will now process multiple databases
concurrently when pg_upgrade's --jobs option is provided a value
greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize data type checks.

This commit makes use of the new task framework in pg_upgrade to
parallelize the checks for incompatible data types, i.e., data
types whose on-disk format has changed, data types that have been
removed, etc. This step will now process multiple databases
concurrently when pg_upgrade's --jobs option is provided a value
greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize retrieving extension updates.

This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving the set of extensions that should be updated
with the ALTER EXTENSION command after upgrade. This step will now
process multiple databases concurrently when pg_upgrade's --jobs
option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize retrieving loadable libraries.

This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving the names of all libraries referenced by
non-built-in C functions. This step will now process multiple
databases concurrently when pg_upgrade's --jobs option is provided
a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize subscription check.

This commit makes use of the new task framework in pg_upgrade to
parallelize the part of check_old_cluster_subscription_state() that
verifies each of the subscribed tables is in the 'i' (initialize)
or 'r' (ready) state. This check will now process multiple
databases concurrently when pg_upgrade's --jobs option is provided
a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

pg_upgrade: Parallelize retrieving relation information.

This commit makes use of the new task framework in pg_upgrade to
parallelize retrieving relation and logical slot information. This
step will now process multiple databases concurrently when
pg_upgrade's --jobs option is provided a value greater than 1.

Reviewed-by: Daniel Gustafsson, Ilya Gladyshev
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

Introduce framework for parallelizing various pg_upgrade tasks.

A number of pg_upgrade steps require connecting to every database
in the cluster and running the same query in each one.  When there
are many databases, these steps are particularly time-consuming,
especially since they are performed sequentially, i.e., we connect
to a database, run the query, and process the results before moving
on to the next database.

This commit introduces a new framework that makes it easy to
parallelize most of these once-in-each-database tasks by processing
multiple databases concurrently.  This framework manages a set of
slots that follow a simple state machine, and it uses libpq's
asynchronous APIs to establish the connections and run the queries.
The --jobs option is used to determine the number of slots to use.
To use this new task framework, callers simply need to provide the
query and a callback function to process its results, and the
framework takes care of the rest.  A more complete description is
provided at the top of the new task.c file.

None of the eligible once-in-each-database tasks are converted to
use this new framework in this commit.  That will be done via
several follow-up commits.

Reviewed-by: Jeff Davis, Robert Haas, Daniel Gustafsson, Ilya Gladyshev, Corey Huinker
Discussion: https://postgr.es/m/20240516211638.GA1688936%40nathanxps13

doc PG relnotes: fix SGML markup for new commit links

Backpatch-through: 12

scripts: add Perl script to add links to release notes

Reported-by: jian he
Discussion: https://postgr.es/m/[email protected]

Backpatch-through: 12

Perl scripts: revert 43ce181059d

Small improvement not worth the code churn.

Reported-by: Andrew Dunstan
Discussion: https://postgr.es/m/42f2242a-422b-4aa3-8d60-d67b229c4a52@dunslane.net

Backpatch-through: master

Replace usages of xmlXPathCompile() with xmlXPathCtxtCompile().

In existing releases of libxml2, xmlXPathCompile can be driven
to stack overflow because it fails to protect itself against
too-deeply-nested input.  While there is an upstream fix as of
yesterday, it will take years for that to propagate into all
shipping versions.  In the meantime, we can protect our own
usages basically for free by calling xmlXPathCtxtCompile instead.

(The actual bug is that libxml2 keeps its nesting counter in the
xmlXPathContext, and its parsing code was willing to just skip
counting nesting levels if it didn't have a context.  So if we supply
a context, all is well.  It seems odd actually that it works at all
to not supply a context, because this means that XPath parsing does
not have access to XML namespace info.  Apparently libxml2 never
checks namespaces until runtime?  Anyway, this seems like good
future-proofing even if its only immediate effect is to dodge a bug.)

Sadly, this hack only offers protection with libxml2 2.9.11 and newer.
Before that there are multiple similar problems, so if you are
processing untrusted XML it behooves you to get a newer version.
But we have some pretty old libxml2 in the buildfarm, so it seems
impractical to add a regression test to verify this fix.

Per bug #18617 from Jingzhou Fu.  Back-patch to all supported
versions.

Discussion: https://postgr.es/m/18617-1cee4d2ed1f4e7ae@postgresql.org
Discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/799

Perl scripts: eliminate "Useless interpolation" warnings

Eliminate warnings of Perl Critic from src/tools.

Backpatch-through: master

Run regression tests with timezone America/Los_Angeles.

Historically we've used timezone "PST8PDT", but the recent release
2024b of tzdb changes the definition of that zone in a way that
breaks many test cases concerned with dates before 1970. Although
we've not yet adopted 2024b into our own tree, this is already
problematic for people using --with-system-tzdata if their platform
has already adopted 2024b. To work with both older and newer
versions of tzdb, switch to using "America/Los_Angeles", accepting
the ensuing changes in regression test results.

Back-patch to all supported branches.

Per report and patch from Wolfgang Walther.

Discussion: https://postgr.es/m/0a997455-5aba-4cf2-a354-d26d8bcbfae6@technowledgy.de

Add commit 7229ebe011df to .git-blame-ignore-revs.

Remove obsolete comment in pg_stat_statements.

Commit 76db9cb63 removed the use of multiple nesting counters,
but missed one comment describing that arrangement.

Back-patch to v17 where 76db9cb63 came in, just to avoid confusion.

Julien Rouhaud

Discussion: https://postgr.es/m/gfcwh3zjxc2vygltapgo7g6yacdor5s4ynr234b6v2ohhuvt7m@gr42joxalenw

Improve meson's detection of perl build flags

The current method of detecting perl build flags breaks if the path to
perl contains a space. This change makes two improvements. First,
instead of getting a list of ldflags and ccdlflags and then trying to
filter those out of the reported ldopts, we tell perl to suppress
reporting those in the first instance. Second, it tells perl to parse
those and output them, one per line. Thus any space on the option in a
file name, for example, is preserved.

Issue reported off-list by Muralikrishna Bandaru

Discussion: https://postgr.es/01117f88-f465-bf6c-9362-083bd72ca305@dunslane.net

Backpatch to release 16.

Only define NO_THREAD_SAFE_LOCALE for MSVC plperl when required

Latest versions of Strawberry Perl define USE_THREAD_SAFE_LOCALE, and we
therefore get a handshake error when building against such instances.
The solution is to perform a test to see if USE_THREAD_SAFE_LOCALE is
defined and only define NO_THREAD_SAFE_LOCALE if it isn't.

Backpatch the meson.build fix back to release 16 and apply the same
logic to Mkvcbuild.pm in releases 12 through 16.

Original report of the issue from Muralikrishna Bandaru.

Allow _h_indexbuild() to be interrupted.

When we are building a hash index that is large enough to need
pre-sorting (larger than either maintenance_work_mem or NBuffers),
the initial sorting phase is interruptible, but the insertion
phase wasn't. Add the missing CHECK_FOR_INTERRUPTS().

Per bug #18616 from Alexander Lakhin. Back-patch to all
supported branches.

Pavel Borisov

Discussion: https://postgr.es/m/18616-acbb9e5caf41e964@postgresql.org

Add commit 2b03cfeea4 to .git-blame-ignore-revs.

Fix contrib/pageinspect's test for sequences.

I managed to break this test in two different ways in commit
05036a3155.

First, the output of the new call to tuple_data_split() on the test
sequence is dependent on endianness.  This is fixed by setting a
special start value for the test sequence that produces the same
output regardless of the endianness of the machine.

Second, on versions older than v15, the new test case fails under
"force_parallel_mode = regress" with the following error:

ERROR:  cannot access temporary tables during a parallel operation

This is because pageinspect's disk-accessing functions are
incorrectly marked PARALLEL SAFE on versions older than v15 (see
commit aeaaf520f4 for details).  This one is fixed by changing the
test sequence to be permanent.  The only reason it was previously
marked temporary was to avoid needing a DROP SEQUENCE command at
the end of the test.  Unlike some other tests in this file, the use
of a permanent sequence here shouldn't result in any test
instability like what was fixed by commit e2933a6e11.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZuOKOut5hhDlf_bP%40nathan
Backpatch-through: 12

Remove separate locale_is_c arguments

Since e9931bfb751, ctype_is_c is part of pg_locale_t.  Some functions
passed a pg_locale_t and a bool argument separately.  This can now be
combined into one argument.

Since some callers call MatchText() with locale 0, it is a bit
confusing whether this is all correct.  But it is the case that only
callers that pass a non-zero locale object to MatchText() end up
checking locale->ctype_is_c.  To make that flow a bit more
understandable, add the locale argument to MATCH_LOWER() and GETCHAR()
in like_match.c, instead of implicitly taking it from the outer scope.

Reviewed-by: Jeff Davis <[email protected]>
Discussion: https://www.postgresql.org/message-id/84d415fc-6780-419e-b16c-61a0ca819e2b@eisentraut.org

SQL/JSON: Update example in JSON_QUERY() documentation

Commit e6c45d85dc fixed the behavior of JSON_QUERY() when WITH
CONDITIONAL WRAPPER is used, but the documentation example wasn't
updated to reflect this change. This commit updates the example to
show the correct result.

Per off-list report from Andreas Ulbrich.

Backpatch-through: 17

Prohibit altering invalidated replication slots.

ALTER_REPLICATION_SLOT for invalid replication slots should not be allowed
because there is no way to get back the invalidated (logical) slot to
work.

Author: Bharath Rupireddy
Reviewed-by: Peter Smith, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4fSOMiKjQ3=2NVBMTZRTG8Ujg6jsK9z3EvOtvA4vzKQ@mail.gmail.com

pg_stat_statements: Add tests with extended query protocol

There are currently no tests in the tree checking that queries using the
extended query protocol are able to map with their query ID.

This can be achieved for some paths of the extended query protocol with
the psql meta-commands \bind or \bind_named, so let's add some tests
based on both.

I have found that to be a useful addition while working on a different
issue.

Discussion: https://postgr.es/m/[email protected]

Reintroduce support for sequences in pgstattuple and pageinspect.

Commit 4b82664156 restricted a number of functions provided by
contrib modules to only relations that use the "heap" table access
method. Sequences always use this table access method, but they do
not advertise as such in the pg_class system catalog, so the
aforementioned commit also (presumably unintentionally) removed
support for sequences from some of these functions. This commit
reintroduces said support for sequences to these functions and adds
a couple of relevant tests.

Co-authored-by: Ayush Vatsa
Reviewed-by: Robert Haas, Michael Paquier, Matthias van de Meent
Discussion: https://postgr.es/m/CACX%2BKaP3i%2Bi9tdPLjF5JCHVv93xobEdcd_eB%2B638VDvZ3i%3DcQA%40mail.gmail.com
Backpatch-through: 12

Simplify checks for deterministic collations.

Remove redundant checks for locale->collate_is_c now that we always
have a valid pg_locale_t.

Also, remove pg_locale_deterministic() wrapper, which is no longer
useful after commit e9931bfb75. Just check the field directly,
consistent with other fields in pg_locale_t.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se

Remove redundant check for default collation.

The operative check is for a deterministic collation, so the check for
DEFAULT_COLLATION is redundant. Furthermore, it will be wrong if we
ever support a non-deterministic default collation.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se

Make jsonpath .string() be immutable for datetimes.

Discussion of commit ed055d249 revealed that we don't actually
want jsonpath's .string() method to depend on DateStyle, nor
TimeZone either, because the non-"_tz" jsonpath functions are
supposed to be immutable.  Potentially we could allow a TimeZone
dependency in the "_tz" variants, but it seems better to just
uniformly define this method as returning the same string that
jsonb text output would do.  That's easier to implement too,
saving a couple dozen lines.

Patch by me, per complaint from Peter Eisentraut.  Back-patch
to v17 where this feature came in (in 66ea94e8e).  Also
back-patch ed055d249 to provide test cases.

Discussion: https://postgr.es/m/5e8879d0-a3c8-4be2-950f-d83aa2af953a@eisentraut.org

Add has_largeobject_privilege function.

This function checks whether a user has specific privileges on a large object,
identified by OID. The user can be provided by name, OID,
or default to the current user. If the specified large object doesn't exist,
the function returns NULL. It raises an error for a non-existent user name.
This behavior is basically consistent with other privilege inquiry functions
like has_table_privilege.

Bump catalog version.

Author: Yugo Nagata
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20240702163444.ab586f6075e502eb84f11b1a@sranhm.sraoss.co.jp

Deduplicate code in LargeObjectExists and myLargeObjectExists.

myLargeObjectExists() and LargeObjectExists() had nearly identical code,
except for handling snapshots. This commit renames myLargeObjectExists()
to LargeObjectExistsWithSnapshot() and refactors LargeObjectExists()
to call it internally, reducing duplication.

Author: Yugo Nagata
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/20240702163444.ab586f6075e502eb84f11b1a@sranhm.sraoss.co.jp

Remove hardcoded hash opclass function signature exceptions

hashvalidate(), which validates the signatures of support functions
for the hash AM, contained several hardcoded exceptions.  For example,
hash/date_ops support function 1 was hashint4(), which would
ordinarily fail validation because the function argument is int4, not
date.  But this works internally because int4 and date are of the same
size.  There are several more exceptions like this that happen to work
and were allowed historically but would now fail the function
signature validation.

This patch removes those exceptions by providing new support functions
that have the proper declared signatures.  They internally share most
of the code with the "wrong" functions they replace, so the behavior
is still the same.

With the exceptions gone, hashvalidate() is now simplified and relies
fully on check_amproc_signature().

hashvarlena() and hashvarlenaextended() are kept in pg_proc.dat
because some extensions currently use them to build hash functions for
their own types, and we need to keep exposing these functions as
"LANGUAGE internal" functions for that to continue to work.

Reviewed-by: Tom Lane <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/29c3b746-69e7-482a-b37c-dbbf7e5b009b@eisentraut.org

Doc: alphabetize aggregate function table

A few recent JSON aggregates have been added without much consideration
to the existing order. Put these back in alphabetical order (with the
exception of the JSONB variant of each JSON aggregate).

Author: Wolfgang Walther <[email protected]>
Reviewed-by: Marlene Reiterer <[email protected]>
Discussion: https://postgr.es/m/6a7b910c-3feb-4006-b817-9b4759cb6bb6%40technowledgy.de
Backpatch-through: 16, where these aggregates were added

Remove old RULE privilege completely.

The RULE privilege for tables was removed in v8.2, but for backward
compatibility, GRANT/REVOKE and privilege functions like
has_table_privilege continued to accept the RULE keyword without
any effect.

After discussions on pgsql-hackers, it was agreed that this compatibility
is no longer needed. Since it's been long enough since the deprecation,
we've decided to fully remove support for RULE privilege,
so GRANT/REVOKE and privilege functions will no longer accept it.

Author: Fujii Masao
Reviewed-by: Nathan Bossart
Discussion: https://postgr.es/m/976a3581-6939-457f-b947-fc3dc836c083@oss.nttdata.com

Don't overwrite scan key in systable_beginscan()

When systable_beginscan() and systable_beginscan_ordered() choose an
index scan, they remap the attribute numbers in the passed-in scan
keys to the attribute numbers of the index, and then write those
remapped attribute numbers back into the scan key passed by the
caller. This second part is surprising and gratuitous. It means that
a scan key cannot safely be used more than once (but it might
sometimes work, depending on circumstances). Also, there is no value
in providing these remapped attribute numbers back to the caller,
since they can't do anything with that.

Fix that by making a copy of the scan keys passed by the caller and
make the modifications there.

Also, some code that had to work around the previous situation is
simplified.

Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org

Move logic related to WAL replay of Heap/Heap2 into its own file

This brings more clarity to heapam.c, by cleanly separating all the
logic related to WAL replay and the rest of Heap and Heap2, similarly
to other RMGRs like hash, btree, etc.

The header reorganization is also nice in heapam.c, cutting half of the
headers required.

Author: Li Yong
Reviewed-by: Sutou Kouhei, Michael Paquier
Discussion: https://postgr.es/m/EFE55E65-D7BD-4C6A-B630-91F43FD0771B@ebay.com

Adjust tuplestore stats API

1eff8279d added an API to tuplestore.c to allow callers to obtain
storage telemetry data. That API wasn't quite good enough for callers
that perform tuplestore_clear() as the telemetry functions only
accounted for the current state of the tuplestore, not the maximums
before tuplestore_clear() was called.

There's a pending patch that would like to add tuplestore telemetry
output to EXPLAIN ANALYZE for WindowAgg. That node type uses
tuplestore_clear() before moving to the next window partition and we
want to show the maximum space used, not the space used for the final
partition.

Reviewed-by: Tatsuo Ishii, Ashutosh Bapat
Discussion: https://postgres/m/CAApHDvoY8cibGcicLV0fNh=9JVx9PANcWvhkdjBnDCc9Quqytg@mail.gmail.com

SQL/JSON: Fix JSON_QUERY(... WITH CONDITIONAL WRAPPER)

Currently, when WITH CONDITIONAL WRAPPER is specified, array wrappers
are applied even to a single SQL/JSON item if it is a scalar JSON
value, but this behavior does not comply with the standard.

To fix, apply wrappers only when there are multiple SQL/JSON items
in the result.

Reported-by: Peter Eisentraut <[email protected]>
Author: Peter Eisentraut <[email protected]>
Author: Amit Langote <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Discussion: https://postgr.es/m/8022e067-818b-45d3-8fab-6e0d94d03626%40eisentraut.org
Backpatch-through: 17

Remove incorrect Assert.

check_agglevels_and_constraints() asserted that if we find an
aggregate function in an EXPR_KIND_FROM_SUBSELECT expression, the
expression must be in a LATERAL subquery.  Alexander Lakhin found a
case where that's not so: because of the odd scoping rules for NEW/OLD
within a rule, a reference to NEW/OLD could cause an aggregate to be
considered top-level even though it's in an unmarked sub-select.
The error message that would be thrown seems sufficiently on-point,
so just remove the Assert.  (Hence, this is not a bug for production
builds.)

This Assert was added by me in commit eaccfded9 (9.3 era).  It looks
like I put it in to cross-check that the new logic for detecting
misplaced aggregates (using agglevelsup) caught the same cases that a
previous check on p_lateral_active did.  So there might have been some
related misbehavior before eaccfded9 ... but that's very ancient
history by now, so I didn't dig any deeper.

Per bug #18608 from Alexander Lakhin.  Back-patch to all supported
branches.

Discussion: https://postgr.es/m/18608-48de0717508ee429@postgresql.org

pg_createsubscriber: minor documentation fixes

Replace gratuitous memmove() with memcpy()

The index access methods all had similar code that copied the
passed-in scan keys to local storage. They all used memmove() for
that, which is not wrong, but it seems confusing not to use memcpy()
when that would work. Presumably, this was all once copied from
ancient code and never adjusted.

Discussion: https://www.postgresql.org/message-id/flat/f8c739d9-f48d-4187-b214-df3391ba41ab@eisentraut.org

Fix unique key checks in JSON object constructors

When building a JSON object, the code builds a hash table of keys, to
allow checking if the keys are unique. The uniqueness check and adding
the new key happens in json_unique_check_key(), but this assumes the
pointer to the key remains valid.

Unfortunately, two places passed pointers to keys in a buffer, while
also appending more data (additional key/value pairs) to the buffer.
With enough data the buffer is resized by enlargeStringInfo(), which
calls repalloc(), invalidating the earlier key pointers.

Due to this the uniqueness check may fail with both false negatives and
false positives, producing JSON objects with duplicate keys or failing
to produce a perfectly valid JSON object.

This affects multiple functions that enforce uniqueness of keys, all
introduced in PG16 with the new SQL/JSON:

- json_object_agg_unique / jsonb_object_agg_unique
- json_object / jsonb_objectagg

Existing regression tests did not detect the issue, simply because the
initial buffer size is 1024 and the objects were small enough not to
require the repalloc.

With a sufficiently large object, AddressSanitizer reported the access
to invalid memory immediately. So would valgrind, of course.

Fixed by copying the key into the hash table memory context, and adding
regression tests with enough data to repalloc the buffer. Backpatch to
16, where the functions were introduced.

Reported by Alexander Lakhin. Investigation and initial fix by Junwang
Zhao, with various improvements and tests by me.

Reported-by: Alexander Lakhin
Author: Junwang Zhao, Tomas Vondra
Backpatch-through: 16
Discussion: https://postgr.es/m/18598-3279ed972a2347c7@postgresql.org
Discussion: https://postgr.es/m/CAEG8a3JjH0ReJF2_O7-8LuEbO69BxPhYeXs95_x7+H9AMWF1gw@mail.gmail.com

Update .gitignore

for commit 0785d1b8b2

Remove obsolete unconstify()

This is no longer needed as of OpenSSL 1.1.0 (the current minimum
version). LibreSSL made the same change around the same time as well.

Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/20463f79-a7b0-4bba-a178-d805f99c02f9%40eisentraut.org

common/jsonapi: support libpq as a client

Based on a patch by Michael Paquier.

For libpq, use PQExpBuffer instead of StringInfo. This requires us to
track allocation failures so that we can return JSON_OUT_OF_MEMORY as
needed rather than exit()ing.

Author: Jacob Champion <[email protected]>
Co-authored-by: Michael Paquier <[email protected]>
Co-authored-by: Daniel Gustafsson <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/d1b467a78e0e36ed85a09adf979d04cf124a9d4b [email protected]

Improve assertion in FindReplTupleInLocalRel().

The first part of the assertion verifying that the passed index must be PK
or RI was incorrectly passing index relation instead of heap relation in
GetRelationIdentityOrPK(). The assertion was not failing because the
second part of the assertion which needs to be performed only when remote
relation has REPLICA_IDENTITY_FULL set was also incorrect.

The change is not backpatched because the current coding doesn't lead to
any failure.

Reported-by: Dilip Kumar
Author: Amit Kapila
Reviewed-by: Vignesh C
Discussion: https://postgr.es/m/CAFiTN-tmguaT1DXbCC+ZomZg-oZLmU6BPhr0po7akQSG6vNJrg@mail.gmail.com

Optimize pg_visibility with read streams.

We've measured 5% performance improvement, and this arranges to benefit
automatically from future optimizations to the read_stream subsystem.
The area lacked test coverage, so close that gap.

Nazir Bilal Yavuz

Discussion: https://postgr.es/m/CAN55FZ1_Ru3XpMgTwsU67FTH2fs_FrRROmb7x6zs+F44QBEiww@mail.gmail.com
Discussion: https://postgr.es/m/CAEudQAozv3wTY5TV2t29JcwPydbmKbiWQkZD42S2OgzdixPMDQ@mail.gmail.com

Use a hash table to de-duplicate column names in ruleutils.c.

Commit 8004953b5 added a hash table to avoid O(N^2) cost in choosing
unique relation aliases while deparsing a view or rule.  It did
nothing about the similar O(N^2) (maybe worse) costs of choosing
unique column aliases within each RTE.  However, that's now
demonstrably a bottleneck when deparsing CHECK constraints for wide
tables, so let's use a similar hash table to handle those.

The extra cost of setting up the hash table will not be repaid unless
the table has many columns.  I've set this up so that we use the brute
force method if there are less than 32 columns.  The exact cutoff is
not too critical, but this value seems good because it results in both
code paths getting exercised by existing regression-test cases.

Patch by me; thanks to David Rowley for review.

Discussion: https://postgr.es/m/2885468.1722291250@sss.pgh.pa.us

Fix some whitespace issues in XMLSERIALIZE(... INDENT).

We must drop whitespace while parsing the input, else libxml2
will include "blank" nodes that interfere with the desired
indentation behavior.  The end result is that we didn't indent
nodes separated by whitespace.

Also, it seems that libxml2 may add a trailing newline when working
in DOCUMENT mode.  This is semantically insignificant, so strip it.

This is in the gray area between being a bug fix and a definition
change.  However, the INDENT option is still pretty new (since v16),
so I think we can get away with changing this in stable branches.
Hence, back-patch to v16.

Jim Jones

Discussion: https://postgr.es/m/872865a8-548b-48e1-bfcd-4e38e672c1e4@uni-muenster.de

Improve documentation and testing of jsonpath string() for datetimes.

Point out that the output format depends on DateStyle, and test that,
along with testing some cases previously not covered.

In passing, adjust the horology test to verify that the prevailing
DateStyle is 'Postgres, MDY', much as it has long verified the
prevailing TimeZone. We expect pg_regress to have set these up,
and there are multiple regression tests relying on these settings.

Also make the formatting of entries in table 9.50 more consistent.

David Wheeler (marginal additional hacking by me); review by jian he

Discussion: https://postgr.es/m/56955B33-6959-4FDA-A459-F00363ECDFEE@justatheory.com

Add PG_TEST_PG_COMBINEBACKUP_MODE to CI tasks

The environment variable PG_TEST_PG_COMBINEBACKUP_MODE has been
available since 35a7b288b975, but was not set by any built-in CI tasks.
This commit modifies two of the CI tasks to use the alternative modes,
to exercise the pg_combinebackup code.

The Linux task uses --copy-file-range, macOS uses --clone.

This is not an exhaustive test of combinations. The supported modes
depend on the operating system and filesystem, and it would be nice to
test all supported combinations. Right now we have just one task for
each OS, and it doesn't seem worth adding more just for this.

Reported-by: Peter Eisentraut
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/48da4a1f-ccd9-4988-9622-24f37b1de2b4%40eisentraut.org

Protect against small overread in SASLprep validation

In case of torn UTF8 in the input data we might end up going
past the end of the string since we don't account for length.
While validation won't be performed on a sequence with a NULL
byte it's better to avoid going past the end to beging with.
Fix by taking the length into consideration.

Author: Jacob Champion <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/CAOYmi+mTnmM172g=_+Yvc47hzzeAsYPy2C4UBY3HK9p-AXNV0g@mail.gmail.com

Add amgettreeheight index AM API routine

The only current implementation is for btree where it calls
_bt_getrootheight(). Other index types can now also use this to pass
information to their amcostestimate routine. Previously, btree was
hardcoded and other index types could not hook into the optimizer at
this point.

Author: Mark Dilger <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com

Mark expressions nullable by grouping sets

When generating window_pathkeys, distinct_pathkeys, or sort_pathkeys,
we failed to realize that the grouping/ordering expressions might be
nullable by grouping sets.  As a result, we may incorrectly deem that
the PathKeys are redundant by EquivalenceClass processing and thus
remove them from the pathkeys list.  That would lead to wrong results
in some cases.

To fix this issue, we mark the grouping expressions nullable by
grouping sets if that is the case.  If the grouping expression is a
Var or PlaceHolderVar or constructed from those, we can just add the
RT index of the RTE_GROUP RTE to the existing nullingrels field(s);
otherwise we have to add a PlaceHolderVar to carry on the nullingrel
bit.

However, we have to manually remove this nullingrel bit from
expressions in various cases where these expressions are logically
below the grouping step, such as when we generate groupClause pathkeys
for grouping sets, or when we generate PathTarget for initial input to
grouping nodes.

Furthermore, in set_upper_references, the targetlist and quals of an
Agg node should have nullingrels that include the effects of the
grouping step, ie they will have nullingrels equal to the input
Vars/PHVs' nullingrels plus the nullingrel bit that references the
grouping RTE.  In order to perform exact nullingrels matches, we also
need to manually remove this nullingrel bit.

Bump catversion because this changes the querytree produced by the
parser.

Thanks to Tom Lane for the idea to invent a new kind of RTE.

Per reports from Geoff Winkless, Tobias Wendorff, Richard Guo from
various threads.

Author: Richard Guo
Reviewed-by: Ashutosh Bapat, Sutou Kouhei
Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com

Introduce an RTE for the grouping step

If there are subqueries in the grouping expressions, each of these
subqueries in the targetlist and HAVING clause is expanded into
distinct SubPlan nodes.  As a result, only one of these SubPlan nodes
would be converted to reference to the grouping key column output by
the Agg node; others would have to get evaluated afresh.  This is not
efficient, and with grouping sets this can cause wrong results issues
in cases where they should go to NULL because they are from the wrong
grouping set.  Furthermore, during re-evaluation, these SubPlan nodes
might use nulled column values from grouping sets, which is not
correct.

This issue is not limited to subqueries.  For other types of
expressions that are part of grouping items, if they are transformed
into another form during preprocessing, they may fail to match lower
target items.  This can also lead to wrong results with grouping sets.

To fix this issue, we introduce a new kind of RTE representing the
output of the grouping step, with columns that are the Vars or
expressions being grouped on.  In the parser, we replace the grouping
expressions in the targetlist and HAVING clause with Vars referencing
this new RTE, so that the output of the parser directly expresses the
semantic requirement that the grouping expressions be gotten from the
grouping output rather than computed some other way.  In the planner,
we first preprocess all the columns of this new RTE and then replace
any Vars in the targetlist and HAVING clause that reference this new
RTE with the underlying grouping expressions, so that we will have
only one instance of a SubPlan node for each subquery contained in the
grouping expressions.

Bump catversion because this changes the querytree produced by the
parser.

Thanks to Tom Lane for the idea to invent a new kind of RTE.

Per reports from Geoff Winkless, Tobias Wendorff, Richard Guo from
various threads.

Author: Richard Guo
Reviewed-by: Ashutosh Bapat, Sutou Kouhei
Discussion: https://postgr.es/m/CAMbWs4_dp7e7oTwaiZeBX8+P1rXw4ThkZxh1QG81rhu9Z47VsQ@mail.gmail.com

Remove emode argument from XLogFileRead() and XLogFileReadAnyTLI()

This change makes the code slightly easier to reason about, because
there is actually no need to know if a specific caller of one of these
routines should fail hard on a PANIC, or just let it go through with a
DEBUG2.

The only caller of XLogFileReadAnyTLI() used DEBUG2, and XLogFileRead()
has never used its emode. This can be simplified since 1bb2558046cc
that has introduced XLogFileReadAnyTLI(), splitting both.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20240906201043.a640f3b44e755d4db2b6943e@sraoss.co.jp

Add WAL usage reporting to ANALYZE VERBOSE output.

This change adds WAL usage reporting to the output of ANALYZE VERBOSE
and autoanalyze reports. It aligns the analyze output with VACUUM,
providing consistency. Additionally, it aids in troubleshooting cases
where WAL records are generated during analyze operations.

Author: Anthonin Bonnefoy
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/CAO6_Xqr__kTTCLkftqS0qSCm-J7_xbRG3Ge2rWhucxQJMJhcRA%40mail.gmail.com

Consistently use PageGetExactFreeSpace() in pgstattuple.

Previously this code used PageGetHeapFreeSpace on heap pages,
and usually used PageGetFreeSpace on index pages (though for some
reason GetHashPageStats used PageGetExactFreeSpace instead).
The difference is that those functions subtract off the size of
a line pointer, and PageGetHeapFreeSpace has some additional
rules about returning zero if adding another line pointer would
require exceeding MaxHeapTuplesPerPage.  Those things make sense
when testing to see if a new tuple can be put on the page, but
they seem pretty strange for pure statistics collection.

Additionally, statapprox_heap had a special rule about counting
a "new" page as being fully available space.  This also seems
strange, because it's not actually usable until VACUUM or some
such process initializes the page.  Moreover, it's inconsistent
with what pgstat_heap does, which is to count such a page as
having zero free space.  So make it work like pgstat_heap, which
as of this patch unconditionally calls PageGetExactFreeSpace.

This is more of a definitional change than a bug fix, so no
back-patch.  The module's documentation doesn't define exactly
what "free space" means either, so we left that as-is.

Frédéric Yhuel, reviewed by Rafia Sabih and Andreas Karlsson.

Discussion: https://postgr.es/m/3a18f843-76f6-4a84-8cca-49537fefa15d@dalibo.com

Don't bother checking the result of SPI_connect[_ext] anymore.

SPI_connect/SPI_connect_ext have not returned any value other than
SPI_OK_CONNECT since commit 1833f1a1c in v10; any errors are thrown
via ereport.  (The most likely failure is out-of-memory, which has
always been thrown that way, so callers had better be prepared for
such errors.)  This makes it somewhat pointless to check these
functions' result, and some callers within our code haven't been
bothering; indeed, the only usage example within spi.sgml doesn't
bother.  So it's likely that the omission has propagated into
extensions too.

Hence, let's standardize on not checking, and document the return
value as historical, while not actually changing these functions'
behavior.  (The original proposal was to change their return type
to "void", but that would needlessly break extensions that are
conforming to the old practice.)  This saves a small amount of
boilerplate code in a lot of places.

Stepan Neretin

Discussion: https://postgr.es/m/CAMaYL5Z9Uk8cD9qGz9QaZ2UBJFOu7jFx5Mwbznz-1tBbPDQZow@mail.gmail.com

Add PQfullProtocolVersion() to surface the precise protocol version.

The existing function PQprotocolVersion() does not include the minor
version of the protocol. In preparation for pending work that will
bump that number for the first time, add a new function to provide it
to clients that may care, using the (major * 10000 + minor)
convention already used by PQserverVersion().

Jacob Champion based on earlier work by Jelte Fennema-Nio

Discussion: http://postgr.es/m/CAOYmi+mM8+6Swt1k7XsLcichJv8xdhPnuNv7-02zJWsezuDL+g@mail.gmail.com

Fix waits of REINDEX CONCURRENTLY for indexes with predicates or expressions

As introduced by f9900df5f94, a REINDEX CONCURRENTLY job done for an
index with predicates or expressions would set PROC_IN_SAFE_IC in its
MyProc->statusFlags, causing it to be ignored by other concurrent
operations.

Such concurrent index rebuilds should never be ignored, as a predicate
or an expression could call a user-defined function that accesses a
different table than the table where the index is rebuilt.

A test that uses injection points is added, backpatched down to 17.
Michail has proposed a different test, but I have added something
simpler with more coverage.

Oversight in f9900df5f949.

Author: Michail Nikolaev
Discussion: https://postgr.es/m/CANtu0oj9A3kZVduFTG0vrmGnKB+DCHgEpzOp0qAyOgmks84j0w@mail.gmail.com
Backpatch-through: 14

SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps

When the ON ERROR / ON EMPTY behavior is to return NULL, returning
NULL directly from ExecEvalJsonExprPath() suffices. Therefore, there's
no need to create separate steps to check the error/empty flag or
those to evaluate the the constant NULL expression. This speeds up
common cases because the default ON ERROR / ON EMPTY behavior for
JSON_QUERY() and JSON_VALUE() is to return NULL. However, these steps
are necessary if the RETURNING type is a domain, as constraints on the
domain may need to be checked.

Reported-by: Jian He <[email protected]>
Author: Jian He <[email protected]>
Author: Amit Langote <[email protected]>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17

Fix order of parameters in a cost_sort call

In label_sort_with_costsize, the cost_sort function is called with the
parameters 'input_disabled_nodes' and 'input_cost' in the wrong order.
This does not cause any plan diffs in the regression tests, because
label_sort_with_costsize is only used to label the Sort node nicely
for EXPLAIN, and cost numbers are not displayed in regression tests.

Oversight in e22253467. Fixed by passing arguments in the right
order.

Per report from Alexander Lakhin running UBSan.

Author: Alexander Lakhin
Discussion: https://postgr.es/m/a9b7231d-68bc-f117-a07c-96688f3e6aef@gmail.com

Add callbacks to control flush of fixed-numbered stats

This commit adds two callbacks in pgstats to have a better control of
the flush timing of pgstat_report_stat(), whose operation depends on the
three PGSTAT_*_INTERVAL variables:
- have_fixed_pending_cb(), to check if a stats kind has any pending
data waiting for a flush. This is used as a fast path if there are no
pending statistics to flush, and this check is done for fixed-numbered
statistics only if there are no variable-numbered statistics to flush.
A flush will need to happen if at least one callback reports any pending
data.
- flush_fixed_cb(), to do the actual flush.

These callbacks are currently used by the SLRU, WAL and IO statistics,
generalizing the concept for all stats kinds (builtin and custom).

The SLRU and IO stats relied each on one global variable to determine
whether a flush should happen; these are now local to pgstat_slru.c and
pgstat_io.c, cleaning up a bit how the pending flush states are tracked
in pgstat.c.

pgstat_flush_io() and pgstat_flush_wal() are still required, but we do
not need to check their return result anymore.

Reviewed-by: Bertrand Drouvot, Kyotaro Horiguchi
Discussion: https://postgr.es/m/[email protected]

Avoid core dump after getpwuid_r failure.

Looking up a nonexistent user ID would lead to a null pointer
dereference. That's unlikely to happen here, but perhaps
not impossible.

Thinko in commit 4d5111b3f, noticed by Coverity.

Update extension lookup routines to use the syscache

The following routines are changed to use the syscache entries added for
pg_extension in 490f869d92e5:
- get_extension_oid()
- get_extension_name()
- get_extension_schema()

A catalog scan is costly and could easily lead to a noticeable
performance impact when called once or more per query, so this is going
to be helpful for developers for extension data lookups.

Author: Andrei Lepikhov
Reviewed-by: Jelte Fennema-Nio
Discussion: https://postgr.es/m/529295b2-6ba9-4dae-acd1-20a9c6fb8f9a@gmail.com

Remove lc_ctype_is_c().

Instead always fetch the locale and look at the ctype_is_c field.

hba.c relies on regexes working for the C locale without needing
catalog access, which worked before due to a special case for
C_COLLATION_OID in lc_ctype_is_c(). Move the special case to
pg_set_regex_collation() now that lc_ctype_is_c() is gone.

Author: Andreas Karlsson
Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se

Fix incorrect pg_stat_io output on 32-bit machines.

pg_stat_get_io() applied TimestampTzGetDatum twice to the
stat_reset_timestamp value. On 64-bit builds that's harmless because
TimestampTzGetDatum is a no-op, but on 32-bit builds it results in
displaying garbage in the stats_reset column of the pg_stat_io view.

Bug dates to commit a9c70b46d which introduced pg_stat_io, so
back-patch to v16 where that came in.

Bertrand Drouvot

Discussion: https://postgr.es/m/[email protected]

Remove useless unconstify

Digging into the history, this was not necessary even when it was
added, but might have been some time before that. In any case, there
is no use for this now.

SQL/JSON: Fix default ON ERROR behavior for JSON_TABLE

Use EMPTY ARRAY instead of EMPTY.

This change does not affect the runtime behavior of JSON_TABLE(),
which continues to return an empty relation ON ERROR. It only alters
whether the default ON ERROR behavior is shown in the deparsed output.

Reported-by: Jian He <[email protected]>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17

SQL/JSON: Fix JSON_TABLE() column deparsing

The deparsing code in get_json_expr_options() unnecessarily emitted
the default column-specific ON ERROR / EMPTY behavior when the
top-level ON ERROR behavior in JSON_TABLE was set to ERROR. Fix that
by not overriding the column-specific default, determined based on
the column's JsonExprOp in get_json_table_columns(), with
JSON_BEHAVIOR_ERROR when that is the top-level ON ERROR behavior.

Note that this only removes redundancy; the current deparsing output
is not incorrect, just redundant.

Reviewed-by: Jian He <[email protected]>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17

Revert recent SQL/JSON related commits

Reverts 68222851d5a8, 565caaa79af, and 3a97460970f, because a few
BF animals didn't like one or all of them.

SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps

When the ON ERROR / ON EMPTY behavior is to return NULL, returning
NULL directly from ExecEvalJsonExprPath() suffices. Therefore, there's
no need to create separate steps to check the error/empty flag or
those to evaluate the the constant NULL expression. This speeds up
common cases because the default ON ERROR / ON EMPTY behavior for
JSON_QUERY() and JSON_VALUE() is to return NULL. However, these steps
are necessary if the RETURNING type is a domain, as constraints on the
domain may need to be checked.

Reported-by: Jian He <[email protected]>
Author: Jian He <[email protected]>
Author: Amit Langote <[email protected]>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17

SQL/JSON: Fix default ON ERROR behavior for JSON_TABLE

Use EMPTY ARRAY instead of EMPTY.

This change does not affect the runtime behavior of JSON_TABLE(),
which continues to return an empty relation ON ERROR. It only alters
whether the default ON ERROR behavior is shown in the deparsed output.

Reported-by: Jian He <[email protected]>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17

SQL/JSON: Fix JSON_TABLE() column deparsing

The deparsing code in get_json_expr_options() unnecessarily emitted
the default column-specific ON ERROR / EMPTY behavior when the
top-level ON ERROR behavior in JSON_TABLE was set to ERROR. Fix that
by not overriding the column-specific default, determined based on
the column's JsonExprOp in get_json_table_columns(), with
JSON_BEHAVIOR_ERROR when that is the top-level ON ERROR behavior.

Note that this only removes redundancy; the current deparsing output
is not incorrect, just redundant.

Reviewed-by: Jian He <[email protected]>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17

Update comment about ExprState.escontext

The updated comment provides more helpful guidance by mentioning that
escontext should be set when soft error handling is needed.

Reported-by: Jian He <[email protected]>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17

Be more careful with error paths in pg_set_regex_collation().

Set global variables after error paths so that they don't end up in an
inconsistent state.

The inconsistent state doesn't lead to an actual problem, because
after an error, pg_set_regex_collation() will be called again before
the globals are accessed.

Change extracted from patch by Andreas Karlsson, though not discussed
explicitly.

Discussion: https://postgr.es/m/60929555-4709-40a7-b136-bcb44cff5a3c@proxel.se

Prevent mis-encoding of "trailing junk after numeric literal" errors.

Since commit 2549f0661, we reject an identifier immediately following
a numeric literal (without separating whitespace), because that risks
ambiguity with hex/octal/binary integers.  However, that patch used
token patterns like "{integer}{ident_start}", which is problematic
because {ident_start} matches only a single byte.  If the first
character after the integer is a multibyte character, this ends up
with flex reporting an error message that includes a partial multibyte
character.  That can cause assorted bad-encoding problems downstream,
both in the report to the client and in the postmaster log file.

To fix, use {identifier} not {ident_start} in the "junk" token
patterns, so that they will match complete multibyte characters.
This seems generally better user experience quite aside from the
encoding problem: for "123abc" the error message will now say that
the error appeared at or near "123abc" instead of "123a".

While at it, add some commentary about why these patterns exist
and how they work.

Report and patch by Karina Litskevich; review by Pavel Borisov.
Back-patch to v15 where the problem came in.

Discussion: https://postgr.es/m/CACiT8iZ_diop=0zJ7zuY3BXegJpkKK1Av-PU7xh0EDYHsa5+=g@mail.gmail.com

Fix handling of NULL return value in typarray lookup

Commit 6ebeeae29 accidentally omitted testing the return value from
findTypeByOid which can return NULL. Fix by adding a check to make
sure that we have a pointer to dereference.

Author: Ranier Vilela <[email protected]>
Reviewed-by: Nathan Bossart <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/CAEudQAqfMTH8Ya_J6E-NW_y_JyDFDxtQ4V_g6nY_1=0oDbQqdg@mail.gmail.com

Fix misleading error message context

Author: Pavel Stehule <[email protected]>
Reviewed-by: Stepan Neretin <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CAFj8pRAw+OkVW=FgMKHKyvY3CgtWy3cWdY7XT+S5TJaTttu=oA@mail.gmail.com

Add callback for backend initialization in pgstats

pgstat_initialize() is currently used by the WAL stats as a code path to
take some custom actions when a backend starts. A callback is added to
generalize the concept so as all stats kinds can do the same, for
builtin and custom kinds, if set.

Reviewed-by: Bertrand Drouvot, Kyotaro Horiguchi
Discussion: https://postgr.es/m/[email protected]