git.postgresql.org Git - postgresql.git/log

Fix extreme skew detection in Parallel Hash Join.

After repartitioning the inner side of a hash join that would have
exceeded the allowed size, we check if all the tuples from a parent
partition moved to one child partition.  That is evidence that it
contains duplicate keys and later attempts to repartition will also
fail, so we should give up trying to limit memory (for lack of a better
fallback strategy).

A thinko prevented the check from working correctly in partition 0 (the
one that is partially loaded into memory already).  After
repartitioning, we should check for extreme skew if the *parent*
partition's space_exhausted flag was set, not the child partition's.
The consequence was repeated futile repartitioning until per-partition
data exceeded various limits including "ERROR: invalid DSA memory alloc
request size 1811939328", OS allocation failure, or temporary disk space
errors.  (We could also do something about some of those symptoms, but
that's material for separate patches.)

This problem only became likely when PostgreSQL 16 introduced support
for Parallel Hash Right/Full Join, allowing NULL keys into the hash
table.  Repartitioning always leaves NULL in partition 0, no matter how
many times you do it, because the hash value is all zero bits.  That's
unlikely for other hashed values, but they might still have caused
wasted extra effort before giving up.

Back-patch to all supported releases.

Reported-by: Craig Milhiser <[email protected]>
Reviewed-by: Andrei Lepikhov <[email protected]>
Discussion: https://postgr.es/m/CA%2BwnhO1OfgXbmXgC4fv_uu%3DOxcDQuHvfoQ4k0DFeB0Qqd-X-rQ%40mail.gmail.com

Remove superfluous forward declaration

The need for this was removed by commit dc9c3b0ff21.

Fix whitespace

Fix unnecessary casts of copyObject() result

The result is already of the correct type, so these casts don't do
anything.

Reviewed-by: Nathan Bossart <[email protected]>
Reviewed-by: Tender Wang <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/637eeea8-5663-460b-a114-39572c0f6c6e%40eisentraut.org

Improve node type forward reference

Instead of using Node *, we can use an incomplete struct. That way,
everything has the correct type and fewer casts are required. This
technique is already used elsewhere in node type definitions.

Reviewed-by: Nathan Bossart <[email protected]>
Reviewed-by: Tender Wang <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/637eeea8-5663-460b-a114-39572c0f6c6e%40eisentraut.org

jsonapi: fully initialize dummy lexer

Valgrind reports that checks on lex->inc_state are undefined for the
"dummy lexer" used for incremental parsing, since it's only partially
initialized on the stack. This was introduced in 0785d1b8b2.
Zero-initialize the whole struct.

Author: Jacob Champion <[email protected]>
Reported-by: Alexander Lakhin <[email protected]>
Discussion: https://www.postgresql.org/message-id/CAOYmi+n9QWr4gsAADZc6qFQjFViXQYVk=gBy_EvxuqsgPJcb_g@mail.gmail.com

Fix unusual include style

Project-internal header files should be included using " ", not < >.

Don't store intermediate hash values in ExprState->resvalue

adf97c156 made it so ExprStates could support hashing and changed Hash
Join to use that instead of manually extracting Datums from tuples and
hashing them one column at a time.

When hashing multiple columns or expressions, the code added in that
commit stored the intermediate hash value in the ExprState's resvalue
field. That was a mistake as steps may be injected into the ExprState
between each hashing step that look at or overwrite the stored
intermediate hash value. EEOP_PARAM_SET is an example of such a step.

Here we fix this by adding a new dedicated field for storing
intermediate hash values and adjust the code so that all apart from the
final hashing step store their result in the intermediate field.

In passing, rename a variable so that it's more aligned to the
surrounding code and also so a few lines stay within the 80 char margin.

Reported-by: Andres Freund
Reviewed-by: Alena Rybakina <[email protected]>
Discussion: https://postgr.es/m/CAApHDvqo9eenEFXND5zZ9JxO_k4eTA4jKMGxSyjdTrsmYvnmZw@mail.gmail.com

Fix validation of COPY FORCE_NOT_NULL/FORCE_NULL for the all-column cases

This commit adds missing checks for COPY FORCE_NOT_NULL and FORCE_NULL
when applied to all columns via "*". These options now correctly
require CSV mode and are disallowed in COPY TO, making their behavior
consistent with FORCE_QUOTE.

Some regression tests are added to verify the correct behavior for the
all-columns case, including FORCE_QUOTE, which was not tested.

Backpatch down to 17, where support for the all-column grammar with
FORCE_NOT_NULL and FORCE_NULL has been added.

Author: Joel Jacobson
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/65030d1d-5f90-4fa4-92eb-f5f50389858e@app.fastmail.com
Backpatch-through: 17

Rewrite some regression queries for option checks with COPY

Some queries in copy2 are there to check various option combinations,
and used "stdin" or "stdout" incompatible with the COPY TO or FROM
clauses combined with them, which was confusing.  This commit rewrites
these queries to use a compatible grammar.

The coverage of the tests is unchanged.  Like the original commit
451d1164b9d0, backpatch down to 16 where these have been introduced.  A
follow-up commit will rely on this area of the tests for a bug fix.

Author: Joel Jacobson
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/65030d1d-5f90-4fa4-92eb-f5f50389858e@app.fastmail.com
Backpatch-through: 16

nbtree: fix read page recheck typo.

Oversight in commit 79fa7b3b.

Further refine _SPI_execute_plan's rule for atomic execution.

Commit 2dc1deaea turns out to have been still a brick shy of a load,
because CALL statements executing within a plpgsql exception block
could still pass the wrong snapshot to stable functions within the
CALL's argument list.  That happened because standard_ProcessUtility
forces isAtomicContext to true if IsTransactionBlock is true, which
it always will be inside a subtransaction.  Then ExecuteCallStmt
would think it does not need to push a new snapshot --- but
_SPI_execute_plan didn't do so either, since it thought it was in
nonatomic mode.

The best fix for this seems to be for _SPI_execute_plan to operate
in atomic execution mode if IsSubTransaction() is true, even when the
SPI context as a whole is non-atomic.  This makes _SPI_execute_plan
have the same rules about when non-atomic execution is allowed as
_SPI_commit/_SPI_rollback have about when COMMIT/ROLLBACK are allowed,
which seems appropriately symmetric.  (If anyone ever tries to allow
COMMIT/ROLLBACK inside a subtransaction, this would all need to be
rethought ... but I'm unconvinced that such a thing could be logically
consistent at all.)

For further consistency, also check IsSubTransaction() in
SPI_inside_nonatomic_context.  That does not matter for its
one present-day caller StartTransaction, which can't be reached
inside a subtransaction.  But if any other callers ever arise,
they'd presumably want this definition.

Per bug #18656 from Alexander Alehin.  Back-patch to all
supported branches, like previous fixes in this area.

Discussion: https://postgr.es/m/18656-cade1780866ef66c@postgresql.org

Whitespace fixup from generated unicode tables.

When running the 'update-unicode' build target, generate files that
conform to pgindent whitespace rules.

Fix #include order from e839c8ecc9.

Reported-by: Alexander Korotkov
Discussion: https://postgr.es/m/CAPpHfduAiGSsvUc614Z-JOnyQffcMeJncWMF2HnUL8wFy4fuWA@mail.gmail.com

Reduce memory block size for decoded tuple storage to 8kB.

Commit a4ccc1cef introduced the Generation Context and modified the
logical decoding process to use a Generation Context with a fixed
block size of 8MB for storing tuple data decoded during logical
decoding (i.e., rb->tup_context). Several reports have indicated that
the logical decoding process can be terminated due to
out-of-memory (OOM) situations caused by excessive memory usage in
rb->tup_context.

This issue can occur when decoding a workload involving several
concurrent transactions, including a long-running transaction that
modifies tuples. By design, the Generation Context does not free a
memory block until all chunks within that block are
released. Consequently, if tuples modified by the long-running
transaction are stored across multiple memory blocks, these blocks
remain allocated until the long-running transaction completes, leading
to substantial memory fragmentation. The memory usage during logical
decoding, tracked by rb->size, does not account for memory
fragmentation, resulting in potentially much higher memory consumption
than the value of the logical_decoding_work_mem parameter.

Various improvement strategies were discussed in the relevant
thread. This change reduces the block size of the Generation Context
used in rb->tup_context from 8MB to 8kB. This modification
significantly decreases the likelihood of substantial memory
fragmentation occurring and is relatively straightforward to
backport. Performance testing across multiple platforms has confirmed
that this change will not introduce any performance degradation that
would impact actual operation.

Backport to all supported branches.

Reported-by: Alex Richman, Michael Guissine, Avi Weinberg
Reviewed-by: Amit Kapila, Fujii Masao, David Rowley
Tested-by: Hayato Kuroda, Shlok Kyal
Discussion: https://postgr.es/m/CAD21AoBTY1LATZUmvSXEssvq07qDZufV4AF-OHh9VD2pC0VY2A%40mail.gmail.com
Backpatch-through: 12

ecpg: fix some minor mishandling of bad input in preprocessor.

Avoid null-pointer crash when considering a cursor declaration
that's outside any C function (a case which is useless anyway).

Ensure a cursor for a prepared statement is marked as initially
not open. At worst, if we chanced to get not-already-zeroed memory
from malloc(), this oversight would result in failing to issue a
"cursor "foo" has been declared but not opened" warning that would
have been appropriate.

Avoid running off the end of the buffer when there are mismatched
square brackets following a variable name. This could lead to
SIGSEGV after reaching the end of memory.

Given the lack of field complaints, none of these seem to be worth
back-patching, but let's clean them up in HEAD.

Per valgrind testing by Alexander Lakhin.

Discussion: https://postgr.es/m/5f5bcecd-d7ec-b8c0-6c92-d1a7c6e0f639@gmail.com

Normalize nbtree truncated high key array behavior.

Commit 5bf748b8 taught nbtree ScalarArrayOp index scans to decide when
and how to start the next primitive index scan based on physical index
characteristics.  This included rules for deciding whether to start a
new primitive index scan (or whether to move onto the right sibling leaf
page instead) that specifically consider truncated lower-order columns
(-inf columns) from leaf page high keys.

These omitted columns were treated as satisfying the scan's required
scan keys, though only for scan keys marked required in the current scan
direction (forward).  Scan keys that didn't get this behavior (those
marked required in the backwards direction only) usually didn't give the
scan reasonable cause to reposition itself to a later leaf page (via
another descent of the index in _bt_first), but _bt_advance_array_keys
would nevertheless always give up by forcing another call to _bt_first.

_bt_advance_array_keys was unwilling to allow the scan to continue onto
the next leaf page, to reconsider whether we really should start another
primitive scan based on the details of the sibling page's tuples.  This
didn't match its behavior with similar cases involving keys required in
the current scan direction (forward), which seems unprincipled.  It led
to an excessive number of primitive scans/index descents for queries
with a higher-order = array scan key (with dense, contiguous values)
mixed with a lower-order required > or >= scan key.

Bring > and >= strategy scan keys in line with other required scan key
types: treat truncated -inf scan keys as having satisfied scan keys
required in either scan direction (forwards and backwards alike) during
array advancement.  That way affected scans can continue to the right
sibling leaf page.  Advancement must now schedule an explicit recheck of
the right sibling page's high key in cases involving > or >= scan keys.
The recheck gives the scan a way to back out and start another primitive
index scan (we can't just rely on _bt_checkkeys with > or >= scan keys).

This work can be considered a stand alone optimization on top of the
work from commit 5bf748b8.  But it was written in preparation for an
upcoming patch that will add skip scan to nbtree.  In practice scans
that use "skip arrays" will tend to be much more sensitive to any
implementation deficiencies in this area.

Author: Peter Geoghegan <[email protected]>
Reviewed-By: Tomas Vondra <[email protected]>
Discussion: https://postgr.es/m/CAH2-Wz=9A_UtM7HzUThSkQ+BcrQsQZuNhWOvQWK06PRkEp=SKQ@mail.gmail.com

Fix typo in comment of transformJsonAggConstructor()

An oversight of 3a8a1f3254b.

Reported-by: Tender Wang <[email protected]>
Author: Tender Wang <[email protected]>
Backpatch-through: 16

initdb: Change default to using data checksums.

Checksums are now on by default. They can be disabled by the
previously added option --no-data-checksums.

Author: Greg Sabino Mullane <[email protected]>
Reviewed-by: Nathan Bossart <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com

doc: Fix initdb option xreflabels

Generally, we don't want any overriding xreflabels in the options
list, so that we can link to options and the link renders as the
option name. The -g option did this differently and config.sgml made
use of that for a link. The new --no-data-checksums option (commit
983a588e0b8) apparently copied this pattern, but that seems like the
wrong direction, as a future patch revealed.

To fix, remove the two xreflabels and rewrite the link in config.sgml
with an explicit link text.

Add type cast to foreach_internal's loop variable.

C++ requires explicitly casting void pointers to the appropriate
pointer type, which means the foreach_ptr macro cannot be used in
C++ code without this change.

Author: Jelte Fennema-Nio
Reviewed-by: Bruce Momjian
Discussion: https://postgr.es/m/CAGECzQSYG3QfHrc-rOk2KbnB9iJOd7Qu-Xii1s-GTA%3D3JFt49Q%40mail.gmail.com
Backpatch-through: 17

Move clause_sides_match_join() into restrictinfo.h

Two near-identical copies of clause_sides_match_join() existed in
joinpath.c and analyzejoins.c. Deduplicate this by moving the function
into restrictinfo.h.

It isn't quite clear that keeping the inline property of this function
is worthwhile, but this commit is just an exercise in code
deduplication. More effort would be required to determine if the inline
property is worth keeping.

Author: James Hunter <[email protected]>
Discussion: https://postgr.es/m/CAJVSvF7Nm_9kgMLOch4c-5fbh3MYg%3D9BdnDx3Dv7Fcb64zr64Q%40mail.gmail.com

Add contrib/pg_logicalinspect.

This module provides SQL functions that allow to inspect logical
decoding components.

It currently allows to inspect the contents of serialized logical
snapshots of a running database cluster, which is useful for debugging
or educational purposes.

Author: Bertrand Drouvot
Reviewed-by: Amit Kapila, Shveta Malik, Peter Smith, Peter Eisentraut
Reviewed-by: David G. Johnston
Discussion: https://postgr.es/m/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal

Move SnapBuild and SnapBuildOnDisk structs to snapshot_internal.h.

This commit moves the definitions of the SnapBuild and SnapBuildOnDisk
structs, related to logical snapshots, to the snapshot_internal.h
file. This change allows external tools, such as
pg_logicalinspect (with an upcoming patch), to access and utilize the
contents of logical snapshots.

Author: Bertrand Drouvot
Reviewed-by: Amit Kapila, Shveta Malik, Peter Smith
Discussion: https://postgr.es/m/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal

ecpg: invent a saner syntax for ecpg.addons entries.

Put the rule type at the start not the end, and put spaces
between the constitutent token names instead of smashing them
into an illegible mess. This has no functional impact but
I think it makes the rules a great deal more readable.

Discussion: https://postgr.es/m/1185216.1724001216@sss.pgh.pa.us

Add commit 7f7474a8e4 to .git-blame-ignore-revs.

ecpg: add cross-checks to parse.pl for usage of internal tables.

parse.pl contains several constant tables that describe tweaks
to be made to the backend grammar. In the same spirit as
00b0e7204, add cross-checks that each table entry is used at
least once (or exactly once if that's appropriate). This should
help catch cases where adjustments to the backend grammar cause
a table entry not to match as expected.

Per suggestion from Michael Paquier.

Discussion: https://postgr.es/m/[email protected]

Move libc-specific code from pg_locale.c into pg_locale_libc.c.

Move implementation of pg_locale_t code for libc collations into
pg_locale_libc.c. Other locale-related code, such as
pg_perm_setlocale(), remains in pg_locale.c for now.

Discussion: https://postgr.es/m/flat/2830211e1b6e6a2e26d845780b03e125281ea17b [email protected]

ecpg: avoid breaking the IDENT precedence level in two.

Careless string hacking caused parse.pl to transform gram.y's
declaration

%nonassoc IDENT PARTITION RANGE ROWS ...

into

%nonassoc IDENT
%nonassoc CSTRING PARTITION RANGE ROWS ...

It turns out that this has no semantic impact, because the
generated preproc.c is exactly the same either way (if you
inject a blank line to keep line numbers the same).

Nonetheless, given the great emphasis that the commentary in
gram.y places on keeping those other keywords at the same
precedence level as IDENT, this seems like foolishly risking ecpg
behaving differently from the core parser. Adjust the code so
that CSTRING is added to the precedence line without breaking it
into two lines.

Discussion: https://postgr.es/m/2157151.1713540065@sss.pgh.pa.us

Move ICU-specific code from pg_locale.c into pg_locale_icu.c.

Discussion: https://postgr.es/m/flat/2830211e1b6e6a2e26d845780b03e125281ea17b [email protected]

ecpg: improve preprocessor's memory management.

Invent a notion of "local" storage that will automatically be
reclaimed at the end of each statement. Use this for location
strings as well as other visibly short-lived data within the parser.

Also, make cat_str and make_str return local storage and not free
their inputs, which allows dispensing with a whole lot of retail
mm_strdup calls. We do have to add some new ones in places where
a local-lifetime string needs to be added to a longer-lived data
structure, but on balance there are a lot less mm_strdup calls than
before.

In hopes of flushing out places where changes were necessary,
I changed YYLTYPE from "char *" to "const char *", which forced
const-ification of various function arguments that probably
should've been like that all along.

This still leaks somewhat more memory than v17, but that will be
cleaned up in future commits.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us

ecpg: move some functions into a new file ecpg/preproc/util.c.

mm_alloc and mm_strdup were in type.c, which seems a completely
random choice.  No doubt the original author thought two small
functions didn't deserve their own file.  But I'm about to add
some more memory-management stuff beside them, so let's put them
in a less surprising place.  This seems like a better home for
mmerror, mmfatal, and the cat_str/make_str family, too.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us

ecpg: re-implement preprocessor's string management.

Most productions in the preprocessor grammar construct strings
representing SQL or C statements or fragments thereof.  Instead
of returning these as <str> results of the productions, return
them as "location" values, taking advantage of Bison's flexibility
about what a location is.  We aren't really giving up anything
thereby, since ecpg's error reports have always just given line
numbers, and that's tracked separately.  The advantage of this
is that a single instance of the YYLLOC_DEFAULT macro can
perform all the work needed by the vast majority of productions,
including all the ones made automatically by parse.pl.  This
avoids having large numbers of effectively-identical productions,
which tickles an optimization inefficiency in recent versions of
clang.  (This patch reduces the compilation time for preproc.o
by more than 100-fold with clang 16, and is visibly helpful with
gcc too.)  The compiled parser is noticeably smaller as well.

A disadvantage of this approach is that YYLLOC_DEFAULT is applied
before running the production's semantic action (if any).  This
means it cannot use the method favored by cat_str() of free'ing
all the input strings; if the action needs to look at the input
strings, it'd be looking at dangling storage.  As this stands,
therefore, it leaks memory like a sieve.  This is already a big
patch though, and fixing the memory management seems like a
separable problem, so let's leave that for the next step.
(This does remove some free() calls that I'd have had to touch
anyway, in the expectation that the next step will manage
memory reclamation quite differently.)

Most of the changes here are mindless substitution of "@N" for
"$N" in grammar rules; see the changes to README.parser for
an explanation.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us

ecpg: major cleanup, simplification, and documentation of parse.pl.

Remove a lot of cruft, clean up and document what's left.
This produces the same preproc.y output as before, except for
fewer blank lines. (It's not like we're making any attempt to
match the layout of gram.y, so I removed the one bit of logic
that seemed to have that in mind.)

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us

ecpg: remove check_rules.pl.

As noted in the previous commit, check_rules.pl is now entirely
redundant with checks made by parse.pl, or would be if it weren't
for the places where it's wrong. It's a waste of build cycles
and maintenance effort, so remove it.

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us

ecpg: clean up documentation of parse.pl, and add more input checking.

README.parser is the user's manual, such as it is, for parse.pl.
It's rather poorly written if you ask me; so try to improve it.
(More could be written here, but this at least covers the same
info in a more organized fashion.)

Also, the single solitary line of usage info in parse.pl itself
was a lie.  Replace.

Add some error checks that the ecpg.addons entries meet the syntax
rules set forth in README.parser.  One of them didn't, but
accidentally worked anyway because the logic in include_addon is
such that 'block' is the default behavior.

Also add a cross-check that each ecpg.addons entry is matched exactly
once in the backend grammar.  This exposed that there are two dead
entries there --- they are dead because the %replace_types table in
parse.pl causes their nonterminals to be ignored altogether.
Removing them doesn't change the generated preproc.y file.

(This implies that check_rules.pl is completely worthless and should
be nuked: it adds build cycles and maintenance effort while failing
to reliably accomplish its one job of detecting dead rules.  I'll
do that separately.)

Discussion: https://postgr.es/m/2011420.1713493114@sss.pgh.pa.us

Remove obsolete comment in reorderbuffer.h.

Commit 9fab40ad32e changed ReorderBuffer to use Slab Context for
allocating ReorderBufferTXN entries instead of using a caching
mechanism. The txn->node is no longer used as an element of the list
of preallocated ReorderBufferTXNs.

Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAD21AoB1CTnX66Ji3zTCnjoPVC9OzYe0B6LygUHcxEB2RV-hFw%40mail.gmail.com

Use construct_array_builtin for FLOAT8OID instead of construct_array.

Commit d746021de1 introduced construct_array_builtin() for built-in
data types, but forgot some replacements linked to FLOAT8OID.

Author: Bertrand Drouvot
Reviewed-by: Peter Eisentraut
Discussion: https://postgr.es/m/CAD21AoCERkwmttY44dqUw%3Dm_9QCctu7W%2Bp6B7w_VqxRJA1Qq_Q%40mail.gmail.com

Track scan reversals in MergeJoin

The MergeJoin struct was tracking "mergeStrategies", which were an
array of btree strategy numbers, purely for the purpose of comparing
it later against btree strategies to determine if the scan direction
was forward or reverse. Change that. Instead, track
"mergeReversals", an array of bool, to indicate the same without an
unfortunate assumption that a strategy number refers specifically to a
btree strategy.

Author: Mark Dilger <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com

Track sort direction in SortGroupClause

Functions make_pathkey_from_sortop() and transformWindowDefinitions(),
which receive a SortGroupClause, were determining the sort order
(ascending vs. descending) by comparing that structure's operator
strategy to BTLessStrategyNumber, but could just as easily have gotten
it from the SortGroupClause object, if it had such a field, so add
one. This reduces the number of places that hardcode the assumption
that the strategy refers specifically to a btree strategy, rather than
some other index AM's operators.

Author: Mark Dilger <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com

Allow TAP tests to force checksums off when calling init()

TAP tests can write

    $node->init(no_data_checksums => 1);

to initialize a cluster explicitly without checksums.  Currently, this
is the default, but this change allows running all tests with
checksums enabled, like

    PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...

And this also prepares the tests for when we switch the default to
checksums enabled.

The pg_checksums tests need to disable checksums so it can test its
own functionality of enabling checksums.  The amcheck/pg_amcheck tests
need to disable checksums because they manually introduce corruption
that they want to detect, but with checksums enabled, the checksum
verification will fail before they even get to their work.

Author: Greg Sabino Mullane <[email protected]>
Reviewed-by: Nathan Bossart <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com

Run pgperltidy on newly-added test code

From commit 85ec945b78 (but apparently not caught by 05d1b9b5c2).

doc: Add anchors for COPY format descriptions

When answering support questions online it's helpful to be able to
refer to the specific format by using an anchored link.

Author: Dagfinn Ilmari Mannsåker <[email protected]>
Discussion: https://postgr.es/m/[email protected]

Remove traces of BeOS.

Commit 15abc7788e6 tolerated namespace pollution from BeOS system
headers. Commit 44f902122 de-supported BeOS. Since that stuff didn't
make it into the Meson build system, synchronize by removing from
configure.

Author: Thomas Munro <[email protected]>
Reviewed-by: Peter Eisentraut <[email protected]>
Reviewed-by: Heikki Linnakangas <[email protected]>
Reviewed-by: Japin Li <[email protected]>
Reviewed-by: Tom Lane <[email protected]> (the idea, not the patch)
Discussion: https://postgr.es/m/ME3P282MB3166F9D1F71F787929C0C7E7B6312%40ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM

psql: Fix \watch when using interval values less than 1ms

Attempting to use an interval of time less than 1ms would cause \watch
to hang. This was confusing, so let's change the logic so as an
interval lower than 1ms behaves the same as 0.

Comments are added to mention that the internals of do_watch() had
better rely on "sleep_ms", the interval value in milliseconds. While on
it, this commit adds a test to check the behavior of interval values
less than 1ms.

\watch hanging for interval values less than 1ms existed before
6f9ee74d45aa, that has changed the code to support an interval value of
0.

Reported-by: Heikki Linnakangas
Author: Andrey M. Borodin, Michael Paquier
Discussion: https://postgr.es/m/88445e0e-3156-4b9d-afae-9a1a7b1631f6@iki.fi
Backpatch-through: 16

Fixup for pg_set_relation_stats().

Reported-by: Noriyoshi Shinoda
Discussion: https://postgr.es/m/DM4PR84MB17345E2DFF28A5557B7CBC3CEE7A2@DM4PR84MB1734.NAMPRD84.PROD.OUTLOOK.COM

Use MAX_PARALLEL_WORKER_LIMIT for max_parallel_maintenance_workers

max_parallel_maintenance_workers has been introduced in 9da0cc35284b,
and used a hardcoded limit of 1024 rather than this variable.

max_parallel_workers and max_parallel_workers_per_gather already used
MAX_PARALLEL_WORKER_LIMIT (1024) as their upper-bound since
6599c9ac3340.

Author: Matthias van de Meent
Reviewed-by: Zhang Mingli
Discussion: https://postgr.es/m/CAEze2WiCiJD+8Wig_wGPyn4vgdPjbnYXy2Rw+9KYi6izTMuP=w@mail.gmail.com

Correctly identify which EC members are computable at a plan node.

find_computable_ec_member() had the wrong mental model of what
its primary caller prepare_sort_from_pathkeys() would do with
the selected EquivalenceClass member expression.  We will not
compute the EC expression in a plan node atop the one returning
the passed-in targetlist; rather, the EC expression will be
computed as an additional column of that targetlist.  So any
Var or quasi-Var used in the given tlist is also available to the
EC expression.  In simple cases this makes no difference because
the given tlist is just a list of Vars or quasi-Vars --- but if
we are considering an appendrel member produced by flattening
a UNION ALL, the tlist may contain expressions, resulting in
failure to match and a "could not find pathkey item to sort"
error.

To fix, we can flatten both the tlist and the EC members with
pull_var_clause(), and then just check for subset-ness, so
that the code is actually shorter than before.

While this bug is quite old, the present patch only works back to
v13.  We could possibly make it work in v12 by back-patching parts
of 375398244.  On the whole though I don't like the risk/reward
ratio of that idea.  v12's final release is next month, meaning
there would be no chance to correct matters if the patch causes a
regression.  Since this failure has escaped notice for 14 years,
it's likely nobody will hit it in the field with v12.

Per bug #18652 from Alexander Lakhin.

Andrei Lepikhov and Tom Lane

Discussion: https://postgr.es/m/18652-deaa782ebcca85d1@postgresql.org

Fix missed case for builtin collation provider.

A missed check for the builtin collation provider could result in
falling through to call isalpha().

This does not appear to have practical consequences because it only
happens for characters in the ASCII range. Regardless, the builtin
provider should not be calling libc functions, so backpatch.

Discussion: https://postgr.es/m/1bd5a0a5192f82c22ee7527e825b18ab0028b2c7 [email protected]
Backpatch-through: 17

Create functions pg_set_relation_stats, pg_clear_relation_stats.

These functions are used to tweak statistics on any relation, provided
that the user has MAINTAIN privilege on the relation, or is the database
owner.

Bump catalog version.

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=eErgzn7ECDpwFcptJKOk9SxZEk5Pot4d94eVTZsvj3gw@mail.gmail.com

Avoid mixing custom and OpenSSL BIO functions

PostgreSQL has for a long time mixed two BIO implementations, which can
lead to subtle bugs and inconsistencies. This cleans up our BIO by just
just setting up the methods we need. This patch does not introduce any
functionality changes.

The following methods are no longer defined due to not being needed:

  - gets: Not used by libssl
  - puts: Not used by libssl
  - create: Sets up state not used by libpq
  - destroy: Not used since libpq use BIO_NOCLOSE, if it was used it close
             the socket from underneath libpq
  - callback_ctrl: Not implemented by sockets

The following methods are defined for our BIO:

  - read: Used for reading arbitrary length data from the BIO. No change
          in functionality from the previous implementation.
  - write: Used for writing arbitrary length data to the BIO. No change
           in functionality from the previous implementation.
  - ctrl: Used for processing ctrl messages in the BIO (similar to ioctl).
          The only ctrl message which matters is BIO_CTRL_FLUSH used for
          writing out buffered data (or signal EOF and that no more data
          will be written). BIO_CTRL_FLUSH is mandatory to implement and
          is implemented as a no-op since there is no intermediate buffer
          to flush.
          BIO_CTRL_EOF is the out-of-band method for signalling EOF to
          read_ex based BIO's. Our BIO is not read_ex based but someone
          could accidentally call BIO_CTRL_EOF on us so implement mainly
          for completeness sake.

As the implementation is no longer related to BIO_s_socket or calling
SSL_set_fd, methods have been renamed to reference the PGconn and Port
types instead.

This also reverts back to using BIO_set_data, with our fallback, as a small
optimization as BIO_set_app_data require the ex_data mechanism in OpenSSL.

Author: David Benjamin <[email protected]>
Reviewed-by: Andres Freund <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/CAF8qwaCZ97AZWXtg_y359SpOHe+HdJ+p0poLCpJYSUxL-8Eo8A@mail.gmail.com

Add pg_ls_summariesdir().

This function returns the name, size, and last modification time of
each regular file in pg_wal/summaries. This allows administrators
to grant privileges to view the contents of this directory without
granting privileges on pg_ls_dir(), which allows listing the
contents of many other directories. This commit also gives the
pg_monitor predefined role EXECUTE privileges on the new
pg_ls_summariesdir() function.

Bumps catversion.

Author: Yushi Ogiwara
Reviewed-by: Michael Paquier, Fujii Masao
Discussion: https://postgr.es/m/a0a3af15a9b9daa107739eb45aa9a9bc%40oss.nttdata.com

Mark consume_xids test functions VOLATILE and PARALLEL UNSAFE

Both functions advance the transaction ID, which modifies the system
state. Thus, they should be marked as VOLATILE.

Additionally, they call the AssignTransactionId function, which cannot
be invoked in parallel mode, so they should be marked as PARALLEL
UNSAFE.

Author: Yushi Ogiwara <[email protected]>
Discussion: https://www.postgresql.org/message-id/18f01e4fd46448f88c7a1363050a9955@oss.nttdata.com

Fix typo in connection limits test

Spotted while doing post-commit review.

Use deconstruct_array_builtin instead of deconstruct_array

Commit 062a84442424 introduced use of deconstruct_array when
deconstruct_array_builtin can be used instead. Do that to save some
code.

Author: Bertrand Drouvot <[email protected]>
Discussion: https://postgr.es/m/[email protected]

pgbench: Improve result outputs related to failed transactions.

Previously, per-script statistics were never output when all
transactions failed due to serialization or deadlock errors. However,
it is reasonable to report such information if there are ones even
when there are no successful transaction since these failed
transactions are now objects to be reported.

Meanwhile, if the total number of successful, skipped, and failed
transactions is zero, we don't have to report the number of failed
transactions as similar to the number of skipped transactions, which
avoids to print "NaN%" in lines on failed transaction reports.

Also, the number of transactions in per-script results now includes
skipped and failed transactions. It prevents to print "total of NaN%"
when any transactions are not successfully processed. The number of
transactions actually processed per-script and TPS based on it are now
output explicitly in a separate line.

Author: Yugo Nagata
Reviewed-by: Tatsuo Ishii
Discussion: https://postgr.es/m/20240921003544.2436ef8da9c5c8cb963c651b%40sraoss.co.jp

Adjust EXPLAIN's output for disabled nodes

c01743aa4 added EXPLAIN output to display the plan node's disabled_node
count whenever that count is above 0.  Seemingly, there weren't many
people who liked that output as each parent of a disabled node would
also have a "Disabled Nodes" output due to the way disabled_nodes is
accumulated towards the root plan node.  It was often hard and sometimes
impossible to figure out which nodes were disabled from looking at
EXPLAIN.  You might think it would be possible to manually add up the
numbers from the "Disabled Nodes" output of a given node's children to
figure out if that node has a higher disabled_nodes count than its
children, but that wouldn't have worked for Append and Merge Append nodes
if some disabled child nodes were run-time pruned during init plan.  Those
children are not displayed in EXPLAIN.

Here we attempt to improve this output by only showing "Disabled: true"
against only the nodes which are explicitly disabled themselves.  That
seems to be the output that's desired by the most people who voiced
their opinion.  This is done by summing up the disabled_nodes of the
given node's children and checking if that number is less than the
disabled_nodes of the current node.

This commit also fixes a bug in make_sort() which was neglecting to set
the Sort's disabled_nodes field.  This should have copied what was done
in cost_sort(), but it hadn't been updated.  With the new output, the
choice to not maintain that field properly was clearly wrong as the
disabled-ness of the node was attributed to the Sort's parent instead.

Reviewed-by: Laurenz Albe, Alena Rybakina
Discussion: https://postgr.es/m/9e4ad616bebb103ec2084bf6f724cfc739e7fabb [email protected]

Don't hard-code the input file name in gen_tabcomplete.pl's output.

Use $ARGV[0], that is the specified input file name, in #line
directives generated by gen_tabcomplete.pl. This makes code
coverage reports work properly in the meson build system (where
the input file name will be a relative path).

Also fix up brain fade in the meson build rule for tab-complete.c:
we only need to write the input file name once not twice.

Jacob Champion (some cosmetic adjustments by me)

Discussion: https://postgr.es/m/CAOYmi+=+oWAoi8pqnH0MJQqsSn4ddzqDhqRQJvyiN2aJSWvw2w@mail.gmail.com

Avoid possible segfault in psql's tab completion.

Fix oversight in bd1276a3c: the "words_after_create" stanza in
psql_completion() requires previous_words_count > 0, since it uses
prev_wd. This condition was formerly assured by the if-else chain
above it, but no more. If there were no previous words then we'd
dereference an uninitialized pointer, possibly causing a segfault.

Report and patch by Anthonin Bonnefoy.

Discussion: https://postgr.es/m/CAO6_XqrSRE7c_i+D7Hm07K3+6S0jTAmMr60RY41XzaA29Ae5uA@mail.gmail.com

Unbreak overflow test for attinhcount/coninhcount

Commit 90189eefc1e1 narrowed pg_attribute.attinhcount and
pg_constraint.coninhcount from 32 to 16 bits, but kept other related
structs with 32-bit wide fields: ColumnDef and CookedConstraint contain
an int 'inhcount' field which is itself checked for overflow on
increments, but there's no check that the values aren't above INT16_MAX
before assigning to the catalog columns. This means that a creative
user can get a inconsistent table definition and override some
protections.

Fix it by changing those other structs to also use int16.

Also, modernize style by using pg_add_s16_overflow for overflow testing
instead of checking for negative values.

We also have Constraint.inhcount, which is here removed completely.
This was added by commit b0e96f311985 and not removed by its revert at
6f8bb7c1e961. It is not needed by the upcoming not-null constraints
patch.

This is mostly academic, so we agreed not to backpatch to avoid ABI
problems.

Bump catversion because of the changes to parse nodes.

Co-authored-by: Álvaro Herrera <[email protected]>
Co-authored-by: 何建 (jian he) <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Discussion: https://postgr.es/m/202410081611 [email protected]

Improve descriptions of some pg_stat_checkpoints functions in pg_proc.dat.

Previously, the descriptions of pg_stat_get_checkpointer_num_requested(),
pg_stat_get_checkpointer_restartpoints_requested(),
and pg_stat_get_checkpointer_restartpoints_performed() in pg_proc.dat
referred to "backend". This was misleading because these functions report
the number of checkpoints or restartpoints requested or performed
by other than backends as well.

This commit removes "backend" from these descriptions to avoid confusion.

Bump catalog version.

Idea from Anton A. Melnikov
Author: Fujii Masao
Reviewed-by: Anton A. Melnikov
Discussion: https://postgr.es/m/8e5f353f-8b31-4a8e-9cfa-c037f22b4aee@postgrespro.ru

Avoid crash in estimate_array_length with null root pointer.

Commit 9391f7152 added a "PlannerInfo *root" parameter to
estimate_array_length, but failed to consider the possibility that
NULL would be passed for that, leading to a null pointer dereference.

We could rectify the particular case shown in the bug report by fixing
simplify_function/inline_function to pass through the root pointer.
However, as long as eval_const_expressions is documented to accept
NULL for root, similar hazards would remain. For now, let's just do
the narrow fix of hardening estimate_array_length to not crash.
Its behavior with NULL root will be the same as it was before
9391f7152, so this is not too awful.

Per report from Fredrik Widlert (via Paul Ramsey). Back-patch to v17
where 9391f7152 came in.

Discussion: https://postgr.es/m/518339E7-173E-45EC-A0FF-9A4A62AA4F40@cleverelephant.ca

Apply GUC name from central table in more places of guc.c

The name extracted from the record of the GUC tables is applied to more
internal places of guc.c. This change has the advantage to simplify
parse_and_validate_value(), where the "name" was only used in elog
messages, while it was required to match with the name from the GUC
record.

pg_parameter_aclcheck() now passes the name of the GUC from its record
in two places rather than the caller's argument. The value given to
this function goes through convert_GUC_name_for_parameter_acl() that
does a simple ASCII downcasing.

Few GUCs mix character casing in core; one test is added for one of
these code paths with "IntervalStyle".

Author: Peter Smith, Michael Paquier
Discussion: https://postgr.es/m/[email protected]

Allow pushdown of HAVING clauses with grouping sets

In some cases, we may want to transfer a HAVING clause into WHERE in
hopes of eliminating tuples before aggregation instead of after.

Previously, we couldn't do this if there were any nonempty grouping
sets, because we didn't have a way to tell if the HAVING clause
referenced any columns that were nullable by the grouping sets, and
moving such a clause into WHERE could potentially change the results.

Now, with expressions marked nullable by grouping sets with the RT
index of the RTE_GROUP RTE, it is much easier to identify those
clauses that reference any nullable-by-grouping-sets columns: we just
need to check if the RT index of the RTE_GROUP RTE is present in the
clause. For other HAVING clauses, they can be safely pushed down.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-NpzPgtKU=hgnvyn+J-GanxQCjrUi7piNzZ=upiCV=2Q@mail.gmail.com

Consider explicit incremental sort for mergejoins

For a mergejoin, if the given outer path or inner path is not already
well enough ordered, we need to do an explicit sort.  Currently, we
only consider explicit full sort and do not account for incremental
sort.

In this patch, for the outer path of a mergejoin, we choose to use
explicit incremental sort if it is enabled and there are presorted
keys.  For the inner path, though, we cannot use incremental sort
because it does not support mark/restore at present.

The rationale is based on the assumption that incremental sort is
always faster than full sort when there are presorted keys, a premise
that has been applied in various parts of the code.  In addition, the
current cost model tends to favor incremental sort as being cheaper
than full sort in the presence of presorted keys, making it reasonable
not to consider full sort in such cases.

It could be argued that what if a mergejoin with an incremental sort
as the outer path is selected as the inner path of another mergejoin.
However, this should not be a problem, because mergejoin itself does
not support mark/restore either, and we will add a Material node on
top of it anyway in this case (see final_cost_mergejoin).

There is one ensuing plan change in the regression tests, and we have
to modify that test case to ensure that it continues to test what it
is intended to.

No backpatch as this could result in plan changes.

Author: Richard Guo
Reviewed-by: David Rowley, Tomas Vondra
Discussion: https://postgr.es/m/CAMbWs49x425QrX7h=Ux05WEnt8GS757H-jOP3_xsX5t1FoUsZw@mail.gmail.com

Remove incorrect function import from pgindent

Commit 149ac7d4559 which re-implemented pgindent in Perl explicitly
imported the devnull function from File::Spec, but the module does
not export anything. In recent versions of Perl calling a missing
import function cause a warning, which combined with warnings being
fatal cause pgindent to error out.

Backpatch to all supported versions.

Author: Erik Wienhold <[email protected]>
Reviewed-by: Andrew Dunstan <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discusson: https://postgr.es/m/2372cd74-11b0-46f9-b28e-8f9627215d19@ewie.name
Backpatch-through: v12

Allow roles created by new test to log in under SSPI.

Semi-blind attempt to fix 6a1d0d470 to work on Windows,
along the same lines as a70f2a57f. Per buildfarm.

pg_stat_statements: Add columns to track parallel worker activity

The view pg_stat_statements gains two columns:
- parallel_workers_to_launch, the number of parallel workers planned to
be launched.
- parallel_workers_launched, the number of parallel workers actually
launched.

The ratio of both columns offers hints that parallel workers are lacking
on a per-statement basis, requiring some tuning, in coordination with
"calls", the number of times a query is executed.

As of now, these numbers are tracked within Gather and GatherMerge
nodes. They could be extended to utilities that make use of parallel
workers (parallel btree and brin, VACUUM).

The module is bumped to 1.12.

Author: Guillaume Lelarge
Discussion: https://postgr.es/m/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X+_dZqKw@mail.gmail.com

Introduce two fields in EState to track parallel worker activity

These fields can be set by executor nodes to record how many parallel
workers were planned to be launched and how many of them have been
actually launched within the number initially planned. This data is
able to give an approximation of the parallel worker draught a system
is facing, making easier the tuning of related configuration parameters.

These fields will be used by some follow-up patches to populate other
parts of the system with their data.

Author: Guillaume Lelarge, Benoit Lobréau
Discussion: https://postgr.es/m/783bc7f7-659a-42fa-99dd-ee0565644e25@dalibo.com
Discussion: https://postgr.es/m/CAECtzeWtTGOK0UgKXdDGpfTVSa5bd_VbUt6K6xn8P7X+_dZqKw@mail.gmail.com

Silence assorted annoying test output.

Remove unnecessary chatter about "checking if IO::Socket::UNIX works";
our tests should never print anything on stderr unless there's a
problem.

Add .gitignore entry for temporary directory now being left behind
in src/test/postmaster.

Add min and max aggregates for bytea type.

Similar to a0f1fce80, although we chose to duplicate logic
rather than invoke byteacmp, primarily to avoid repeat detoasting.

Marat Buharov, Aleksander Alekseev

Discussion: https://postgr.es/m/CAPCEVGXiASjodos4P8pgyV7ixfVn-ZgG9YyiRZRbVqbGmfuDyg@mail.gmail.com

Use aux process resource owner in walsender

AIO will need a resource owner to do IO. Right now we create a resowner
on-demand during basebackup, and we could do the same for AIO. But it seems
easier to just always create an aux process resowner.

Reviewed-by: Heikki Linnakangas <[email protected]>
Reviewed-by: Noah Misch <[email protected]>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi

bufmgr/smgr: Don't cross segment boundaries in StartReadBuffers()

With real AIO it doesn't make sense to cross segment boundaries with one
IO. Add smgrmaxcombine() to allow upper layers to query which buffers can be
merged.

We could continue to cross segment boundaries when not using AIO, but it
doesn't really make sense, because md.c will never be able to perform the read
across the segment boundary in one system call. Which means we'll mark more
buffers as undergoing IO than really makes sense - if another backend desires
to read the same blocks, it'll be blocked longer than necessary. So it seems
better to just never cross the boundary.

Reviewed-by: Heikki Linnakangas <[email protected]>
Reviewed-by: Noah Misch <[email protected]>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi

bufmgr: Return early in ScheduleBufferTagForWriteback() if fsync=off

As pg_flush_data() doesn't do anything with fsync disabled, there's no point
in tracking the buffer for writeback. Arguably the better fix would be to
change pg_flush_data() to flush data even with fsync off, but that's a
behavioral change, whereas this is just a small optimization.

Reviewed-by: Heikki Linnakangas <[email protected]>
Reviewed-by: Noah Misch <[email protected]>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi

Silence buildfarm warning chatter from bd1276a3c.

Buildfarm members using -Wextra complained about "warning: suggest
braces around empty body in an 'if' statement". Do it gcc's way,
though I see no actual readability benefit in this.

Fix typo and run pgperltidy on newly-added test

From commit 85ec945b78.

Use an shmem_exit callback to remove backend from PMChildFlags on exit

This seems nicer than having to duplicate the logic between
InitProcess() and ProcKill() for which child processes have a
PMChildFlags slot.

Move the MarkPostmasterChildActive() call earlier in InitProcess(),
out of the section protected by the spinlock.

Reviewed-by: Andres Freund <[email protected]>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi

Add test for dead-end backends

The code path for launching a dead-end backend because we're out of
slots was not covered by any tests, so add one. (Some tests did hit
the case of launching a dead-end backend because the server is still
starting up, though, so the gap in our test coverage wasn't as big as
it sounds.)

Reviewed-by: Andres Freund <[email protected]>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi

Add test for connection limits

Reviewed-by: Andres Freund <[email protected]>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi

Doc: add check to detect non-breaking spaces in the docs.

There were multiple instances where accidentally adding non-breaking
space (nbsp, U+00A0, 0xc2a0 in UTF-8) to sgml files. This commit adds
additional checking to detect nbsp. You can check the nbsp by:

make -C doc/src/sgml check

or

make -C doc/src/sgml check-nbsp

Authors: Yugo Nagata, Daniel Gustafsson
Reviewed-by: Tatsuo Ishii, Daniel Gustafsson
Discussion: https://postgr.es/m/20240930.153404.202479334310259810.ishii%40postgresql.org

Move check for binary mode and on_error option to the appropriate location.

Commit 9e2d870119 placed the check for binary mode and on_error
before default values were inserted, which was not ideal.
This commit moves the check to a more appropriate position
after default values are set.

Additionally, the comment incorrectly mentioned two checks before
inserting defaults, when there are actually three. This commit corrects
that comment.

Author: Atsushi Torikoshi
Reviewed-by: Fujii Masao
Discussion: https://postgr.es/m/8830518a-28ac-43a2-8a11-1676d9a3cdf8@oss.nttdata.com

Add REJECT_LIMIT option to the COPY command.

Previously, when ON_ERROR was set to 'ignore', the COPY command
would skip all rows with data type conversion errors, with no way to
limit the number of skipped rows before failing.

This commit introduces the REJECT_LIMIT option, allowing users to
specify the maximum number of erroneous rows that can be skipped.
If more rows encounter data type conversion errors than allowed by
REJECT_LIMIT, the COPY command will fail with an error, even when
ON_ERROR = 'ignore'.

Author: Atsushi Torikoshi
Reviewed-by: Junwang Zhao, Kirill Reshke, jian he, Fujii Masao
Discussion: https://postgr.es/m/63f99327aa6b404cc951217fa3e61fe4@oss.nttdata.com

Stabilize the test added by commit 022564f60c.

The test was unstable in branches 14 and 15 as we were relying on the
number of changes in the table having a toast column to start streaming.
On branches >= 16, we have a GUC debug_logical_replication_streaming which
can stream each change, so the test was stable in those branches.

Change the test to use PREPARE TRANSACTION as that should make the result
consistent and test the code changed in 022564f60c.

Reported-by: Daniel Gustafsson as per buildfarm
Author: Hou Zhijie, Amit Kapila
Backpatch-through: 14
Discussion: https://postgr.es/m/8C2F86AA-981E-4803-B14D-E264C0255330@yesql.se

Improve style of two code paths

In execGrouping.c, execTuplesMatchPrepare() was doing a memory
allocation that was not necessary when the number of columns was 0.
In foreign.c, pg_options_to_table() was assigning twice a variable to
the same value.

Author: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAqup0agbSzMjSLSTn=OANyCzxENF1+HrSYnr3WyZib7=Q@mail.gmail.com

Fix search_path cache initialization.

The cache needs to be available very early, so don't rely on
InitializeSearchPath() to initialize the it.

Reported-by: Murat Efendioğlu
Discussion: https://postgr.es/m/CACbCzujQ4zS8MM1bx-==+tr+D3Hk5G1cjN4XkUQ+Q=cEpwhzqg@mail.gmail.com
Backpatch-through: 17

Fix test for password hash length limit.

In commit 8275325a06, I forgot to update password_1.out (an
alternative expected test output file added by commit 3c44e7d8d4),
so this test began failing on machines with FIPS mode enabled.

vacuumdb: Schema-qualify operator in catalog query's WHERE clause.

Commit 1ab67c9dfa, which modified this catalog query so that it
doesn't return temporary relations, forgot to schema-qualify the
operator. A comment earlier in the function implores us to fully
qualify everything in the query:

* Since we execute the constructed query with the default search_path
* (which could be unsafe), everything in this query MUST be fully
* qualified.

This commit fixes that. While at it, add a newline for consistency
with surrounding code.

Reviewed-by: Noah Misch
Discussion: https://postgr.es/m/ZwQJYcuPPUsF0reU%40nathan
Backpatch-through: 12

Fix Y2038 issues with MyStartTime.

Several places treat MyStartTime as a "long", which is only 32 bits
wide on some platforms.  In reality, MyStartTime is a pg_time_t,
i.e., a signed 64-bit integer.  This will lead to interesting bugs
on the aforementioned systems in 2038 when signed 32-bit integers
are no longer sufficient to store Unix time (e.g., "pg_ctl start"
hanging).  To fix, ensure that MyStartTime is handled as a 64-bit
value everywhere.  (Of course, users will need to ensure that
time_t is 64 bits wide on their system, too.)

Co-authored-by: Max Johnson
Discussion: https://postgr.es/m/CO1PR07MB905262E8AC270FAAACED66008D682%40CO1PR07MB9052.namprd07.prod.outlook.com
Backpatch-through: 12

Convert tab-complete's long else-if chain to a switch statement.

Rename tab-complete.c to tab-complete.in.c, create the preprocessor
script gen_tabcomplete.pl, and install Makefile/meson.build rules
to create tab-complete.c from tab-complete.in.c.  The preprocessor
converts match_previous_words' else-if chain into a switch and
populates tcpatterns[] with the data needed by the driver loop.

The initial HeadMatches/TailMatches/Matches test in each else-if arm
is now performed in a table-driven loop.  Where we get a match, the
corresponding switch case is invoked to see if the match succeeds.
(It might not, if there were additional conditions in the original
else-if test.)

The total number of string comparisons done is just about the
same as it was in the previous coding; however, now that we
have table-driven logic underlying the handmade rules, there
is room to improve that.  For now I haven't bothered because
tab completion is still plenty fast enough for human use.
If the number of rules keeps increasing, we might someday
need to do more in that area.

The immediate benefit of all this thrashing is that C compilers
frequently don't deal well with long else-if chains.  On gcc 8.5.0,
this reduces the compile time of tab-complete.c by about a factor of
four, while MSVC is reported to crash outright with the previous
coding.

Discussion: https://postgr.es/m/2208466.1720729502@sss.pgh.pa.us

Prepare tab-complete.c for preprocessing.

Separate out psql_completion's giant else-if chain of *Matches
tests into a new function.  Add the infrastructure needed for
table-driven checking of the initial match of each completion
rule.  As-is, however, the code continues to operate as it did.
The new behavior applies only if SWITCH_CONVERSION_APPLIED
is #defined, which it is not here.  (The preprocessor added
in the next patch will add a #define for that.)

The first and last couple of bits of psql_completion are not
based on HeadMatches/TailMatches/Matches tests, so they stay
where they are; they won't become part of the switch.

This patch also fixes up a couple of if-conditions that didn't meet
the conditions enumerated in the comment for match_previous_words().
Those restrictions exist to simplify the preprocessor.

Discussion: https://postgr.es/m/2208466.1720729502@sss.pgh.pa.us

Invent "MatchAnyN" option for tab-complete.c's Matches/MatchesCS.

This argument matches any number (including zero) of previous words.
Use it to replace the common coding pattern

if (HeadMatches("A", "B") && TailMatches("X", "Y"))

with

if (Matches("A", "B", MatchAnyN, "X", "Y"))

In itself this feature doesn't do much except (arguably) make the
code slightly shorter and more readable.  However, it reduces the
number of complex if-condition patterns that have to be dealt with
in the next commits in this series.

While here, restructure the *Matches implementation functions so
that the actual work is done in functions that take a char **
array of pattern strings, and the versions taking variadic arguments
are thin wrappers around the array ones.  This simplifies the
new Matches logic considerably.  At the end of this patch series,
the array functions will be the only ones that are material to
performance, so having the variadic ones be wrappers makes sense.

Discussion: https://postgr.es/m/2208466.1720729502@sss.pgh.pa.us

Restrict password hash length.

Commit 6aa44060a3 removed pg_authid's TOAST table because the only
varlena column is rolpassword, which cannot be de-TOASTed during
authentication because we haven't selected a database yet and
cannot read pg_class.  Since that change, attempts to set password
hashes that require out-of-line storage will fail with a "row is
too big" error.  This error message might be confusing to users.

This commit places a limit on the length of password hashes so that
attempts to set long password hashes will fail with a more
user-friendly error.  The chosen limit of 512 bytes should be
sufficient to avoid "row is too big" errors independent of BLCKSZ,
but it should also be lenient enough for all reasonable use-cases
(or at least all the use-cases we could imagine).

Reviewed-by: Tom Lane, Jonathan Katz, Michael Paquier, Jacob Champion
Discussion: https://postgr.es/m/89e8649c-eb74-db25-7945-6d6b23992394%40gmail.com

Fix fetching default toast value during decoding of in-progress transactions.

During logical decoding of in-progress transactions, we perform the toast
table scan while fetching the default toast value for an attribute. We
forgot to initialize the flag during this scan to indicate that the system
table scan is in progress. We need this flag to ensure that during logical
decoding we never directly access the tableam or heap APIs because we check
for concurrent aborts only in systable_* APIs.

Reported-by: Alexander Lakhin
Author: Takeshi Ideriha, Hou Zhijie
Reviewed-by: Amit Kapila, Hou Zhijie
Backpatch-through: 14
Discussion: https://postgr.es/m/18641-6687273b7f15269d@postgresql.org

doc: Quote value in SET NAMES documentation

The value passed to SET NAMES should be wrapped in single quotes.

Reported-by: jian he <[email protected]>
Discussion: https://postgr.es/m/CACJufxG3EoUsbX4ZoMFkWrvBJcSCbPjdpRvPhuQN65fADc3mFg@mail.gmail.com

doc: Add minimal C and SQL example to add a custom table AM handler

The documentation was rather sparse on this matter and there is no
extension in-core that shows how to do it. Adding a small example will
hopefully help newcomers. An advantage of writing things this way is
that the contents are not going to rot because of backend changes.

Author: Phil Eaton
Reviewed-by: Robert Haas, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CAByiw+r+CS-ojBDP7Dm=9YeOLkZTXVnBmOe_ajK=en8C_zB3_g@mail.gmail.com

Use camel case for "DateStyle" in some error messages

This GUC is written as camel-case in most of the documentation and the
GUC table (but not postgresql.conf.sample), and two error messages
hardcoded it with lower case characters. Let's use a style more
consistent.

Most of the noise comes from the regression tests, updated to reflect
the GUC name in these error messages.

Author: Peter Smith
Reviewed-by: Peter Eisentraut, Álvaro Herrera
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com

Ignore not-yet-defined Portals in pg_cursors view.

pg_cursor() supposed that any Portal it finds in the hash table must
have sourceText set up, but there's an edge case where that is not so.
A newly-created Portal has sourceText = NULL, and that doesn't change
until PortalDefineQuery is called.  In SPI_cursor_open_internal,
we perform GetCachedPlan between CreatePortal and PortalDefineQuery,
and it's possible for user-defined code to execute during that
planning and cause a fetch from the pg_cursors view, resulting in a
null-pointer-dereference crash.  (It looks like the same could happen
in exec_bind_message, but I've not tried to provoke a failure there.)

I considered trying to fix this by setting sourceText sooner, but
there may be instances of this same calling pattern in extensions,
and we couldn't be sure they'd get the memo promptly.  It seems
better to redefine pg_cursor as not showing Portals that have
not yet had PortalDefineQuery called on them, which we can do by
just skipping them if sourceText is still NULL.

(Before a1c692358, pg_cursor would instead return a row with NULL
in the statement column.  We could revert to that behavior but it
doesn't really seem like a better definition, especially since our
documentation doesn't suggest that the column could be NULL.)

Per report from PetSerAl.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAKygsHTBXLXjwV43kpZa+Cs+XTiaeeJiZdL4cPBm9f4MTdw7wg@mail.gmail.com

Move Cluster.pm initialization code to a more obvious place

Commit 460c0076e8 added some module intialization code to set signal
handlers. However, that code has now become somewhat buried, as later
commits added new subroutines. Therefore, move the initialization code
to the module's INIT block where it won't become obscured.

libpq: Discard leading and trailing spaces for parameters and values in URIs

Integer values applied a parsing rule through pqParseIntParam() that
made URIs like this one working, even if these include spaces around
values:
"postgresql://localhost:5432/postgres?keepalives=1 &keepalives_idle=1 "

This commit changes the parsing so as spaces before and after parameters
and values are discarded, offering more consistency with the parsing
that already applied to libpq for integer values in URIs.

Note that %20 can be used in a URI for a space character. ECPGconnect()
has been discarded leading and trailing spaces around parameters and
values that for a long time, as well. Like f22e84df1dea, this is done
as a HEAD-only change.

Reviewed-by: Yuto Sasaki
Discussion: https://postgr.es/m/[email protected]

Use generateClonedIndexStmt to propagate CREATE INDEX to partitions.

When instantiating an existing partitioned index for a new child
partition, we use generateClonedIndexStmt to build a suitable
IndexStmt to pass to DefineIndex.  However, when DefineIndex needs
to recurse to instantiate a newly created partitioned index on an
existing child partition, it was doing copyObject on the given
IndexStmt and then applying a bunch of ad-hoc fixups.  This has
a number of problems, primarily that it implies fresh lookups of
referenced objects such as opclasses and collations.  Since commit
2af07e2f7 caused DefineIndex to restrict search_path internally, those
lookups could fail or deliver different results than the original one.
We can avoid those problems and save a few dozen lines of code by
using generateClonedIndexStmt in this code path too.

Another thing this fixes is incorrect propagation of parent-index
comments to child indexes (because the copyObject approach copies
the idxcomment field while generateClonedIndexStmt doesn't).  I had
noticed this in connection with commit c01eb619a, but not run the
problem to ground.

I'm tempted to back-patch this further than v17, but the only thing
it's known to fix in older branches is the comment issue, which is
pretty minor and doesn't seem worth the risk of introducing new
issues in stable branches.  (If anyone does care about that,
clearing idxcomment in the copied IndexStmt would be a safer fix.)

Per bug #18637 from usamoi.  Back-patch to v17 where the search_path
change came in.

Discussion: https://postgr.es/m/18637-f51e314546e3ba2a@postgresql.org