summaryrefslogtreecommitdiff
path: root/src/backend/replication
AgeCommit message (Collapse)Author
2021-01-25Fix two typos in snapbuild.c.Andres Freund
Reported-by: Heikki Linnakangas <[email protected]> Discussion: https://postgr.es/m/[email protected]
2021-01-25Remove CheckpointLock.Robert Haas
Up until now, we've held this lock when performing a checkpoint or restartpoint, but commit 076a055acf3c55314de267c62b03191586d79cf6 back in 2004 and commit 7e48b77b1cebb9a43f9fdd6b17128a0ba36132f9 from 2009, taken together, have removed all need for this. In the present code, there's only ever one process entitled to attempt a checkpoint: either the checkpointer, during normal operation, or the postmaster, during single-user operation. So, we don't need the lock. One possible concern in making this change is that it means that a substantial amount of code where HOLD_INTERRUPTS() was previously in effect due to the preceding LWLockAcquire() will now be running without that. This could mean that ProcessInterrupts() gets called in places from which it didn't before. However, this seems unlikely to do very much, because the checkpointer doesn't have any signal mapped to die(), so it's not clear how, for example, ProcDiePending = true could happen in the first place. Similarly with ClientConnectionLost and recovery conflicts. Also, if there are any such problems, we might want to fix them rather than reverting this, since running lots of code with interrupt handling suspended is generally bad. Patch by me, per an inquiry by Amul Sul. Review by Tom Lane and Michael Paquier. Discussion: http://postgr.es/m/CAAJ_b97XnBBfYeSREDJorFsyoD1sHgqnNuCi=02mNQBUMnA=FA@mail.gmail.com
2021-01-25Fix ALTER PUBLICATION...DROP TABLE behavior.Amit Kapila
Commit 69bd60672 fixed the initialization of streamed transactions for RelationSyncEntry. It forgot to initialize the publication actions while invalidating the RelationSyncEntry due to which even though the relation is dropped from a particular publication we still publish its changes. Fix it by initializing pubactions when entry got invalidated. Author: Japin Li and Bharath Rupireddy Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CALj2ACV+0UFpcZs5czYgBpujM9p0Hg1qdOZai_43OU7bqHU_xw@mail.gmail.com
2021-01-19pgindent worker.c.Amit Kapila
This is a leftover from commit 0926e96c49. Changing this separately because this file is being modified for upcoming patch logical replication of 2PC. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+Ps+EgG8KzcmAyAgBUi_vuTps6o9ZA8DG6SdnO0-YuOhPQ@mail.gmail.com
2021-01-18Narrow the scope of a local variable.Tom Lane
This is better style and more symmetrical with the other if-branch. This likely should have been included in 9de77b545 (which created the opportunity), but it was overlooked. Japin Li Discussion: https://postgr.es/m/MEYP282MB16699FA4A7CD57EB250E871FB6A40@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
2021-01-14Ensure that a standby is able to follow a primary on a newer timeline.Fujii Masao
Commit 709d003fbd refactored WAL-reading code, but accidentally caused WalSndSegmentOpen() to fail to follow a timeline switch while reading from a historic timeline. This issue caused a standby to fail to follow a primary on a newer timeline when WAL archiving is enabled. If there is a timeline switch within the segment, WalSndSegmentOpen() should read from the WAL segment belonging to the new timeline. But previously since it failed to follow a timeline switch, it tried to read the WAL segment with old timeline. When WAL archiving is enabled, that WAL segment with old timeline doesn't exist because it's renamed to .partial. This leads a primary to have tried to read non-existent WAL segment, and which caused replication to faill with the error "ERROR: requested WAL segment ... has already been removed". This commit fixes WalSndSegmentOpen() so that it's able to follow a timeline switch, to ensure that a standby is able to follow a primary on a newer timeline even when WAL archiving is enabled. This commit also adds the regression test to check whether a standby is able to follow a primary on a newer timeline when WAL archiving is enabled. Back-patch to v13 where the bug was introduced. Reported-by: Kyotaro Horiguchi Author: Kyotaro Horiguchi, tweaked by Fujii Masao Reviewed-by: Alvaro Herrera, Fujii Masao Discussion: https://postgr.es/m/[email protected]
2021-01-14Rework refactoring of hex and encoding routinesMichael Paquier
This commit addresses some issues with c3826f83 that moved the hex decoding routine to src/common/: - The decoding function lacked overflow checks, so when used for security-related features it was an open door to out-of-bound writes if not carefully used that could remain undetected. Like the base64 routines already in src/common/ used by SCRAM, this routine is reworked to check for overflows by having the size of the destination buffer passed as argument, with overflows checked before doing any writes. - The encoding routine was missing. This is moved to src/common/ and it gains the same overflow checks as the decoding part. On failure, the hex routines of src/common/ issue an error as per the discussion done to make them usable by frontend tools, but not by shared libraries. Note that this is why ECPG is left out of this commit, and it still includes a duplicated logic doing hex encoding and decoding. While on it, this commit uses better variable names for the source and destination buffers in the existing escape and base64 routines in encode.c and it makes them more robust to overflow detection. The previous core code issued a FATAL after doing out-of-bound writes if going through the SQL functions, which would be enough to detect problems when working on changes that impacted this area of the code. Instead, an error is issued before doing an out-of-bound write. The hex routines were being directly called for bytea conversions and backup manifests without such sanity checks. The current calls happen to not have any problems, but careless uses of such APIs could easily lead to CVE-class bugs. Author: Bruce Momjian, Michael Paquier Reviewed-by: Sehrope Sarkuni Discussion: https://postgr.es/m/[email protected]
2021-01-13Pass down "logically unchanged index" hint.Peter Geoghegan
Add an executor aminsert() hint mechanism that informs index AMs that the incoming index tuple (the tuple that accompanies the hint) is not being inserted by execution of an SQL statement that logically modifies any of the index's key columns. The hint is received by indexes when an UPDATE takes place that does not apply an optimization like heapam's HOT (though only for indexes where all key columns are logically unchanged). Any index tuple that receives the hint on insert is expected to be a duplicate of at least one existing older version that is needed for the same logical row. Related versions will typically be stored on the same index page, at least within index AMs that apply the hint. Recognizing the difference between MVCC version churn duplicates and true logical row duplicates at the index AM level can help with cleanup of garbage index tuples. Cleanup can intelligently target tuples that are likely to be garbage, without wasting too many cycles on less promising tuples/pages (index pages with little or no version churn). This is infrastructure for an upcoming commit that will teach nbtree to perform bottom-up index deletion. No index AM actually applies the hint just yet. Author: Peter Geoghegan <[email protected]> Reviewed-By: Victor Yegorov <[email protected]> Discussion: https://postgr.es/m/CAH2-Wz=CEKFa74EScx_hFVshCOn6AA5T-ajFASTdzipdkLTNQQ@mail.gmail.com
2021-01-13Fix memory leak in SnapBuildSerialize.Amit Kapila
The memory for the snapshot was leaked while serializing it to disk during logical decoding. This memory will be freed only once walsender stops streaming the changes. This can lead to a huge memory increase when master logs Standby Snapshot too frequently say when the user is trying to create many replication slots. Reported-by: [email protected] Diagnosed-by: [email protected] Author: Amit Kapila Backpatch-through: 9.5 Discussion: https://postgr.es/m/[email protected]
2021-01-12Fix relation descriptor leak.Amit Kapila
We missed closing the relation descriptor while sending changes via the root of partitioned relations during logical replication. Author: Amit Langote and Mark Zhao Reviewed-by: Amit Kapila and Ashutosh Bapat Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2021-01-06Fix typos in decode.c and logical.c.Amit Kapila
Per report by Ajin Cherian in email: https://postgr.es/m/CAFPTHDYnRKDvzgDxoMn_CKqXA-D0MtrbyJvfvjBsO4G=UHDXkg@mail.gmail.com
2021-01-05Fix typo in origin.c.Amit Kapila
Author: Peter Smith Discussion: https://postgr.es/m/CAHut+PsReyuvww_Fn1NN_Vsv0wBP1bnzuhzRFr_2=y1nNZrG7w@mail.gmail.com
2021-01-05Fix typo in reorderbuffer.c.Amit Kapila
Author: Zhijie Hou Reviewed-by: Sawada Masahiko Discussion: https://postgr.es/m/ba88bb58aaf14284abca16aec04bf279@G08CNEXMBPEKD05.g08.fujitsu.local
2021-01-04Allow decoding at prepare time in ReorderBuffer.Amit Kapila
This patch allows PREPARE-time decoding of two-phase transactions (if the output plugin supports this capability), in which case the transactions are replayed at PREPARE and then committed later when COMMIT PREPARED arrives. Now that we decode the changes before the commit, the concurrent aborts may cause failures when the output plugin consults catalogs (both system and user-defined). We detect such failures with a special sqlerrcode ERRCODE_TRANSACTION_ROLLBACK introduced by commit 7259736a6e and stop decoding the remaining changes. Then we rollback the changes when rollback prepared is encountered. Author: Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich Reviewed-by: Amit Kapila, Peter Smith, Sawada Masahiko, Arseny Sher, and Dilip Kumar Tested-by: Takamichi Osumi Discussion: https://postgr.es/m/[email protected] https://postgr.es/m/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3+Q@mail.gmail.com
2021-01-02Update copyright for 2021Bruce Momjian
Backpatch-through: 9.5
2020-12-30Extend the output plugin API to allow decoding of prepared xacts.Amit Kapila
This adds six methods to the output plugin API, adding support for streaming changes of two-phase transactions at prepare time. * begin_prepare * filter_prepare * prepare * commit_prepared * rollback_prepared * stream_prepare Most of this is a simple extension of the existing methods, with the semantic difference that the transaction is not yet committed and maybe aborted later. Until now two-phase transactions were translated into regular transactions on the subscriber, and the GID was not forwarded to it. None of the two-phase commands were communicated to the subscriber. This patch provides the infrastructure for logical decoding plugins to be informed of two-phase commands Like PREPARE TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED commands with the corresponding GID. This also extends the 'test_decoding' plugin, implementing these new methods. This commit simply adds these new APIs and the upcoming patch to "allow the decoding at prepare time in ReorderBuffer" will use these APIs. Author: Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich Reviewed-by: Amit Kapila, Peter Smith, Sawada Masahiko, and Dilip Kumar Discussion: https://postgr.es/m/[email protected] https://postgr.es/m/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3+Q@mail.gmail.com
2020-12-28Revert "Add key management system" (978f869b99) & later commitsBruce Momjian
The patch needs test cases, reorganization, and cfbot testing. Technically reverts commits 5c31afc49d..e35b2bad1a (exclusive/inclusive) and 08db7c63f3..ccbe34139b. Reported-by: Tom Lane, Michael Paquier Discussion: https://postgr.es/m/[email protected]
2020-12-25Add key management systemBruce Momjian
This adds a key management system that stores (currently) two data encryption keys of length 128, 192, or 256 bits. The data keys are AES256 encrypted using a key encryption key, and validated via GCM cipher mode. A command to obtain the key encryption key must be specified at initdb time, and will be run at every database server start. New parameters allow a file descriptor open to the terminal to be passed. pg_upgrade support has also been added. Discussion: https://postgr.es/m/CA+fd4k7q5o6Nc_AaX6BcYM9yqTbC6_pnH-6nSD=54Zp6NBQTCQ@mail.gmail.com Discussion: https://postgr.es/m/[email protected] Author: Masahiko Sawada, me, Stephen Frost
2020-12-19Update comment atop of ReorderBufferQueueMessage().Amit Kapila
The comments atop of this function describes behaviour in case of a transactional WAL message only, but it accepts both transactional and non-transactional WAL messages. Update the comments to describe behaviour in case of non-transactional WAL message as well. Ashutosh Bapat, rephrased by Amit Kapila Discussion: https://postgr.es/m/CAGEoWWTTzNzHOi8bj0wfAo1siGi-YEh6wqH1oaz4DrkTJ6HbTQ@mail.gmail.com
2020-12-15Improve hash_create()'s API for some added robustness.Tom Lane
Invent a new flag bit HASH_STRINGS to specify C-string hashing, which was formerly the default; and add assertions insisting that exactly one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set. This is in hopes of preventing recurrences of the type of oversight fixed in commit a1b8aa1e4 (i.e., mistakenly omitting HASH_BLOBS). Also, when HASH_STRINGS is specified, insist that the keysize be more than 8 bytes. This is a heuristic, but it should catch accidental use of HASH_STRINGS for integer or pointer keys. (Nearly all existing use-cases set the keysize to NAMEDATALEN or more, so there's little reason to think this restriction should be problematic.) Tweak hash_create() to insist that the HASH_ELEM flag be set, and remove the defaults it had for keysize and entrysize. Since those defaults were undocumented and basically useless, no callers omitted HASH_ELEM anyway. Also, remove memset's zeroing the HASHCTL parameter struct from those callers that had one. This has never been really necessary, and while it wasn't a bad coding convention it was confusing that some callers did it and some did not. We might as well save a few cycles by standardizing on "not". Also improve the documentation for hash_create(). In passing, improve reinit.c's usage of a hash table by storing the key as a binary Oid rather than a string; and, since that's a temporary hash table, allocate it in CurrentMemoryContext for neatness. Discussion: https://postgr.es/m/[email protected]
2020-12-15Revert "Cannot use WL_SOCKET_WRITEABLE without WL_SOCKET_READABLE."Jeff Davis
This reverts commit 3a9e64aa0d96c8ffb6c682b082d0f72b1d373327. Commit 4bad60e3 fixed the root of the problem that 3a9e64aa worked around. This enables proper pipelining of commands after terminating replication, eliminating an undocumented limitation. Discussion: https://postgr.es/m/3d57bc29-4459-578b-79cb-7641baf53c57%40iki.fi Backpatch-through: 9.5
2020-12-13Use HASH_BLOBS for xidhash.Noah Misch
This caused BufFile errors on buildfarm member sungazer, and SIGSEGV was possible. Conditions for reaching those symptoms were more frequent on big-endian systems. Discussion: https://postgr.es/m/[email protected]
2020-12-13Correct behavior descriptions in comments, and correct a test name.Noah Misch
2020-12-04Convert elog(LOG) calls to ereport() where appropriatePeter Eisentraut
User-visible log messages should go through ereport(), so they are subject to translation. Many remaining elog(LOG) calls are really debugging calls. Reviewed-by: Alvaro Herrera <[email protected]> Reviewed-by: Michael Paquier <[email protected]> Reviewed-by: Noah Misch <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/92d6f545-5102-65d8-3c87-489f71ea0a37%40enterprisedb.com
2020-12-04Remove incorrect assertion in reorderbuffer.c.Amit Kapila
We start recording changes in ReorderBufferTXN even before we reach SNAPBUILD_CONSISTENT state so that if the commit is encountered after reaching that we should be able to send the changes of the entire transaction. Now, while recording changes if the reorder buffer memory has exceeded logical_decoding_work_mem then we can start streaming if it is allowed and we haven't yet streamed that data. However, we must not allow streaming to start unless the snapshot has reached SNAPBUILD_CONSISTENT state. In passing, improve the comments atop ReorderBufferResetTXN to mention the case when we need to continue streaming after getting an error. Author: Amit Kapila Reviewed-by: Dilip Kumar Discussion: https://postgr.es/m/CAA4eK1KoOH0byboyYY40NBcC7Fe812trwTa+WY3jQF7WQWZbQg@mail.gmail.com
2020-12-04Change SHA2 implementation based on OpenSSL to use EVP digest routinesMichael Paquier
The use of low-level hash routines is not recommended by upstream OpenSSL since 2000, and pgcrypto already switched to EVP as of 5ff4a67. This takes advantage of the refactoring done in 87ae969 that has introduced the allocation and free routines for cryptographic hashes. Since 1.1.0, OpenSSL does not publish the contents of the cryptohash contexts, forcing any consumers to rely on OpenSSL for all allocations. Hence, the resource owner callback mechanism gains a new set of routines to track and free cryptohash contexts when using OpenSSL, preventing any risks of leaks in the backend. Nothing is needed in the frontend thanks to the refactoring of 87ae969, and the resowner knowledge is isolated into cryptohash_openssl.c. Note that this also fixes a failure with SCRAM authentication when using FIPS in OpenSSL, but as there have been few complaints about this problem and as this causes an ABI breakage, no backpatch is done. Author: Michael Paquier Reviewed-by: Daniel Gustafsson, Heikki Linnakangas Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2020-12-02Move SHA2 routines to a new generic API layer for crypto hashesMichael Paquier
Two new routines to allocate a hash context and to free it are created, as these become necessary for the goal behind this refactoring: switch the all cryptohash implementations for OpenSSL to use EVP (for FIPS and also because upstream does not recommend the use of low-level cryptohash functions for 20 years). Note that OpenSSL hides the internals of cryptohash contexts since 1.1.0, so it is necessary to leave the allocation to OpenSSL itself, explaining the need for those two new routines. This part is going to require more work to properly track hash contexts with resource owners, but this not introduced here. Still, this refactoring makes the move possible. This reduces the number of routines for all SHA2 implementations from twelve (SHA{224,256,386,512} with init, update and final calls) to five (create, free, init, update and final calls) by incorporating the hash type directly into the hash context data. The new cryptohash routines are moved to a new file, called cryptohash.c for the fallback implementations, with SHA2 specifics becoming a part internal to src/common/. OpenSSL specifics are part of cryptohash_openssl.c. This infrastructure is usable for more hash types, like MD5 or HMAC. Any code paths using the internal SHA2 routines are adapted to report correctly errors, which are most of the changes of this commit. The zones mostly impacted are checksum manifests, libpq and SCRAM. Note that e21cbb4 was a first attempt to switch SHA2 to EVP, but it lacked the refactoring needed for libpq, as done here. This patch has been tested on Linux and Windows, with and without OpenSSL, and down to 1.0.1, the oldest version supported on HEAD. Author: Michael Paquier Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/[email protected]
2020-11-27Fix replication of in-progress transactions in tablesync worker.Amit Kapila
Tablesync worker runs under a single transaction but in streaming mode, we were committing the transaction on stream_stop, stream_abort, and stream_commit. We need to avoid committing the transaction in a streaming mode in tablesync worker. In passing move the call to process_syncing_tables in apply_handle_stream_commit after clean up of stream files. This will allow clean up of files to happen before the exit of tablesync worker which would otherwise be handled by one of the proc exit routines. Author: Dilip Kumar Reviewed-by: Amit Kapila and Peter Smith Tested-by: Peter Smith Discussion: https://postgr.es/m/CAHut+Pt4PyKQCwqzQ=EFF=bpKKJD7XKt_S23F6L20ayQNxg77A@mail.gmail.com
2020-11-26Restore lock level to update statusFlagsAlvaro Herrera
Reverts 27838981be9d (some comments are kept). Per discussion, it does not seem safe to relax the lock level used for this; in order for it to be safe, there would have to be memory barriers between the point we set the flag and the point we set the trasaction Xid, which perhaps would not be so bad; but there would also have to be barriers at the readers' side, which from a performance perspective might be bad. Now maybe this analysis is wrong and it *is* safe for some reason, but proof of that is not trivial. Discussion: https://postgr.es/m/[email protected]
2020-11-26Use Enums for logical replication message types at more places.Amit Kapila
Commit 644f0d7cc9 added logical replication message type enums to use instead of character literals but some char substitutions were overlooked. Author: Peter Smith Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAHut+PsTG=Vrv8hgrvOnAvCNR21jhqMdPk2n0a1uJPoW0p+UfQ@mail.gmail.com
2020-11-24Centralize logic for skipping useless ereport/elog calls.Tom Lane
While ereport() and elog() themselves are quite cheap when the error message level is too low to be printed, some places need to do substantial work before they can call those macros at all. To allow optimizing away such setup work when nothing is to be printed, make elog.c export a new function message_level_is_interesting(elevel) that reports whether ereport/elog will do anything. Make use of that in various places that had ad-hoc direct tests of log_min_messages etc. Also teach ProcSleep to use it to avoid some work. (There may well be other places that could usefully use this; I didn't search hard.) Within elog.c, refactor a little bit to avoid having duplicate copies of the policy-setting logic. When that code was written, we weren't relying on the availability of inline functions; so it had some duplications in the name of efficiency, which I got rid of. Alvaro Herrera and Tom Lane Discussion: https://postgr.es/m/[email protected]
2020-11-23Split copy.c into four files.Heikki Linnakangas
Copy.c has grown really large. Split it into more manageable parts: - copy.c now contains only a few functions that are common to COPY FROM and COPY TO. - copyto.c contains code for COPY TO. - copyfrom.c contains code for initializing COPY FROM, and inserting the tuples to the correct table. - copyfromparse.c contains code for reading from the client/file/program, and parsing the input text/CSV/binary format into tuples. All of these parts are fairly complicated, and fairly independent of each other. There is a patch being discussed to implement parallel COPY FROM, which will add a lot of new code to the COPY FROM path, and another patch which would allow INSERTs to use the same multi-insert machinery as COPY FROM, both of which will require refactoring that code. With those two patches, there's going to be a lot of code churn in copy.c anyway, so now seems like a good time to do this refactoring. The CopyStateData struct is also split. All the formatting options, like FORMAT, QUOTE, ESCAPE, are put in a new CopyFormatOption struct, which is used by both COPY FROM and TO. Other state data are kept in separate CopyFromStateData and CopyToStateData structs. Reviewed-by: Soumyadeep Chakraborty, Erik Rijkers, Vignesh C, Andres Freund Discussion: https://www.postgresql.org/message-id/8e15b560-f387-7acc-ac90-763986617bfb%40iki.fi
2020-11-18Relax lock level for setting PGPROC->statusFlagsAlvaro Herrera
We don't actually need a lock to set PGPROC->statusFlags itself; what we do need is a shared lock on either XidGenLock or ProcArrayLock in order to ensure MyProc->pgxactoff keeps still while we modify the mirror array in ProcGlobal->statusFlags. Some places were using an exclusive lock for that, which is excessive. Relax those to use shared lock only. procarray.c has a couple of places with somewhat brittle assumptions about PGPROC changes: ProcArrayEndTransaction uses only shared lock, so it's permissible to change MyProc only. On the other hand, ProcArrayEndTransactionInternal also changes other procs, so it must hold exclusive lock. Add asserts to ensure those assumptions continue to hold. Author: Álvaro Herrera <[email protected]> Reviewed-by: Michael Paquier <[email protected]> Discussion: https://postgr.es/m/[email protected]
2020-11-17Fix 'skip-empty-xacts' option in test_decoding for streaming mode.Amit Kapila
In streaming mode, the transaction can be decoded in multiple streams and those streams can be interleaved with streams of other transactions. So, we can't remember the transaction's write status in the logical decoding context because that might get changed due to some other transactions and lead to wrong answers for 'skip-empty-xacts' option. We decided to keep each transaction's write status in the ReorderBufferTxn to avoid interleaved streams changing the status of some unrelated transactions. Diagnosed-by: Amit Kapila Author: Dilip Kumar Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/CAA4eK1LR7=XNM_TLmpZMFuV8ZQpoxkem--NZJYf8YXmesbvwLA@mail.gmail.com
2020-11-16Rename PGPROC->vacuumFlags to statusFlagsAlvaro Herrera
With more flags associated to a PGPROC entry that are not related to vacuum (currently existing or planned), the name "statusFlags" describes its purpose better. (The same is done to the mirroring PROC_HDR->vacuumFlags.) No functional changes in this commit. This was suggested first by Hari Babu Kommi in [1] and then by Michael Paquier at [2]. [1] https://postgr.es/m/CAJrrPGcsDC-oy1AhqH0JkXYa0Z2AgbuXzHPpByLoBGMxfOZMEQ@mail.gmail.com [2] https://postgr.es/m/[email protected] Author: Dmitry Dolgov <[email protected]> Reviewed-by: Álvaro Herrera <[email protected]> Discussion: https://postgr.es/m/20201116182446.qcg3o6szo2zookyr@localhost
2020-11-12change wire protocol data type for history file contentBruce Momjian
This was marked as BYTEA, but is more like TEXT, which is how we already pass the history timeline file name. Internally, we don't do any encoding or bytea escape handling, but TEXT seems closest. This should cause no behavioral change. Reported-by: Brar Piening Discussion: https://postgr.es/m/[email protected] Backpatch-through: master
2020-11-12Use standard SIGHUP and SIGTERM handlers in walreceiver.Fujii Masao
Commit 1e53fe0e70 changed background processes so that they use standard SIGHUP handler. Like that, this commit makes walreceiver use standard SIGHUP and SIGTERM handlers, to simplify the code. As the side effect of this commit, walreceiver can wake up and process the configuration files promptly when receiving SIGHUP. Because the standard SIGHUP handler sets the latch. On the other hand, previously there could be a time lag between the receipt of SIGHUP and the process of configuration files since the dedicated handler didn't set the latch. Author: Bharath Rupireddy, tweaked by Fujii Masao Reviewed-by: Kyotaro Horiguchi, Fujii Masao Discussion: https://postgr.es/m/CALj2ACXPorUqePswDtOeM_s82v9RW32E1fYmOPZ5NuE+TWKj_A@mail.gmail.com
2020-11-12Remove useless SHA256 initialization when not using backup manifestsMichael Paquier
Attempting to take a base backup with Postgres linking to a build of OpenSSL with FIPS enabled currently fails with or even without a backup manifest requested because of this mandatory SHA256 initialization used for the manifest file itself. However, there is no need to do this initialization at all if backup manifests are not needed because there is no data to append to the manifest. Note that being able to use backup manifests with OpenSSL+FIPS requires a switch of the SHA2 implementation to use EVP, which would cause an ABI breakage so this cannot be backpatched to 13 as it has been already released, but at least avoiding this SHA256 initialization gives users the possibility to take a base backup even when specifying --no-manifest with pg_basebackup. Author: Michael Paquier Discussion: https://postgr.es/m/[email protected] Backpatch-through: 13
2020-11-11Fix and simplify some usages of TimestampDifference().Tom Lane
Introduce TimestampDifferenceMilliseconds() to simplify callers that would rather have the difference in milliseconds, instead of the select()-oriented seconds-and-microseconds format. This gets rid of at least one integer division per call, and it eliminates some apparently-easy-to-mess-up arithmetic. Two of these call sites were in fact wrong: * pg_prewarm's autoprewarm_main() forgot to multiply the seconds by 1000, thus ending up with a delay 1000X shorter than intended. That doesn't quite make it a busy-wait, but close. * postgres_fdw's pgfdw_get_cleanup_result() thought it needed to compute microseconds not milliseconds, thus ending up with a delay 1000X longer than intended. Somebody along the way had noticed this problem but misdiagnosed the cause, and imposed an ad-hoc 60-second limit rather than fixing the units. This was relatively harmless in context, because we don't care that much about exactly how long this delay is; still, it's wrong. There are a few more callers of TimestampDifference() that don't have a direct need for seconds-and-microseconds, but can't use TimestampDifferenceMilliseconds() either because they do need microsecond precision or because they might possibly deal with intervals long enough to overflow 32-bit milliseconds. It might be worth inventing another API to improve that, but that seems outside the scope of this patch; so those callers are untouched here. Given the fact that we are fixing some bugs, and the likelihood that future patches might want to back-patch code that uses this new API, back-patch to all supported branches. Alexey Kondratov and Tom Lane Discussion: https://postgr.es/m/[email protected]
2020-11-07Move catalog index declarationsPeter Eisentraut
Move the system catalog index declarations from catalog/indexing.h to the respective parent tables' catalog/pg_*.h files. The original reason for having it split was that the old genbki system produced the output in the order of the catalog files it read, so all the indexing stuff needed to come separately. But this is no longer the case, and keeping it together makes more sense. Reviewed-by: John Naylor <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/[email protected]
2020-11-02Use Enum for top level logical replication message types.Amit Kapila
Logical replication protocol uses a single byte character to identify a message type in logical replication protocol. The code uses string literals for the same. Use Enum so that 1. All the string literals used can be found at a single place. This makes it easy to add more types without the risk of conflicts. 2. It's easy to locate the code handling a given message type. 3. When used with switch statements, it is easy to identify the missing cases using -Wswitch. Author: Ashutosh Bapat Reviewed-by: Kyotaro Horiguchi, Andres Freund, Peter Smith and Amit Kapila Discussion: https://postgr.es/m/CAExHW5uPzQ7L0oAd_ENyvaiYMOPgkrAoJpE+ZY5-obdcVT6NPg@mail.gmail.com
2020-10-29Track statistics for streaming of changes from ReorderBuffer.Amit Kapila
This adds the statistics about transactions streamed to the decoding output plugin from ReorderBuffer. Users can query the pg_stat_replication_slots view to check these stats and call pg_stat_reset_replication_slot to reset the stats of a particular slot. Users can pass NULL in pg_stat_reset_replication_slot to reset stats of all the slots. Commit 9868167500 has added the basic infrastructure to capture the stats of slot and this commit extends the statistics collector to track additional information about slots. Bump the catversion as we have added new columns in the catalog entry. Author: Ajin Cherian and Amit Kapila Reviewed-by: Sawada Masahiko and Dilip Kumar Discussion: https://postgr.es/m/CAA4eK1+chpEomLzgSoky-D31qev19AmECNiEAietPQUGEFhtVA@mail.gmail.com
2020-10-28Calculate extraUpdatedCols in query rewriter, not parser.Tom Lane
It's unsafe to do this at parse time because addition of generated columns to a table would not invalidate stored rules containing UPDATEs on the table ... but there might now be dependent generated columns that were not there when the rule was made. This also fixes an oversight that rewriteTargetView failed to update extraUpdatedCols when transforming an UPDATE on an updatable view. (Since the new calculation is downstream of that, rewriteTargetView doesn't actually need to do anything; but before, there was a demonstrable bug there.) In v13 and HEAD, this leads to easily-visible bugs because (since commit c6679e4fc) we won't recalculate generated columns that aren't listed in extraUpdatedCols. In v12 this bitmap is mostly just used for trigger-firing decisions, so you'd only notice a problem if a trigger cared whether a generated column had been updated. I'd complained about this back in May, but then forgot about it until bug #16671 from Michael Paul Killian revived the issue. Back-patch to v12 where this field was introduced. If existing stored rules contain any extraUpdatedCols values, they'll be ignored because the rewriter will overwrite them, so the bug will be fixed even for existing rules. (But note that if someone were to update to 13.1 or 12.5, store some rules with UPDATEs on tables having generated columns, and then downgrade to a prior minor version, they might observe issues similar to what this patch fixes. That seems unlikely enough to not be worth going to a lot of effort to fix.) Discussion: https://postgr.es/m/[email protected] Discussion: https://postgr.es/m/[email protected]
2020-10-19Remove PartitionRoutingInfo struct.Heikki Linnakangas
The extra indirection neeeded to access its members via its enclosing ResultRelInfo seems pointless. Move all the fields from PartitionRoutingInfo to ResultRelInfo. Author: Amit Langote Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/CA%2BHiwqFViT47Zbr_ASBejiK7iDG8%3DQ1swQ-tjM6caRPQ67pT%3Dw%40mail.gmail.com
2020-10-15Review logical replication tablesync codeAlvaro Herrera
Most importantly, remove optimization in LogicalRepSyncTableStart that skips the normal walrcv_startstreaming/endstreaming dance. The optimization is not critically important for production uses anyway, since it only fires in cases with no activity, and saves an uninteresting amount of work even then. Critically, it obscures bugs by hiding the interesting code path from test cases. Also: in GetSubscriptionRelState, remove pointless relation open; access pg_subscription_rel->srsubstate with GETSTRUCT as is typical rather than SysCacheGetAttr; remove unused 'missing_ok' argument. In wait_for_relation_state_change, use explicit catalog snapshot invalidation rather than obscurely (and expensively) through GetLatestSnapshot. In various places: sprinkle comments more liberally and rewrite a number of them. Other cosmetic code improvements. No backpatch, since no bug is being fixed here. Author: Álvaro Herrera <[email protected]> Reviewed-by: Petr Jelínek <[email protected]> Discussion: https://postgr.es/m/[email protected]
2020-10-15Fixup some appendStringInfo and appendPQExpBuffer callsDavid Rowley
A number of places were using appendStringInfo() when they could have been using appendStringInfoString() instead. While there's no functionality change there, it's just more efficient to use appendStringInfoString() when no formatting is required. Likewise for some appendStringInfoString() calls which were just appending a single char. We can just use appendStringInfoChar() for that. Additionally, many places were using appendPQExpBuffer() when they could have used appendPQExpBufferStr(). Change those too. Patch by Zhijie Hou, but further searching by me found significantly more places that deserved the same treatment. Author: Zhijie Hou, David Rowley Discussion: https://postgr.es/m/cb172cf4361e4c7ba7167429070979d4@G08CNEXMBPEKD05.g08.fujitsu.local
2020-10-15Execute invalidation messages for each XLOG_XACT_INVALIDATIONS messageAmit Kapila
during logical decoding. Prior to commit c55040ccd0 we have no way of knowing the invalidations before commit. So, while decoding we use to execute all the invalidations at each command end as we had no way of knowing which invalidations happened before that command. Due to this, transactions involving large amounts of DDLs use to take more time and also lead to high CPU usage. But now we know specific invalidations at each command end so we execute only required invalidations. It has been observed that decoding of a transaction containing truncation of a table with 1000 partitions would be finished in 1s whereas before this patch it used to take 4-5 minutes. Author: Dilip Kumar Reviewed-by: Amit Kapila and Keisuke Kuroda Discussion: https://postgr.es/m/CANDwggKYveEtXjXjqHA6RL3AKSHMsQyfRY6bK+NqhAWJyw8psQ@mail.gmail.com
2020-10-14Restore replication protocol's duplicate command tagsAlvaro Herrera
I removed the duplicate command tags for START_REPLICATION inadvertently in commit 07082b08cc5d, but the replication protocol requires them. The fact that the replication protocol was broken was not noticed because all our test cases use an optimized code path that exits early, failing to verify that the behavior is correct for non-optimized cases. Put them back. Also document this protocol quirk. Add a test case that shows the failure. It might still succeed even without the patch when run on a fast enough server, but it suffices to show the bug in enough cases that it would be noticed in buildfarm. Author: Álvaro Herrera <[email protected]> Reported-by: Henry Hinze <[email protected]> Reviewed-by: Petr Jelínek <[email protected]> Discussion: https://postgr.es/m/[email protected]
2020-10-14Remove es_result_relation_info from EState.Heikki Linnakangas
Maintaining 'es_result_relation_info' correctly at all times has become cumbersome, especially with partitioning where each partition gets its own result relation info. Having to set and reset it across arbitrary operations has caused bugs in the past. This changes all the places that used 'es_result_relation_info', to receive the currently active ResultRelInfo via function parameters instead. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com
2020-10-13Create ResultRelInfos later in InitPlan, index them by RT index.Heikki Linnakangas
Instead of allocating all the ResultRelInfos upfront in one big array, allocate them in ExecInitModifyTable(). es_result_relations is now an array of ResultRelInfo pointers, rather than an array of structs, and it is indexed by the RT index. This simplifies things: we get rid of the separate concept of a "result rel index", and don't need to set it in setrefs.c anymore. This also allows follow-up optimizations (not included in this commit yet) to skip initializing ResultRelInfos for target relations that were not needed at runtime, and removal of the es_result_relation_info pointer. The EState arrays of regular result rels and root result rels are merged into one array. Similarly, the resultRelations and rootResultRelations lists in PlannedStmt are merged into one. It's not actually clear to me why they were kept separate in the first place, but now that the es_result_relations array is indexed by RT index, it certainly seems pointless. The PlannedStmt->resultRelations list is now only needed for ExecRelationIsTargetRelation(). One visible effect of this change is that ExecRelationIsTargetRelation() will now return 'true' also for the partition root, if a partitioned table is updated. That seems like a good thing, although the function isn't used in core code, and I don't see any reason for an FDW to call it on a partition root. Author: Amit Langote Discussion: https://www.postgresql.org/message-id/CA%2BHiwqGEmiib8FLiHMhKB%2BCH5dRgHSLc5N5wnvc4kym%2BZYpQEQ%40mail.gmail.com