author Amit Kapila 2024-02-14 04:15:36 +0000
committer Amit Kapila 2024-02-14 04:15:36 +0000
commit ddd5f4f54a026db6a6692876d0d44aef902ab686 (patch)
tree 68d374eb80a2a16eb0b011f58e3df25de7878d50 /doc/src
parent 06bd311bce24083c76d9741ae89c98750aaf4b41 (diff)
Add a slot synchronization function.

This commit introduces a new SQL function pg_sync_replication_slots(), which synchronizes the logical replication slots from the primary server to the physical standby so that logical replication can be resumed after a failover or planned switchover.

A new 'synced' flag is introduced in the pg_replication_slots view, indicating whether the slot has been synchronized from the primary server. On a standby, synced slots cannot be dropped or consumed, and any attempt to perform logical decoding on them will result in an error.

The logical replication slots on the primary can be synchronized to the hot standby by using the 'failover' parameter of pg_create_logical_replication_slot(), or by using the 'failover' option of CREATE SUBSCRIPTION during slot creation, and then calling pg_sync_replication_slots() on the standby. For the synchronization to work, there must be a physical replication slot between the primary and the standby (that is, 'primary_slot_name' must be configured on the standby), and 'hot_standby_feedback' must be enabled on the standby. It is also necessary to specify a valid 'dbname' in 'primary_conninfo'.

If a logical slot is invalidated on the primary, then that slot on the standby is also invalidated.

If a logical slot on the primary is valid but is invalidated on the standby, then that slot is dropped but will be recreated on the standby in the next pg_sync_replication_slots() call, provided the slot still exists on the primary server. It is okay to recreate such slots as long as they are not consumable on the standby (which is the case currently). This situation may occur for the following reasons:
- The 'max_slot_wal_keep_size' on the standby is insufficient to retain WAL records from the restart_lsn of the slot.
- 'primary_slot_name' is temporarily reset to null and the physical slot is removed.

The slot synchronization status on the standby can be monitored using the 'synced' column of the pg_replication_slots view.

Functionality to synchronize slots automatically with a background worker, and to allow logical walsenders to wait for the physical standby, will be added in subsequent commits.

Author: Hou Zhijie, Shveta Malik, Ajin Cherian based on an earlier version by Peter Eisentraut
Reviewed-by: Masahiko Sawada, Bertrand Drouvot, Peter Smith, Dilip Kumar, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/[email protected]
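
The standby-side prerequisites listed in the commit message can be sketched as a small pre-flight check. This is only an illustration, not part of the commit: the function names and the idea of passing the settings as a dictionary are hypothetical, and the connection-string parser is a simplification that assumes the plain keyword=value form (no URI form, no spaces around '=').

```python
import shlex


def parse_conninfo(conninfo):
    """Parse a keyword=value connection string into a dict.

    Simplified on purpose: shlex handles single-quoted values, but this
    does not cover every form that libpq accepts.
    """
    params = {}
    for token in shlex.split(conninfo):
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
    return params


def slot_sync_problems(settings):
    """Return the prerequisites from the commit message that are not met.

    `settings` is a hypothetical dict of standby settings, for example
    collected from postgresql.conf or from SHOW on the standby.
    """
    problems = []
    if not settings.get("primary_slot_name"):
        problems.append("primary_slot_name is not set")
    feedback = str(settings.get("hot_standby_feedback", "off")).lower()
    if feedback not in ("on", "true", "yes", "1"):
        problems.append("hot_standby_feedback is not enabled")
    conninfo = parse_conninfo(settings.get("primary_conninfo") or "")
    if not conninfo.get("dbname"):
        problems.append("primary_conninfo has no dbname")
    return problems


if __name__ == "__main__":
    # 'sb1_slot' and the connection string are made-up example values.
    standby = {
        "primary_slot_name": "sb1_slot",
        "hot_standby_feedback": "on",
        "primary_conninfo": "host=primary port=5432",
    }
    for problem in slot_sync_problems(standby):
        print(problem)
```

An empty result only means the three documented settings look plausible; it does not confirm that the physical slot exists on the primary or that pg_sync_replication_slots() will succeed.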
Diffstat (limited to 'doc/src')
-rw-r--r-- doc/src/sgml/config.sgml 9
-rw-r--r-- doc/src/sgml/func.sgml 35
-rw-r--r-- doc/src/sgml/logicaldecoding.sgml 56
-rw-r--r-- doc/src/sgml/protocol.sgml 6
-rw-r--r-- doc/src/sgml/system-views.sgml 20
5 files changed, 119 insertions, 7 deletions
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 61038472c5a..037a3b8a64c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4612,8 +4612,13 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
<varname>primary_conninfo</varname> string, or in a separate
<filename>~/.pgpass</filename> file on the standby server (use
<literal>replication</literal> as the database name).
- Do not specify a database name in the
- <varname>primary_conninfo</varname> string.
+ </para>
+ <para>
+ For replication slot synchronization (see
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/>),
+ it is also necessary to specify a valid <literal>dbname</literal>
+ in the <varname>primary_conninfo</varname> string. This will only be
+ used for slot synchronization. It is ignored for streaming.
</para>
<para>
This parameter can only be set in the <filename>postgresql.conf</filename>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 11d537b341c..8f147a2417f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28075,7 +28075,7 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
</row>
<row>
- <entry role="func_table_entry"><para role="func_signature">
+ <entry id="pg-create-logical-replication-slot" role="func_table_entry"><para role="func_signature">
<indexterm>
<primary>pg_create_logical_replication_slot</primary>
</indexterm>
@@ -28444,6 +28444,39 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
record is flushed along with its transaction.
</para></entry>
</row>
+
+ <row>
+ <entry id="pg-sync-replication-slots" role="func_table_entry"><para role="func_signature">
+ <indexterm>
+ <primary>pg_sync_replication_slots</primary>
+ </indexterm>
+ <function>pg_sync_replication_slots</function> ()
+ <returnvalue>void</returnvalue>
+ </para>
+ <para>
+ Synchronize the logical failover replication slots from the primary
+ server to the standby server. This function can only be executed on the
+ standby server. Temporary synced slots, if any, cannot be used for
+ logical decoding and must be dropped after promotion. See
+ <xref linkend="logicaldecoding-replication-slots-synchronization"/> for details.
+ </para>
+
+ <caution>
+ <para>
+ If, after executing the function,
+ <link linkend="guc-hot-standby-feedback">
+ <varname>hot_standby_feedback</varname></link> is disabled on
+ the standby or the physical slot configured in
+ <link linkend="guc-primary-slot-name">
+ <varname>primary_slot_name</varname></link> is
+ removed, then it is possible that the necessary rows of the
+ synchronized slot will be removed by the VACUUM process on the primary
+ server, resulting in the synchronized slot becoming invalidated.
+ </para>
+ </caution>
+ </entry>
+ </row>
+
</tbody>
</tgroup>
</table>
diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml
index cd152d4ced9..eceaaaa2735 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -358,6 +358,62 @@ postgres=# select * from pg_logical_slot_get_changes('regression_slot', NULL, NU
So if a slot is no longer required it should be dropped.
</para>
</caution>
+
+ </sect2>
+
+ <sect2 id="logicaldecoding-replication-slots-synchronization">
+ <title>Replication Slot Synchronization</title>
+ <para>
+ The logical replication slots on the primary can be synchronized to
+ the hot standby by using the <literal>failover</literal> parameter of
+ <link linkend="pg-create-logical-replication-slot">
+ <function>pg_create_logical_replication_slot</function></link>, or by
+ using the <link linkend="sql-createsubscription-params-with-failover">
+ <literal>failover</literal></link> option of
+ <command>CREATE SUBSCRIPTION</command> during slot creation, and then calling
+ <link linkend="pg-sync-replication-slots">
+ <function>pg_sync_replication_slots</function></link>
+ on the standby. For the synchronization to work, it is mandatory to
+ have a physical replication slot between the primary and the standby aka
+ <link linkend="guc-primary-slot-name"><varname>primary_slot_name</varname></link>
+ should be configured on the standby, and
+ <link linkend="guc-hot-standby-feedback"><varname>hot_standby_feedback</varname></link>
+ must be enabled on the standby. It is also necessary to specify a valid
+ <literal>dbname</literal> in the
+ <link linkend="guc-primary-conninfo"><varname>primary_conninfo</varname></link>.
+ </para>
+
+ <para>
+ The ability to resume logical replication after failover depends upon the
+ <link linkend="view-pg-replication-slots">pg_replication_slots</link>.<structfield>synced</structfield>
+ value for the synchronized slots on the standby at the time of failover.
+ Only persistent slots that have attained synced state as true on the standby
+ before failover can be used for logical replication after failover.
+ Temporary synced slots cannot be used for logical decoding, therefore
+ logical replication for those slots cannot be resumed. For example, if the
+ synchronized slot could not become persistent on the standby due to a
+ disabled subscription, then the subscription cannot be resumed after
+ failover even when it is enabled.
+ </para>
+
+ <para>
+ To resume logical replication after failover from the synced logical
+ slots, the subscription's 'conninfo' must be altered to point to the
+ new primary server. This is done using
+ <link linkend="sql-altersubscription-params-connection"><command>ALTER SUBSCRIPTION ... CONNECTION</command></link>.
+ It is recommended that subscriptions are first disabled before promoting
+ the standby and are re-enabled after altering the connection string.
+ </para>
+ <caution>
+ <para>
+ There is a chance that the old primary is up again during the promotion
+ and if subscriptions are not disabled, the logical subscribers may
+ continue to receive data from the old primary server even after promotion
+ until the connection string is altered. This might result in data
+ inconsistency issues, preventing the logical subscribers from being
+ able to continue replication from the new primary server.
+ </para>
+ </caution>
</sect2>
<sect2 id="logicaldecoding-explanation-output-plugins">
diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 05d6cc42da3..a5cb19357f5 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2062,7 +2062,8 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
<term><literal>FAILOVER [ <replaceable class="parameter">boolean</replaceable> ]</literal></term>
<listitem>
<para>
- If true, the slot is enabled to be synced to the standbys.
+ If true, the slot is enabled to be synced to the standbys
+ so that logical replication can be resumed after failover.
The default is false.
</para>
</listitem>
@@ -2162,7 +2163,8 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
<term><literal>FAILOVER [ <replaceable class="parameter">boolean</replaceable> ]</literal></term>
<listitem>
<para>
- If true, the slot is enabled to be synced to the standbys.
+ If true, the slot is enabled to be synced to the standbys
+ so that logical replication can be resumed after failover.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index dd468b31ea7..be90edd0e20 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2561,10 +2561,26 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
<structfield>failover</structfield> <type>bool</type>
</para>
<para>
- True if this is a logical slot enabled to be synced to the standbys.
- Always false for physical slots.
+ True if this is a logical slot enabled to be synced to the standbys
+ so that logical replication can be resumed from the new primary
+ after failover. Always false for physical slots.
</para></entry>
</row>
+
+ <row>
+ <entry role="catalog_table_entry"><para role="column_definition">
+ <structfield>synced</structfield> <type>bool</type>
+ </para>
+ <para>
+ True if this is a logical slot that was synced from a primary server.
+ On a hot standby, the slots with the synced column marked as true can
+ neither be used for logical decoding nor dropped manually. The value
+ of this column has no meaning on the primary server; the column value on
+ the primary is default false for all slots but may (if leftover from a
+ promoted standby) also be true.
+ </para></entry>
+ </row>
+
</tbody>
</tgroup>
</table>