Skip to content

Commit 9b44a50

Browse files
Hou ZhijieCommitfest Bot
Hou Zhijie
authored and
Commitfest Bot
committed
Maintain the replication slot in logical launcher to retain dead tuples
This patch enables the logical replication launcher to create and maintain a replication slot named pg_conflict_detection. The launcher periodically collects the oldest_nonremovable_xid from all apply workers. It then computes the minimum transaction ID and advances the xmin value of the replication slot if it precedes the computed value. The interval for updating the slot (nap time) is dynamically adjusted based on the activity of the apply workers. The launcher waits for a certain period before performing the next update, with the duration varying depending on whether the xmin value of the replication slot was updated during the last cycle.
1 parent 5380e8a commit 9b44a50

File tree

12 files changed

+245
-12
lines changed

12 files changed

+245
-12
lines changed

doc/src/sgml/config.sgml

+2
Original file line numberDiff line numberDiff line change
@@ -4945,6 +4945,8 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
49454945
new setting.
49464946
This setting has no effect if <varname>primary_conninfo</varname> is not
49474947
set or the server is not in standby mode.
4948+
The name cannot be <literal>pg_conflict_detection</literal>, as it is
4949+
reserved for logical replication conflict detection.
49484950
</para>
49494951
</listitem>
49504952
</varlistentry>

doc/src/sgml/func.sgml

+11-3
Original file line numberDiff line numberDiff line change
@@ -29710,7 +29710,9 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
2971029710
</para>
2971129711
<para>
2971229712
Creates a new physical replication slot named
29713-
<parameter>slot_name</parameter>. The optional second parameter,
29713+
<parameter>slot_name</parameter>. The name cannot be
29714+
<literal>pg_conflict_detection</literal>, as it is reserved for
29715+
logical replication conflict detection. The optional second parameter,
2971429716
when <literal>true</literal>, specifies that the <acronym>LSN</acronym> for this
2971529717
replication slot be reserved immediately; otherwise
2971629718
the <acronym>LSN</acronym> is reserved on first connection from a streaming
@@ -29754,7 +29756,9 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
2975429756
<para>
2975529757
Creates a new logical (decoding) replication slot named
2975629758
<parameter>slot_name</parameter> using the output plugin
29757-
<parameter>plugin</parameter>. The optional third
29759+
<parameter>plugin</parameter>. The name cannot be
29760+
<literal>pg_conflict_detection</literal>, as it is reserved for
29761+
logical replication conflict detection. The optional third
2975829762
parameter, <parameter>temporary</parameter>, when set to true, specifies that
2975929763
the slot should not be permanently stored to disk and is only meant
2976029764
for use by the current session. Temporary slots are also
@@ -29784,6 +29788,8 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
2978429788
<para>
2978529789
Copies an existing physical replication slot named <parameter>src_slot_name</parameter>
2978629790
to a physical replication slot named <parameter>dst_slot_name</parameter>.
29791+
The new slot name cannot be <literal>pg_conflict_detection</literal>,
29792+
as it is reserved for logical replication conflict detection.
2978729793
The copied physical slot starts to reserve WAL from the same <acronym>LSN</acronym> as the
2978829794
source slot.
2978929795
<parameter>temporary</parameter> is optional. If <parameter>temporary</parameter>
@@ -29806,7 +29812,9 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
2980629812
Copies an existing logical replication slot
2980729813
named <parameter>src_slot_name</parameter> to a logical replication
2980829814
slot named <parameter>dst_slot_name</parameter>, optionally changing
29809-
the output plugin and persistence. The copied logical slot starts
29815+
the output plugin and persistence. The name cannot be
29816+
<literal>pg_conflict_detection</literal>, as it is reserved for
29817+
logical replication conflict detection. The copied logical slot starts
2981029818
from the same <acronym>LSN</acronym> as the source logical slot. Both
2981129819
<parameter>temporary</parameter> and <parameter>plugin</parameter> are
2981229820
optional; if they are omitted, the values of the source slot are used.

doc/src/sgml/protocol.sgml

+2
Original file line numberDiff line numberDiff line change
@@ -2220,6 +2220,8 @@ psql "dbname=postgres replication=database" -c "IDENTIFY_SYSTEM;"
22202220
<para>
22212221
The name of the slot to create. Must be a valid replication slot
22222222
name (see <xref linkend="streaming-replication-slots-manipulation"/>).
2223+
The name cannot be <literal>pg_conflict_detection</literal>, as it
2224+
is reserved for logical replication conflict detection.
22232225
</para>
22242226
</listitem>
22252227
</varlistentry>

doc/src/sgml/ref/create_subscription.sgml

+3-1
Original file line numberDiff line numberDiff line change
@@ -169,7 +169,9 @@ CREATE SUBSCRIPTION <replaceable class="parameter">subscription_name</replaceabl
169169
<listitem>
170170
<para>
171171
Name of the publisher's replication slot to use. The default is
172-
to use the name of the subscription for the slot name.
172+
to use the name of the subscription for the slot name. The name cannot
173+
be <literal>pg_conflict_detection</literal>, as it is reserved for
174+
logical replication conflict detection.
173175
</para>
174176

175177
<para>

src/backend/access/transam/xlogrecovery.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -4760,7 +4760,7 @@ bool
47604760
check_primary_slot_name(char **newval, void **extra, GucSource source)
47614761
{
47624762
if (*newval && strcmp(*newval, "") != 0 &&
4763-
!ReplicationSlotValidateName(*newval, WARNING))
4763+
!ReplicationSlotValidateName(*newval, false, WARNING))
47644764
return false;
47654765

47664766
return true;

src/backend/commands/subscriptioncmds.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -210,7 +210,7 @@ parse_subscription_options(ParseState *pstate, List *stmt_options,
210210
if (strcmp(opts->slot_name, "none") == 0)
211211
opts->slot_name = NULL;
212212
else
213-
ReplicationSlotValidateName(opts->slot_name, ERROR);
213+
ReplicationSlotValidateName(opts->slot_name, false, ERROR);
214214
}
215215
else if (IsSet(supported_opts, SUBOPT_COPY_DATA) &&
216216
strcmp(defel->defname, "copy_data") == 0)

src/backend/replication/logical/launcher.c

+178-2
Original file line numberDiff line numberDiff line change
@@ -32,6 +32,7 @@
3232
#include "postmaster/interrupt.h"
3333
#include "replication/logicallauncher.h"
3434
#include "replication/origin.h"
35+
#include "replication/slot.h"
3536
#include "replication/walreceiver.h"
3637
#include "replication/worker_internal.h"
3738
#include "storage/ipc.h"
@@ -91,7 +92,6 @@ static dshash_table *last_start_times = NULL;
9192
static bool on_commit_launcher_wakeup = false;
9293

9394

94-
static void ApplyLauncherWakeup(void);
9595
static void logicalrep_launcher_onexit(int code, Datum arg);
9696
static void logicalrep_worker_onexit(int code, Datum arg);
9797
static void logicalrep_worker_detach(void);
@@ -100,6 +100,9 @@ static int logicalrep_pa_worker_count(Oid subid);
100100
static void logicalrep_launcher_attach_dshmem(void);
101101
static void ApplyLauncherSetWorkerStartTime(Oid subid, TimestampTz start_time);
102102
static TimestampTz ApplyLauncherGetWorkerStartTime(Oid subid);
103+
static void create_conflict_slot_if_not_exists(void);
104+
static void advance_conflict_slot_xmin(FullTransactionId new_xmin);
105+
static void drop_conflict_slot_if_exists(void);
103106

104107

105108
/*
@@ -1106,7 +1109,10 @@ ApplyLauncherWakeupAtCommit(void)
11061109
on_commit_launcher_wakeup = true;
11071110
}
11081111

1109-
static void
1112+
/*
1113+
* Wakeup the launcher immediately.
1114+
*/
1115+
void
11101116
ApplyLauncherWakeup(void)
11111117
{
11121118
if (LogicalRepCtx->launcher_pid != 0)
@@ -1119,6 +1125,8 @@ ApplyLauncherWakeup(void)
11191125
void
11201126
ApplyLauncherMain(Datum main_arg)
11211127
{
1128+
bool slot_maybe_exist = true;
1129+
11221130
ereport(DEBUG1,
11231131
(errmsg_internal("logical replication launcher started")));
11241132

@@ -1147,6 +1155,8 @@ ApplyLauncherMain(Datum main_arg)
11471155
MemoryContext subctx;
11481156
MemoryContext oldctx;
11491157
long wait_time = DEFAULT_NAPTIME_PER_CYCLE;
1158+
bool can_advance_xmin = true;
1159+
FullTransactionId xmin = InvalidFullTransactionId;
11501160

11511161
CHECK_FOR_INTERRUPTS();
11521162

@@ -1166,15 +1176,56 @@ ApplyLauncherMain(Datum main_arg)
11661176
TimestampTz now;
11671177
long elapsed;
11681178

1179+
/*
1180+
* Create the conflict slot before starting the worker to prevent
1181+
* it from unnecessarily maintaining its oldest_nonremovable_xid.
1182+
*/
1183+
create_conflict_slot_if_not_exists();
1184+
11691185
if (!sub->enabled)
1186+
{
1187+
can_advance_xmin = false;
11701188
continue;
1189+
}
11711190

11721191
LWLockAcquire(LogicalRepWorkerLock, LW_SHARED);
11731192
w = logicalrep_worker_find(sub->oid, InvalidOid, false);
11741193
LWLockRelease(LogicalRepWorkerLock);
11751194

11761195
if (w != NULL)
1196+
{
1197+
/*
1198+
* Collect non-removable transaction IDs from all apply
1199+
* workers to determine the xmin for advancing the replication
1200+
* slot used in conflict detection.
1201+
*/
1202+
if (can_advance_xmin)
1203+
{
1204+
FullTransactionId nonremovable_xid;
1205+
1206+
SpinLockAcquire(&w->relmutex);
1207+
nonremovable_xid = w->oldest_nonremovable_xid;
1208+
SpinLockRelease(&w->relmutex);
1209+
1210+
/*
1211+
* Stop advancing xmin if an invalid non-removable
1212+
* transaction ID is found, otherwise update xmin.
1213+
*/
1214+
if (!FullTransactionIdIsValid(nonremovable_xid))
1215+
can_advance_xmin = false;
1216+
else if (!FullTransactionIdIsValid(xmin) ||
1217+
FullTransactionIdPrecedes(nonremovable_xid, xmin))
1218+
xmin = nonremovable_xid;
1219+
}
1220+
11771221
continue; /* worker is running already */
1222+
}
1223+
1224+
/*
1225+
* The worker has not yet started, so there is no valid
1226+
* non-removable transaction ID available for advancement.
1227+
*/
1228+
can_advance_xmin = false;
11781229

11791230
/*
11801231
* If the worker is eligible to start now, launch it. Otherwise,
@@ -1207,6 +1258,27 @@ ApplyLauncherMain(Datum main_arg)
12071258
}
12081259
}
12091260

1261+
/*
1262+
* Maintain the xmin value of the replication slot for conflict
1263+
* detection if needed.
1264+
*/
1265+
if (sublist)
1266+
{
1267+
if (can_advance_xmin)
1268+
advance_conflict_slot_xmin(xmin);
1269+
1270+
slot_maybe_exist = true;
1271+
}
1272+
1273+
/*
1274+
* Drop the slot if we're no longer retaining dead tuples.
1275+
*/
1276+
else if (slot_maybe_exist)
1277+
{
1278+
drop_conflict_slot_if_exists();
1279+
slot_maybe_exist = false;
1280+
}
1281+
12101282
/* Switch back to original memory context. */
12111283
MemoryContextSwitchTo(oldctx);
12121284
/* Clean the temporary memory. */
@@ -1234,6 +1306,110 @@ ApplyLauncherMain(Datum main_arg)
12341306
/* Not reachable */
12351307
}
12361308

1309+
/*
1310+
* Create and acquire the replication slot used to retain dead tuples for
1311+
* conflict detection, if not yet.
1312+
*/
1313+
static void
1314+
create_conflict_slot_if_not_exists(void)
1315+
{
1316+
TransactionId xmin_horizon;
1317+
1318+
/* Exit early if the replication slot is already created and acquired */
1319+
if (MyReplicationSlot)
1320+
return;
1321+
1322+
/* If the replication slot exists, acquire it and exit */
1323+
if (SearchNamedReplicationSlot(CONFLICT_DETECTION_SLOT, true))
1324+
{
1325+
ReplicationSlotAcquire(CONFLICT_DETECTION_SLOT, true, false);
1326+
return;
1327+
}
1328+
1329+
ReplicationSlotCreate(CONFLICT_DETECTION_SLOT, false,
1330+
RS_PERSISTENT, false, false, false);
1331+
1332+
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
1333+
1334+
xmin_horizon = GetOldestSafeDecodingTransactionId(false);
1335+
1336+
SpinLockAcquire(&MyReplicationSlot->mutex);
1337+
MyReplicationSlot->effective_xmin = xmin_horizon;
1338+
MyReplicationSlot->data.xmin = xmin_horizon;
1339+
SpinLockRelease(&MyReplicationSlot->mutex);
1340+
1341+
ReplicationSlotsComputeRequiredXmin(true);
1342+
1343+
LWLockRelease(ProcArrayLock);
1344+
1345+
/* Write this slot to disk */
1346+
ReplicationSlotMarkDirty();
1347+
ReplicationSlotSave();
1348+
}
1349+
1350+
/*
1351+
* Attempt to advance the xmin value of the replication slot used to retain
1352+
* dead tuples for conflict detection.
1353+
*/
1354+
static void
1355+
advance_conflict_slot_xmin(FullTransactionId new_xmin)
1356+
{
1357+
FullTransactionId full_xmin;
1358+
FullTransactionId next_full_xid;
1359+
1360+
Assert(MyReplicationSlot);
1361+
Assert(FullTransactionIdIsValid(new_xmin));
1362+
1363+
next_full_xid = ReadNextFullTransactionId();
1364+
1365+
/*
1366+
* Compute FullTransactionId for the current xmin. This handles the case
1367+
* where transaction ID wraparound has occurred.
1368+
*/
1369+
full_xmin = FullTransactionIdFromAllowableAt(next_full_xid,
1370+
MyReplicationSlot->data.xmin);
1371+
1372+
if (FullTransactionIdPrecedesOrEquals(new_xmin, full_xmin))
1373+
return;
1374+
1375+
SpinLockAcquire(&MyReplicationSlot->mutex);
1376+
MyReplicationSlot->data.xmin = XidFromFullTransactionId(new_xmin);
1377+
SpinLockRelease(&MyReplicationSlot->mutex);
1378+
1379+
/* first write new xmin to disk, so we know what's up after a crash */
1380+
1381+
ReplicationSlotMarkDirty();
1382+
ReplicationSlotSave();
1383+
elog(DEBUG1, "updated xmin: %u", MyReplicationSlot->data.xmin);
1384+
1385+
/*
1386+
* Now the new xmin is safely on disk, we can let the global value
1387+
* advance. We do not take ProcArrayLock or similar since we only advance
1388+
* xmin here and there's not much harm done by a concurrent computation
1389+
* missing that.
1390+
*/
1391+
SpinLockAcquire(&MyReplicationSlot->mutex);
1392+
MyReplicationSlot->effective_xmin = MyReplicationSlot->data.xmin;
1393+
SpinLockRelease(&MyReplicationSlot->mutex);
1394+
1395+
ReplicationSlotsComputeRequiredXmin(false);
1396+
1397+
return;
1398+
}
1399+
1400+
/*
1401+
* Drop the replication slot used to retain dead tuples for conflict detection,
1402+
* if it exists.
1403+
*/
1404+
static void
1405+
drop_conflict_slot_if_exists(void)
1406+
{
1407+
if (MyReplicationSlot)
1408+
ReplicationSlotDropAcquired();
1409+
else if (SearchNamedReplicationSlot(CONFLICT_DETECTION_SLOT, true))
1410+
ReplicationSlotDrop(CONFLICT_DETECTION_SLOT, true);
1411+
}
1412+
12371413
/*
12381414
* Is current process the logical replication launcher?
12391415
*/

src/backend/replication/logical/reorderbuffer.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -4787,7 +4787,7 @@ StartupReorderBuffer(void)
47874787
continue;
47884788

47894789
/* if it cannot be a slot, skip the directory */
4790-
if (!ReplicationSlotValidateName(logical_de->d_name, DEBUG2))
4790+
if (!ReplicationSlotValidateName(logical_de->d_name, true, DEBUG2))
47914791
continue;
47924792

47934793
/*

src/backend/replication/logical/worker.c

+3
Original file line numberDiff line numberDiff line change
@@ -4357,6 +4357,9 @@ wait_for_local_flush(RetainConflictInfoData *data)
43574357
LSN_FORMAT_ARGS(data->remote_lsn),
43584358
XidFromFullTransactionId(data->candidate_xid));
43594359

4360+
/* Notify launcher to update the xmin of the conflict slot */
4361+
ApplyLauncherWakeup();
4362+
43604363
/*
43614364
* Reset all data fields except those used to determine the timing for the
43624365
* next round of transaction ID advancement.

0 commit comments

Comments
 (0)