author Heikki Linnakangas 2010-12-07 08:23:30 +0000
committer Heikki Linnakangas 2010-12-07 08:23:30 +0000
commit 5a031a5556ff83b8a9646892715d7fef415b83c3
tree f6d7421527d63a00f67e8e0370eac6b344c42c61 /src/backend/storage/ipc/standby.c
parent 8b5692809707c0e15d04c530a3fed9347350ea01
Fix bugs in the hot standby known-assigned-xids tracking logic.

If there's an old transaction running in the master, and a lot of
transactions have started and finished since, and a WAL record is written in
the gap between creating the running-xacts snapshot and WAL-logging it,
recovery will fail with a "too many KnownAssignedXids" error. This bug was
reported by Joachim Wieland on Nov 19th.

In the same scenario, when fewer transactions have started so that all the
xids fit in KnownAssignedXids despite the first bug, a more serious bug
arises. We incorrectly initialize the clog code with the oldest still-running
transaction, and when we see a WAL record belonging to a transaction with an
XID larger than one that had already committed before the checkpoint we're
recovering from, we zero the clog page containing the already-committed
transaction, leading to data loss.

In hindsight, trying to track xids in the known-assigned-xids array before
seeing the running-xacts record was too complicated. To fix that, hold
XidGenLock while the running-xacts snapshot is taken and WAL-logged. That
ensures that no transaction can begin or end in that gap, so that in recovery
we know that the snapshot contains all transactions running at that point in
WAL.
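
For illustration only, the locking hand-off described above can be sketched
outside PostgreSQL. This is a minimal sketch, not the committed code: it
assumes a pthread mutex as a stand-in for XidGenLock, and every name in it
(xid_gen_lock, running_snapshot, begin_transaction, get_running_snapshot,
log_snapshot, log_standby_snapshot, last_logged) is hypothetical. It models
only XID assignment; transaction end and the real WAL machinery are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real types or APIs. */
typedef uint32_t xid_t;

typedef struct
{
	xid_t		next_xid;			/* next XID that would be assigned */
	xid_t		oldest_running_xid;	/* oldest XID still in progress */
} running_snapshot;

/* Plays the role of XidGenLock in this sketch. */
static pthread_mutex_t xid_gen_lock = PTHREAD_MUTEX_INITIALIZER;

static xid_t next_xid = 100;
static xid_t oldest_running = 100;
static running_snapshot last_logged;	/* stand-in for the WAL record */

/* Assign a new XID. Blocks while a snapshot is being taken and logged. */
xid_t
begin_transaction(void)
{
	xid_t		xid;

	pthread_mutex_lock(&xid_gen_lock);
	xid = next_xid++;
	pthread_mutex_unlock(&xid_gen_lock);
	return xid;
}

/*
 * Take the snapshot. Returns with xid_gen_lock still held, so no XID can be
 * assigned until the caller has logged the snapshot and released the lock.
 */
running_snapshot
get_running_snapshot(void)
{
	running_snapshot snap;

	pthread_mutex_lock(&xid_gen_lock);
	snap.next_xid = next_xid;
	snap.oldest_running_xid = oldest_running;
	return snap;
}

/* Stand-in for WAL-logging: remember the snapshot that was "written". */
void
log_snapshot(const running_snapshot *snap)
{
	assert(snap != NULL);
	last_logged = *snap;
}

/* Snapshot and log under one continuous hold of the lock, then release. */
running_snapshot
log_standby_snapshot(void)
{
	running_snapshot snap = get_running_snapshot();

	log_snapshot(&snap);
	/* get_running_snapshot() acquired xid_gen_lock; release it here */
	pthread_mutex_unlock(&xid_gen_lock);
	return snap;
}
```

Because the lock is held from the moment the snapshot is read until it has
been logged, the logged next_xid cannot be stale with respect to any XID
assigned through begin_transaction(). The cost is that XID assignment stalls
for the duration of the logging step.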
Diffstat (limited to 'src/backend/storage/ipc/standby.c')
-rw-r--r-- src/backend/storage/ipc/standby.c 11
1 files changed, 3 insertions, 8 deletions
diff --git a/src/backend/storage/ipc/standby.c b/src/backend/storage/ipc/standby.c
index 5e0d1d067e5..adf87a44c3d 100644
--- a/src/backend/storage/ipc/standby.c
+++ b/src/backend/storage/ipc/standby.c
@@ -671,7 +671,7 @@ StandbyReleaseAllLocks(void)
/*
* StandbyReleaseOldLocks
* Release standby locks held by XIDs < removeXid, as long
- * as their not prepared transactions.
+ * as they're not prepared transactions.
*/
void
StandbyReleaseOldLocks(TransactionId removeXid)
@@ -848,14 +848,9 @@ LogStandbySnapshot(TransactionId *oldestActiveXid, TransactionId *nextXid)
* record we write, because standby will open up when it sees this.
*/
running = GetRunningTransactionData();
-
- /*
- * The gap between GetRunningTransactionData() and
- * LogCurrentRunningXacts() is what most of the fuss is about here, so
- * artifically extending this interval is a great way to test the little
- * used parts of the code.
- */
LogCurrentRunningXacts(running);
+ /* GetRunningTransactionData() acquired XidGenLock, we must release it */
+ LWLockRelease(XidGenLock);
*oldestActiveXid = running->oldestRunningXid;
*nextXid = running->nextXid;