Re: sync_standbys_defined read/write race on startup

From: "Maksim(dot)Melnikov" <m(dot)melnikov(at)postgrespro(dot)ru>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Kirill Reshke <reshkekirill(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: sync_standbys_defined read/write race on startup
Date: 2025-04-10 07:12:34
Message-ID: [email protected]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 10.04.2025 05:25, Michael Paquier wrote:
> Confirmed. One thing where it would be possible to make things work
> is to introduce some persistency of the injection points, say these
> are flushed at shutdown. We could do that without touching at the
> backend code and only in the module injection_points, but perhaps this
> thread is not enough to justify this extra complexity.

Agree, it should be discussed in another thread.

On 10.04.2025 05:25, Michael Paquier wrote:
> What do you think?

Yes, patch for me looks ok, only one suggestion, I am not sure

but I am afraid we are loosing this.

        if (WalSndCtl->sync_standbys_status & SYNC_STANDBY_INIT)
        {
                if ((WalSndCtl->sync_standbys_status &
SYNC_STANDBY_DEFINED) == 0 ||
                        lsn <= WalSndCtl->lsn[mode])
                {
                        LWLockRelease(SyncRepLock);
                        return;
                }
        }
        else if (!SyncStandbysDefined())
        {
                /*
                 * If we are here, the sync standby data has not been
initialized yet,
                 * so fall back to the best thing we can do: a check on
                 * SyncStandbysDefined() to see if the GUC is set or not.
                 *
                 * If the GUC has a value, we wait until the
checkpointer updates the
                 * status data because we cannot be sure yet if we
should wait or not.
                 * If the GUC has *no* value, we are sure that there is
no point to
                 * wait; this matters for example when initializing a
cluster, where
                 * we should never wait, and no sync standbys is the
default behavior.
                 */
                LWLockRelease(SyncRepLock);
                return;
        }
+       else if (lsn <= WalSndCtl->lsn[mode])
+       {
+               LWLockRelease(SyncRepLock);
+               return;
+       }

What do you think?

Best regards,

Maksim Melnikov

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dianjin Wang 2025-04-10 07:15:17 Re: Add .DS_Store to .gitignore
Previous Message Amit Langote 2025-04-10 06:28:55 Re: [PoC] Reducing planning time when tables have many partitions