Re: Implement waiting for wal lsn replay: reloaded
| От | Xuneng Zhou |
|---|---|
| Тема | Re: Implement waiting for wal lsn replay: reloaded |
| Дата | |
| Msg-id | CABPTF7Ukk8iJF7TpnK2mFOaboNJgWL1csfXu4e3J4GT0o7x0GQ@mail.gmail.com обсуждение |
| Ответ на | Re: Implement waiting for wal lsn replay: reloaded (Xuneng Zhou <xunengzhou@gmail.com>) |
| Ответы |
Re: Implement waiting for wal lsn replay: reloaded
|
| Список | pgsql-hackers |
On Fri, Apr 10, 2026 at 11:59 AM Xuneng Zhou <xunengzhou@gmail.com> wrote: > > On Fri, Apr 10, 2026 at 12:18 AM Andres Freund <andres@anarazel.de> wrote: > > > > On 2026-04-09 18:21:24 +0300, Alexander Korotkov wrote: > > > I've assembled all the pending patches together. > > > 0001 adds memory barrier to GetWalRcvWriteRecPtr() as suggested by > > > Andres off-list. > > > > I'd make it a pg_atomic_read_membarrier_u64(). > > > > > > > 0002 is basically [1] by Xuneng, but revised given we have a memory > > > barrier in 0001, and my proposal to do ResetLatch() unconditionally > > > similar to our other Latch-based loops. > > > 0003 and 0004 are [2] by Xuneng. > > > 0005 is [3] by Xuneng. > > > > > > I'm going to add them to Commitfest to run CI over them, and have a > > > closer look over them tomorrow. > > > > Briefly skimming the patches, none makes the writes to writtenUpto use > > something bearing barrier semantics. I'd just make both of them a > > pg_atomic_write_membarrier_u64(). > > > > Makes sense to me. Done. > > > I think this also needs a few more tests, e.g. for the scenario that > > 29e7dbf5e4d fixed. I think it'd also be good to do some testing for > > off-by-one dangers. E.g. making sure that we don't stop waiting too early / > > too late. Another one that I think might deserve more testing is waits on the > > standby while crossing timeline boundaries. > > > > I'll prepare a new patch for more test harnessing. > > > > > > From 0e5b4d1b9311a628a70218d89abf12308c9d782f Mon Sep 17 00:00:00 2001 > > > From: Alexander Korotkov <akorotkov@postgresql.org> > > > Date: Thu, 9 Apr 2026 16:49:04 +0300 > > > Subject: [PATCH v3 1/5] Add a memory barrier to GetWalRcvWriteRecPtr() > > > > > > Add pg_memory_barrier() before reading writtenUpto so that callers see > > > up-to-date shared memory state. This matches the barrier semantics that > > > GetWalRcvFlushRecPtr() and other LSN-position functions get implicitly from > > > their spinlock acquire/release, and in turn protects from bugs caused by > > > expectations of similar barrier guarantees from different LSN-position functions. > > > > > > Reported-by: Andres Freund <andres@anarazel.de> > > > Discussion: https://postgr.es/m/zqbppucpmkeqecfy4s5kscnru4tbk6khp3ozqz6ad2zijz354k%40w4bdf4z3wqoz > > > --- > > > src/backend/replication/walreceiverfuncs.c | 12 ++++++++++-- > > > 1 file changed, 10 insertions(+), 2 deletions(-) > > > > > > diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c > > > index bd5d47be964..0408ddff43e 100644 > > > --- a/src/backend/replication/walreceiverfuncs.c > > > +++ b/src/backend/replication/walreceiverfuncs.c > > > @@ -363,14 +363,22 @@ GetWalRcvFlushRecPtr(XLogRecPtr *latestChunkStart, TimeLineID *receiveTLI) > > > > > > /* > > > * Returns the last+1 byte position that walreceiver has written. > > > - * This returns a recently written value without taking a lock. > > > + * > > > + * Use a memory barrier to ensure that callers see up-to-date shared memory > > > + * state, matching the barrier semantics provided by the spinlock in > > > + * GetWalRcvFlushRecPtr() and other LSN-position functions. > > > */ > > > XLogRecPtr > > > GetWalRcvWriteRecPtr(void) > > > { > > > WalRcvData *walrcv = WalRcv; > > > + XLogRecPtr recptr; > > > + > > > + pg_memory_barrier(); > > > > > > - return pg_atomic_read_u64(&walrcv->writtenUpto); > > > + recptr = pg_atomic_read_u64(&walrcv->writtenUpto); > > > + > > > + return recptr; > > > } > > > > > > /* > > > -- > > > 2.39.5 (Apple Git-154) > > > > > > > > Subject: [PATCH v3 2/5] Fix memory ordering in WAIT FOR LSN wakeup mechanism > > > > > + /* > > > + * Ensure the waker's prior position store (writtenUpto, flushedUpto, > > > + * lastReplayedEndRecPtr, etc.) is globally visible before we read > > > + * minWaitedLSN. Without this barrier, the CPU could load minWaitedLSN > > > + * before draining the position store, leaving the position invisible to a > > > + * concurrently-registering waiter. > > > + * > > > + * This is the waker side of a Dekker-style handshake; pairs with > > > + * pg_memory_barrier() in GetCurrentLSNForWaitType() on the waiter side. > > > + */ > > > + pg_memory_barrier(); > > > + > > > /* > > > * Fast path check. Skip if currentLSN is InvalidXLogRecPtr, which means > > > * "wake all waiters" (e.g., during promotion when recovery ends). > > > > I'd also make this a pg_atomic_read_membarrier_u64() and the write a > > pg_atomic_write_membarrier_u64(). It's a lot easier to reason about this > > stuff if you make sure that the individual reads / write pair and have > > ordering implied. > > > > It does look more selft-contained to me. > > Here is the updated patch set based on Alexander’s earlier version. The last para of commit message in patch 2 is inaccurate after the getter-side barrier changes, "could read a stale position and wrongly timeout" is no longer the primary rationale. -- Best, Xuneng
Вложения
- v4-0001-Use-barrier-semantics-when-reading-writing-writte.patch
- v4-0005-Wake-standby_write-standby_flush-waiters-from-the.patch
- v4-0004-Use-replay-position-as-floor-for-WAIT-FOR-LSN.patch
- v4-0003-Remove-redundant-WAIT-FOR-LSN-caller-side-pre-che.patch
- v4-0002-Fix-memory-ordering-in-WAIT-FOR-LSN-wakeup-mechan.patch
В списке pgsql-hackers по дате отправления: