Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting

Поиск

Список

Период

Сортировка

От	Michael Paquier
Тема	Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting
Дата	6 октября 05:43:11
Msg-id	aOMsv9TszlB1n-W7@paquier.xyz обсуждение исходный текст
Ответ на	Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting (Xuneng Zhou <xunengzhou@gmail.com>)
Ответы	Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting
Список	pgsql-hackers

Дерево обсуждения

On Sat, Oct 04, 2025 at 03:21:07PM +0800, Xuneng Zhou wrote:
> v6 refactors and extends the infrastructure from the WAIT FOR command
> patch, applying it to read_local_xlog_page_guts. I'm also thinking of
> creating a standalone patch/commit for the extended
> infra in xlogwait, so it can be reused in different threads.

Yes, I think that you should split your patch where you think that it
can make review easier, because your change touches a very sensitive
area of the code base:
- First patch tointroduce what you consider is the basic
infrastructure required for your patch, that can be shared between
multiple pieces.  I doubt that you really need to have everything's
that in waitlsn.c to achieve what you want here.
- Second patch to introduce your actual feature, to make the callback
more responsive.
- Then, potentially have a third patch, that adds pieces of
infrastructure to waitlsn.c that you did not need in the first patch,
still are required for the waitlsn.c thread.  It would be optionally
possible to rebase the waitlsn patch to use patches 1 and 3, then.

I'd even try to consider the problem from the angle of looking for
independent pieces that could be extracted from the first patch and
split as other patches, to ease even again the review.  There is a
limit to this idea because you need a push/pull/reporting facility for
a flush LSN and a replay LSN depending on if you are on a primary, on
a standby, and even another case where you are dealing with a promoted
standby where you decide to loop back *inside* the callback (which I
suspect may not be always the right thing to do depending on the new
TLI selected), so there is a limit in what could be treated as an
independent piece.  At least the bits about pairingheap_initialize()
may be worth considering.

+                       if (waitLSNState &&
+                           (XLogRecoveryCtl->lastReplayedEndRecPtr >=
+                            pg_atomic_read_u64(&waitLSNState->minWaitedReplayLSN)))
+                           WaitLSNWakeupReplay(XLogRecoveryCtl->lastReplayedEndRecPtr);

This code pattern looks like a copy-paste of what's done in
synchronous replication.  Has some consolidation between syncrep.c and
this kind of facility ever been considered?  In terms of queues, waits
and wakeups, the requirements are pretty similar, still your patch has
zero changes related to syncrep.c or syncrep.h.

As far as I can see based on your patch, you are repeating some of the
mistakes of the wait LSN patch, where I've complained about
WaitForLSNReplay() and the duplication it had.  One thing you have
decided to pull for example is duplicated calls to
GetXLogReplayRecPtr().
--
Michael

Вложения

signature.asc

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Improve read_local_xlog_page_guts by replacing polling with latch-based waiting

Вложения