Re: Proposal: Out-of-Order NOTIFY via GUC to Improve LISTEN/NOTIFY Throughput

From: Tom Lane
Subject: Re: Proposal: Out-of-Order NOTIFY via GUC to Improve LISTEN/NOTIFY Throughput
Msg-id: 1878165.1752858390@sss.pgh.pa.us
In reply to: Re: Proposal: Out-of-Order NOTIFY via GUC to Improve LISTEN/NOTIFY Throughput  ("Joel Jacobson" <joel@compiler.org>)
List: pgsql-hackers

"Joel Jacobson" <joel@compiler.org> writes:
> My patch improves NOTIFY TPS when many backends are listening on multiple
> channels by eliminating unnecessary syscall wake‑ups, but it doesn't increase
> the internal parallelism of the NOTIFY queue itself.

After thinking about this for a while, I have a rough idea of
something we could do to improve parallelism of NOTIFY.
As a bonus, this'd allow processes on hot standby servers to
receive NOTIFYs from processes on the primary, which is a
feature many have asked for.

The core thought here was to steal some implementation ideas
from two-phase commit.  I initially thought maybe we could
remove the SLRU queue entirely, and maybe we can still find
a way to do that, but in this sketch it's still there with
substantially reduced traffic.

The idea basically is to use the WAL log rather than SLRU
as transport for notify messages.

1. In PreCommit_Notify(), gather up all the notifications this
transaction has emitted, and write them into a WAL log message.
Remember the LSN of this message.  (I think this part should be
parallelizable, because of work that's previously been done to
allow parallel writes to WAL.)

2. When writing the transaction's commit WAL log entry, include
the LSN of the previous notify-data entry.

3. Concurrently with writing the commit entry, send a message
to the notify SLRU queue.  This would be a small fixed-size
message with the transaction's XID, database ID, and the LSN
of the notify-data WAL entry.  (The DBID is there to let
listeners quickly ignore traffic from senders in other DBs.)

4. Signal listening backends to check the queue, as we do now.

5. Listeners read the SLRU queue and then, for entries from their
own database, pull the notify data out of the WAL log.  (I believe
we already have enough infrastructure to make that cheap, because
2-phase commit does it too.)
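
To make steps 1-5 concrete, here is a toy Python model of the data
flow.  It is only a sketch, not PostgreSQL code: the ToyCluster class,
the use of a list index as the "LSN", and the 16-byte entry layout
(32-bit XID, 32-bit database OID, 64-bit LSN) are all assumptions made
for illustration, and signaling (step 4) is left out.

```python
import struct
from collections import deque

# Hypothetical fixed-size queue entry: XID, database OID, notify-data LSN.
QUEUE_ENTRY = struct.Struct("<IIQ")   # 4 + 4 + 8 = 16 bytes


class ToyCluster:
    def __init__(self):
        self.wal = []         # a list index plays the role of an LSN
        self.queue = deque()  # stands in for the notify SLRU queue

    def wal_insert(self, record):
        self.wal.append(record)
        return len(self.wal) - 1

    def commit_with_notifies(self, xid, dbid, notifications):
        # Step 1: put the notification payloads into WAL, remember the LSN.
        notify_lsn = self.wal_insert(("notify-data", list(notifications)))
        # Step 2: the commit record carries the notify-data LSN.
        self.wal_insert(("commit", xid, notify_lsn))
        # Step 3: only a small fixed-size entry goes into the queue.
        self.queue.append(QUEUE_ENTRY.pack(xid, dbid, notify_lsn))

    def listen(self, my_dbid):
        # Step 5: scan the queue, skip other databases, fetch data from WAL.
        received = []
        for raw in self.queue:
            xid, dbid, notify_lsn = QUEUE_ENTRY.unpack(raw)
            if dbid != my_dbid:
                continue
            _, notifications = self.wal[notify_lsn]
            received.extend(notifications)
        return received


cluster = ToyCluster()
cluster.commit_with_notifies(700, 5, [("chan_a", "hello")])
cluster.commit_with_notifies(701, 6, [("chan_b", "other db")])
cluster.commit_with_notifies(702, 5, [("chan_a", "world")])
print(cluster.listen(my_dbid=5))
```

A listener in database 5 sees the two chan_a notifications and skips
the entry from database 6 without touching WAL for it.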

In the simplest implementation of this idea, step 3 would still
require a global lock, to ensure that SLRU entries are made in
commit order.  However, that lock only needs to be held for the
duration of step 3, which is much shorter than what happens now.
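
The lock scope can be sketched in the same toy style.  The names here
(commit_order_lock, wal_lock, run_demo) are hypothetical, and this
sketch assumes the global lock covers both the commit record and the
queue entry so that the two appear in the same order; the bulky
notify-data write of step 1 stays outside it.

```python
import threading

wal = []      # toy WAL; a list index plays the role of an LSN
queue = []    # toy notify queue
wal_lock = threading.Lock()           # models WAL insertion's own short lock
commit_order_lock = threading.Lock()  # the global lock discussed above


def wal_insert(record):
    with wal_lock:
        wal.append(record)
        return len(wal) - 1


def commit(xid, dbid, notifications):
    # Step 1 runs without the global lock, so senders can overlap here.
    notify_lsn = wal_insert(("notify-data", xid, notifications))
    # Steps 2-3: hold the global lock only for the commit record and
    # the small queue entry.
    with commit_order_lock:
        wal_insert(("commit", xid, notify_lsn))
        queue.append((xid, dbid, notify_lsn))


def run_demo(n=8):
    threads = [
        threading.Thread(target=commit, args=(xid, 1, [("chan", str(xid))]))
        for xid in range(100, 100 + n)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    commit_order = [rec[1] for rec in wal if rec[0] == "commit"]
    queue_order = [entry[0] for entry in queue]
    return commit_order, queue_order
```

Whatever order the threads happen to run in, the two lists returned by
run_demo() match, which is the property the lock is there to provide.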

A more advanced idea could be to send the SLRU message in step 1, as
soon as we've pushed out the notify-data message.  In this approach,
listening backends would become responsible for figuring out whether
senders have committed yet and processing the messages in correct
commit order.  This is still quite handwavy, because I don't have a
clear idea of how they'd do that reliably, but maybe it's possible.

In a hot standby server, the WAL replay process would simply have to
send the proper SLRU message and issue signals when it sees a commit
message containing a notify-data LSN.  (One small detail to be worked
out is who's responsible for truncating the notify SLRU in a hot
standby server.  In current usage the sending backends do it, but
there won't be any in hot standby, and there aren't necessarily any
listeners either.)
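
As a rough illustration of the standby side, the replay loop only has
to react to commit records that carry a notify-data LSN.  This is a
sketch under assumed names: the record dictionaries, their keys, and
the signal_listeners callback are hypothetical stand-ins, not the real
WAL record format or replay API.

```python
def replay(wal_records, queue, signal_listeners):
    """Toy WAL replay: forward notify references found in commit records."""
    for record in wal_records:
        if record["type"] != "commit":
            continue
        notify_lsn = record.get("notify_lsn")
        if notify_lsn is None:
            continue  # this transaction sent no notifications
        # Same small fixed-size entry a sending backend would have queued.
        queue.append((record["xid"], record["dbid"], notify_lsn))
        signal_listeners()


standby_queue = []
signals = []
replay(
    [
        {"type": "notify-data", "lsn": 10},
        {"type": "commit", "xid": 700, "dbid": 5, "notify_lsn": 10},
        {"type": "commit", "xid": 701, "dbid": 5},
    ],
    standby_queue,
    lambda: signals.append("wake"),
)
```

Queue truncation is deliberately not modeled, since who should do it
on a standby is the open question noted above.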

An area that needs a bit of thought is how to ensure that we don't
truncate away WAL that contains still-unread notify messages.
We have mechanisms already to prevent too-soon truncation of WAL,
so I doubt there's anything too difficult here.  (Also note that
we have an existing unsolved problem of preventing CLOG truncation
while the notify SLRU still contains references to some old XID.
Perhaps that could be dealt with at the same time.)
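
One way to picture the WAL retention requirement: the oldest WAL
position still needed is the smallest notify-data LSN among queue
entries that some listener has not read yet.  The function below is a
hypothetical sketch of that calculation only; how it would plug into
the existing WAL retention mechanisms is not specified here.

```python
def oldest_needed_notify_lsn(queue, listener_positions):
    """Return the lowest notify-data LSN that is still unread, or None.

    queue: list of (xid, dbid, notify_lsn) in queue order.
    listener_positions: for each listener, index of its next unread entry.
    """
    if not listener_positions:
        return None
    unread = queue[min(listener_positions):]
    if not unread:
        return None
    # Take the minimum rather than the first entry: notify-data records
    # are written before commit, so their LSNs need not follow queue order.
    return min(lsn for _xid, _dbid, lsn in unread)


queue = [(700, 5, 400), (701, 5, 250), (702, 5, 900)]
horizon = oldest_needed_notify_lsn(queue, listener_positions=[1, 2])
```

Here the slowest listener still has to read the entries for XIDs 701
and 702, so WAL from LSN 250 onward would have to be kept.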

This isn't something I'm likely to work on anytime soon, but
perhaps someone else would like to push on these ideas.

            regards, tom lane


