Re: [PATCH] Write Notifications Through WAL

Поиск
Список
Период
Сортировка
От Rishu Bagga
Тема Re: [PATCH] Write Notifications Through WAL
Дата
Msg-id CAK80=jieMJ4tNEBQwipB+Q472fZP0YEf=EncMxoGqMP70E4uLQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PATCH] Write Notifications Through WAL  (Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>)
Список pgsql-hackers
Hi, apologies for the delay, attached a new patch addressing the
concerns, as well as
enabling a standby to listen and receive notifications from the primary.

On Fri, Oct 17, 2025 at 6:33 AM Arseniy Mukhin
<arseniy.mukhin.dev@gmail.com> wrote:

> > Good catch, I've fixed this in the new patch.
>
> Hmmm, looks like we still have the same problem in the new version if
> I'm not missing something:
>
>     Snapshot    snap = ActiveSnapshotSet() ? GetActiveSnapshot() : NULL;
>
>     ...
>
>     if (qe->dboid == MyDatabaseId && snap != NULL &&
>               XidInMVCCSnapshot(qe->xid, snap))
>
> Here we almost always have snap = NULL (except when we are in
> Exec_ListenPreCommit maybe), since listeners read notifications when
> they are out of active transaction. Why do we need snap nullability
> here? IIUC we can do XidInMVCCSnapshot check the same way as we do it
> in master.
>
> Also the comment
>
>     /*----------
>      * Get snapshot we'll use to decide which xacts are still in progress.
>      * This is trickier than it might seem, because of race conditions.
>
> seems to still be relevant, so I think it's worth keeping.
>
>

Reverted to the original logic and restored the comment in the new patch.

>
> Looks like you have solved both issues with it. Several points about
> wal pin logic:
>
> 1) I noticed that we can block wal truncating sometimes.
> NotifyPageMins array works great when we have a consistent flow of
> notifications and can truncate old queue segments regularly. But what
> if we don't have it? Please try the next example:
>
> Send a notification first:
>
>     NOTIFY ch, 'a';
>
> Then do something that makes wal grows.
>
>     create table if not exists t1 (a text);
>
>     insert into t1 (a) select gen_random_uuid()
>     from generate_series(1,10000000);
>
> You will see that wal grows without truncation. It seems that to solve
> the issue we should not consider queue entries before the queue tail
> in AsyncNotifyOldestRequiredLSN as we don't need their data anymore.

In the new patch, I updated the logic so that we update the min lsn on the
entry for the tail page in the page min array if needed, at time of
advancing the queue tail.

> 2) I'm worried if it's ok to iterate like this:
>
>     /* Scan per-backend pins under ProcArrayLock */
>         LWLockAcquire(ProcArrayLock, LW_SHARED);
>         for (int i = 0; i < MaxBackends; i++)
>         {
>            PGPROC *proc = GetPGProcByNumber(i);
>
> We use MaxBackends here, so it looks like we can access PGPROC that
> isn't assigned to the existing backend. I failed to find any examples
> in the code that would do the same. Maybe we don't need to deal with
> the MyProc structure here at all? We can do the same thing that we do
> for listener positions. Store it in QueueBackendStatus and directly
> access it with MyProcNumber when we want to update it for our backend.
> And in AsyncNotifyOldestRequiredLSN we can iterate over backends the
> same way we do it asyncQueueAdvanceTail. What do you think?

I don't see why iterating through MyProc is dangerous here? As for
accessing the PGPROC that isn't assigned, I added a check to skip
unused slots, with a simple "if (proc->pid == 0)". The issue with
using QueueBackendStatus is that we are currently using it to track
the status of listeners, so although we could add information about the
backend's notifying status, this might be a confusing change in the semantics.
We already need to store the notify data lsn in the proc structure, to write the
log record, so to me it makes sense to just look there, and not
duplicate that information.

>
> 3) It seems to me that we can miss some notification data record's lsn
> here AsyncNotifyOldestRequiredLSN. Let's consider the next schedule:
>
> Transaction                                     Checkpointer
>
> Notify transaction writes notification data wal
> record and updates its notifyDataLsn
>
>                                     AsyncNotifyOldestRequiredLSN call.
>                                     We iterate over NotifyPageMins. It
>                                     doesn't contain transaction's
>                                     notifyDataLsn yet.
>
> Transaction is committed, NotifyPageMins is updated
> notifyDataLsn is cleared.
>
>                                     We iterate over notifyDataLsns. Again
>                                     we don't see notifyDataLsn here as it
>                                     has already been removed.
>
> Is it real or I missed something? I see there was the lock in v6 that
> could prevent it, but probably we can just swap NotifyPageMins and
> notifyDataLsns parts in AsyncNotifyOldestRequiredLSN to solve the
> problem?
>

Good catch, yes I think swapping the order of the check to the uncommitted
min lsn and then committed min lsn fixes this, updated in the new patch.

Вложения

В списке pgsql-hackers по дате отправления: