Re: promotion related handling in pg_sync_replication_slots()

Поиск
Список
Период
Сортировка
От shveta malik
Тема Re: promotion related handling in pg_sync_replication_slots()
Дата
Msg-id CAJpy0uA-_rfARiNCfpZ=xdvyet-FpqmDRiiav4n9C=_f0Mcf8Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: promotion related handling in pg_sync_replication_slots()  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: promotion related handling in pg_sync_replication_slots()
Список pgsql-hackers
On Sat, Apr 6, 2024 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Please find v2. Changes are:
> > 1) Rebased the patch as there was conflict due to recent commit 6f132ed.
> > 2) Added an Assert in update_synced_slots_inactive_since() to ensure
> > that the slot does not have active_pid.
> > 3) Improved commit msg and comments.
> >
>
> Few comments:
> ==============
> 1.
>  void
>  SyncReplicationSlots(WalReceiverConn *wrconn)
>  {
> + /*
> + * Startup process signaled the slot sync to stop, so if meanwhile user
> + * has invoked slot sync SQL function, simply return.
> + */
> + SpinLockAcquire(&SlotSyncCtx->mutex);
> + if (SlotSyncCtx->stopSignaled)
> + {
> + ereport(LOG,
> + errmsg("skipping slot synchronization as slot sync shutdown is
> signaled during promotion"));
> +
> + SpinLockRelease(&SlotSyncCtx->mutex);
> + return;
> + }
> + SpinLockRelease(&SlotSyncCtx->mutex);
>
> There is a race condition with this code. Say during promotion
> ShutDownSlotSync() is just before setting this flag and the user has
> invoked pg_sync_replication_slots() and passed this check but still
> didn't set the SlotSyncCtx->syncing flag. So, now, the promotion would
> recognize that there is slot sync going on in parallel, and slot sync
> wouldn't know that the promotion is in progress.

Did you mean that now, the promotion *would not* recognize...

I see, I will fix this.

> The current coding for slot sync worker ensures there is no such race
> by checking the stopSignaled and setting the pid together under
> spinlock. I think we need to move the setting of the syncing flag
> similarly. Once we do that probably checking SlotSyncCtx->syncing
> should be sufficient in ShutDownSlotSync(). If we change the location
> of setting the 'syncing' flag then please ensure its cleanup as we
> currently do in slotsync_failure_callback().

Sure, let me review.

> 2.
> @@ -1395,6 +1395,7 @@ update_synced_slots_inactive_since(void)
>   if (s->in_use && s->data.synced)
>   {
>   Assert(SlotIsLogical(s));
> + Assert(s->active_pid == 0);
>
> We can add a comment like: "The slot must not be acquired by any process"
>
> --
> With Regards,
> Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: post-freeze damage control
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Add SPLIT PARTITION/MERGE PARTITIONS commands