Re: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От shveta malik
Тема Re: Synchronizing slots from primary to standby
Дата
Msg-id CAJpy0uDNdnZHh0WXQC4xH=3zaV3ePxAu3-1OqW14jgqeuE=Okg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  (Bertrand Drouvot <bertranddrouvot.pg@gmail.com>)
Ответы Re: Synchronizing slots from primary to standby
Список pgsql-hackers
On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote:
> > On Thu, Apr 4, 2024 at 2:59 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > >
> > > Prior to commit 2ec005b, this check was okay, as we did not expect
> > > restart_lsn of the synced slot to be ahead of remote since we were
> > > directly copying the lsns. But now when we use 'advance' to do logical
> > > decoding on standby, there is a possibility that restart lsn of the
> > > synced slot is ahead of remote slot, if there are running txns records
> > > found after reaching consistent-point while consuming WALs from
> > > restart_lsn till confirmed_lsn. In such a case, slot-sync's advance
> > > may end up serializing snapshots and setting restart_lsn to the
> > > serialized snapshot point, ahead of remote one.
> > >
> > > Fix:
> > > The sanity check needs to be corrected. Attached a patch to address the issue.
> >
>
> Thanks for reporting, explaining the issue and providing a patch.
>
> Regarding the patch:
>
> 1 ===
>
> +        * Attempt to sync lsns and xmins only if remote slot is ahead of local
>
> s/lsns/LSNs/?
>
> 2 ===
>
> +                       if (slot->data.confirmed_flush != remote_slot->confirmed_lsn)
> +                               elog(LOG,
> +                                        "could not synchronize local slot \"%s\" LSN(%X/%X)"
> +                                        " to remote slot's LSN(%X/%X) ",
> +                                        remote_slot->name,
> +                                        LSN_FORMAT_ARGS(slot->data.confirmed_flush),
> +                                        LSN_FORMAT_ARGS(remote_slot->confirmed_lsn));
>
> I don't think that the message is correct here. Unless I am missing something
> there is nothing in the following code path that would prevent the slot to be
> sync during this cycle.

This is a sanity check,  I will put a comment to indicate the same. We
want to ensure if anything changes in future, we get correct logs to
indicate that.
If required, the LOG msg can be changed. Kindly suggest if you have
anything better in mind.

thanks
Shveta



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Erik Wienhold
Дата:
Сообщение: Re: IPC::Run::time[r|out] vs our TAP tests
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: Streaming read-ready sequential scan code