RE: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От Zhijie Hou (Fujitsu)
Тема RE: Synchronizing slots from primary to standby
Дата
Msg-id OS0PR01MB57166472DB3D01F7407259179496A@OS0PR01MB5716.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на RE: Synchronizing slots from primary to standby  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Ответы RE: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby
Список pgsql-hackers
On Wednesday, December 20, 2023 4:03 PM Kuroda, Hayato/黒田 隼人 <kuroda.hayato@fujitsu.com> wrote:

Hi,

> 
> Dear Amit, Shveta,
> 
> > > walsender.c
> > >
> > > 01. WalSndWaitForStandbyConfirmation
> > >
> > > ```
> > > +        sleeptime =
> WalSndComputeSleeptime(GetCurrentTimestamp());
> > > ```
> > >
> > > It works well, but I'm not sure whether we should use
> > WalSndComputeSleeptime()
> > > because the function won't be called by walsender.
> > >
> >
> > I don't think it is correct to use this function because it is
> > walsender specific, for example, it uses 'last_reply_timestamp' which
> > won't be even initialized in the backend environment. We need to
> > probably use a different logic for sleep here or need to use a
> > hard-coded value.
> 
> Oh, you are right. I haven't look until the func.
> 
> > I think we should change the name of functions like
> > WalSndWaitForStandbyConfirmation() as they are no longer used by
> > walsender. IIRC, earlier, we had a common logic to wait from both
> > walsender and SQL APIs which led to this naming but that is no longer
> > true with the latest patch.
> 
> How about "WaitForStandbyConfirmation", which is simpler? There are some
> functions like "WaitForParallelWorkersToFinish", "WaitForProcSignalBarrier"
> and so on.

Thanks for the comments. I think WaitForStandbyConfirmation is OK.
And I removed the WalSnd prefix for these functions and move them to
slot.c where the standby_slot_names is declared.

> 
> > > 02.WalSndWaitForStandbyConfirmation
> > >
> > > ```
> > > +        ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv,
> > sleeptime,
> > > +
> > WAIT_EVENT_WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION)
> > > ```
> > >
> > > Hmm, is it OK to use the same event as WalSndWaitForWal()? IIUC it
> > > should be
> > avoided.
> > >
> >
> > Agreed. So, how about using
> > WAIT_EVENT_WAIT_FOR_STANDBY_CONFIRMATION
> > so that we can use it both from the backend and walsender?
> 
> Seems right. Note again that a description of .txt file must be also fixed.

Changed.

> 
> Anyway, further comments on v50-0001.
> 
> ~~~~~
> protocol.sgml
> 
> 01. create_replication_slot
> 
> ```
> +       <varlistentry>
> +        <term><literal>FAILOVER { 'true' | 'false' }</literal></term>
> +        <listitem>
> +         <para>
> +          If true, the slot is enabled to be synced to the physical
> +          standbys so that logical replication can be resumed after failover.
> +         </para>
> +        </listitem>
> +       </varlistentry>
> ```
> 
> IIUC, the true/false is optional. libpqwalreceiver does not add the boolean.
> Also you can follow the notation of `TWO_PHASE`.

Changed.

> 
> 02. alter_replication_slot
> 
> ```
> +      <variablelist>
> +       <varlistentry>
> +        <term><literal>FAILOVER { 'true' | 'false' }</literal></term>
> +        <listitem>
> +         <para>
> +          If true, the slot is enabled to be synced to the physical
> +          standbys so that logical replication can be resumed after failover.
> +         </para>
> +        </listitem>
> +       </varlistentry>
> +      </variablelist>
> ```
> 
> Apart from above, this boolean is mandatory, right?
> But you can follow other notation.
> 

Right, changed it to optional to be consistent with others.

> 
> ~~~~~~~
> slot.c
> 
> 03. validate_standby_slots
> 
> ```
> +    /* Need a modifiable copy of string. */
> ...
> +    /* Verify syntax and parse string into a list of identifiers. */
> ```
> 
> Unnecessary comma?

You mean comma or period ? I think the current style is OK.

> 
> 04. validate_standby_slots
> 
> ```
> +    if (!ok || !ReplicationSlotCtl)
> +    {
> +        pfree(rawname);
> +        list_free(elemlist);
> +        return ok;
> +    }
> ```
> 
> It may be more efficient to exit earlier when ReplicationSlotCtl is NULL.

I think even if ReplicationSlotCtl is NULL, we still need to check the syntax
of the slot names.

> 
> ~~~~~~~
> walsender.c
> 
> 05. PhysicalWakeupLogicalWalSnd
> 
> ```
> +/*
> + * Wake up the logical walsender processes with failover-enabled slots
> +if the
> + * physical slot of the current walsender is specified in
> +standby_slot_names
> + * GUC.
> + */
> +void
> +PhysicalWakeupLogicalWalSnd(void)
> ```
> 
> The function can be called from backend processes, but you said "the current
> walsender"
> in the comment.

Changed the words.

> 
> 06. WalSndRereadConfigAndReInitSlotList
> 
> ```
> +    char       *pre_standby_slot_names;
> +
> +    ProcessConfigFile(PGC_SIGHUP);
> +
> +    /*
> +     * If we are running on a standby, there is no need to reload
> +     * standby_slot_names since we do not support syncing slots to
> cascading
> +     * standbys.
> +     */
> +    if (RecoveryInProgress())
> +        return;
> +
> +    pre_standby_slot_names = pstrdup(standby_slot_names);
> ```
> 
> I felt that we must preserve pre_standby_slot_names before calling
> ProcessConfigFile().
> 

Good catch. Fixed.

> 
> 07. WalSndFilterStandbySlots
> 
> I felt the prefix "WalSnd" may not be needed because both backend processes
> and walsender will call the function.

Right, renamed.

Attach the V51 patch set which addressed Kuroda-san's comments.
I also tried to improve the test in 0003 to make it stable.

Best Regards,
Hou zj

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)
Следующее
От: "Zhijie Hou (Fujitsu)"
Дата:
Сообщение: RE: Synchronizing slots from primary to standby