Re: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От shveta malik
Тема Re: Synchronizing slots from primary to standby
Дата
Msg-id CAJpy0uCpwShc76zBen4rJsL4FOc3KxC8J8dEuW-KG=a31J6Nfg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: Synchronizing slots from primary to standby  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Tue, Feb 20, 2024 at 8:25 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>
> I've reviewed the v91 patch. Here are random comments:

Thanks for the comments.

> ---
>  /*
>   * Checks the remote server info.
>   *
> - * We ensure that the 'primary_slot_name' exists on the remote server and the
> - * remote server is not a standby node.
> + * Check whether we are a cascading standby. For non-cascading standbys, it
> + * also ensures that the 'primary_slot_name' exists on the remote server.
>   */
>
> IIUC what the validate_remote_info() does doesn't not change by this
> patch, so the previous comment seems to be clearer to me.
>
> ---
>     if (remote_in_recovery)
> +   {
> +       /*
> +        * If we are a cascading standby, no need to check further for
> +        * 'primary_slot_name'.
> +        */
>         ereport(ERROR,
>                 errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
>                 errmsg("cannot synchronize replication slots from a
> standby server"));
> +   }
> +   else
> +   {
> +       bool        primary_slot_valid;
>
> -   primary_slot_valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
> -   Assert(!isnull);
> +       primary_slot_valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
> +       Assert(!isnull);
>
> -   if (!primary_slot_valid)
> -       ereport(ERROR,
> -               errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> -               errmsg("bad configuration for slot synchronization"),
> -       /* translator: second %s is a GUC variable name */
> -               errdetail("The replication slot \"%s\" specified by
> \"%s\" does not exist on the primary server.",
> -                         PrimarySlotName, "primary_slot_name"));
> +       if (!primary_slot_valid)
> +           ereport(ERROR,
> +                   errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> +                   errmsg("bad configuration for slot synchronization"),
> +           /* translator: second %s is a GUC variable name */
> +                   errdetail("The replication slot \"%s\" specified
> by \"%s\" does not exist on the primary server.",
> +                             PrimarySlotName, "primary_slot_name"));
> +   }
>
> I think it's a refactoring rather than changes required by the
> slotsync worker. We can do that in a separate patch but why do we need
> this change in the first place?

In v90, this refactoring was made due to the fact that
validate_remote_info() was supposed to behave differently for SQL
function and slot-sync worker. SQL-function was supposed to ERROR out
while the worker was supposed to LOG and become no-op. And thus the
change was needed. In v91, we made this functionality same i.e. both
sql function and worker will error out but missed to remove this
refactoring. Thanks for catching this, I will revert it in the next
version. To match the refactoring, I made the comment change too, will
revert that as well.

> ---
> +        ValidateSlotSyncParams(ERROR);
> +
>          /* Load the libpq-specific functions */
>          load_file("libpqwalreceiver", false);
>
> -        ValidateSlotSyncParams();
> +        (void) CheckDbnameInConninfo();
>
> Is there any reason why we move where to check the parameters?

Earlier DBname verification was done inside ValidateSlotSyncParams()
and thus it was needed to load 'libpqwalreceiver' before we call this
function. Now we have moved dbname verification in a separate call and
thus the above order got changed. ValidateSlotSyncParams() is a common
function used by SQL function and worker. Earlier slot sync worker was
checking all the GUCs after starting up and was exiting each time any
GUC was invalid. It was suggested that it would be better to check the
GUCs before starting the slot sync worker in the postmaster itself,
making the ValidateSlotSyncParams() move to postmaster (see
SlotSyncWorkerAllowed).  But it was not a good idea to load libpq in
postmaster and thus we moved libpq related verification out of
ValidateSlotSyncParams(). This resulted in the above change.  I hope
it answers your query.

thanks
Shveta



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrei Lepikhov
Дата:
Сообщение: Re: POC, WIP: OR-clause support for indexes
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Synchronizing slots from primary to standby