Re: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От shveta malik
Тема Re: Synchronizing slots from primary to standby
Дата
Msg-id CAJpy0uCaUFpzYVdCsPzNQmJMFaNGEE16wbSwWS-a=+iyX+Y3cQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
On Wed, Dec 20, 2023 at 12:02 PM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some comments for the patch v50-0002.

Thank You for the feedback. I have addressed these in v52.

> ======
> GENERAL
>
> (I made a short study of all the ereports in this patch -- here are
> some findings)
>
> ~~~
>
> 0.1 Don't need the parentheses.
>
> Checking all the ereports I see that half of them have the redundant
> parentheses and half of them do not; You might as well make them all
> use the new style where the extra parentheses are not needed.
>
> e.g.
> + ereport(LOG,
> + (errmsg("skipping slot synchronization"),
> + errdetail("enable_syncslot is disabled.")));
>
> e.g.
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("cannot drop replication slot \"%s\"", name),
> + errdetail("This slot is being synced from the primary server.")));
>
> and many more like this. Search for all the ereports.
>
> ~~~
>
> 0.2
> + ereport(LOG,
> + (errmsg("dropped replication slot \"%s\" of dbid %d as it "
> + "was not sync-ready", NameStr(s->data.name),
> + s->data.database)));
>
> I felt maybe that could be:
>
> errmsg("dropped replication slot \"%s\" of dbid %d", ...
> errdetail("It was not sync-ready.")
>
> (now this shares the same errmsg with another ereport)
>
> ~~~
>
> 0.3.
> + ereport(ERROR,
> + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> + errmsg("skipping sync of slot \"%s\" as it is a user created"
> + " slot", remote_slot->name),
> + errdetail("This slot has failover enabled on the primary and"
> +    " thus is sync candidate but user created slot with"
> +    " the same name already exists on the standby.")));
>
> This seemed too wordy. Can't it be shortened (maybe like below)
> without losing any of the vital information?
>
> errmsg("skipping sync of slot \"%s\"", ...)
> errdetail("A user-created slot with the same name already exists on
> the standby.")

I have modified it a little bit more. Please see now. I wanted to add
the info that slot-sync worker is exiting instead of skipping a slot
and that the concerned slot is a failover slot on primary. These were
the other comments around the same.

> ~~~
>
> 0.4
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("exiting from slot synchronization due to bad configuration"),
> + /* translator: second %s is a GUC variable name */
> + errdetail("The primary slot \"%s\" specified by %s is not valid.",
> +    PrimarySlotName, "primary_slot_name")));
>
> /The primary slot/The primary server slot/
>
> ~~~
>
> 0.5
> + ereport(ERROR,
> + (errmsg("could not fetch primary_slot_name \"%s\" info from the "
> + "primary: %s", PrimarySlotName, res->err)));
>
> /primary:/primary server:/
>
> ~~~
>
> 0.6
> The continuations for long lines are inconsistent. Sometimes there are
> trailing spaces and sometimes there are leading spaces. And sometimes
> there are both at the same time which would cause double-spacing in
> the message! Please make them all the same. I think using leading
> spaces is easier but YMMV.
>
> e.g.
> + elog(ERROR,
> + "not synchronizing local slot \"%s\" LSN(%X/%X)"
> + " to remote slot's LSN(%X/%X) as synchronization "
> + " would move it backwards", remote_slot->name,
> + LSN_FORMAT_ARGS(slot->data.restart_lsn),
> + LSN_FORMAT_ARGS(remote_slot->restart_lsn));
>
> ======
> src/backend/replication/logical/slotsync.c
>
> 1. check_primary_info
>
> + /* No need to check further, return that we are cascading standby */
> + if (remote_in_recovery)
> + {
> + *am_cascading_standby = true;
> + ExecClearTuple(tupslot);
> + walrcv_clear_result(res);
> + CommitTransactionCommand();
> + return;
> + }
> +
> + valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
> + Assert(!isnull);
> +
> + if (!valid)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("exiting from slot synchronization due to bad configuration"),
> + /* translator: second %s is a GUC variable name */
> + errdetail("The primary slot \"%s\" specified by %s is not valid.",
> +    PrimarySlotName, "primary_slot_name")));
> + ExecClearTuple(tupslot);
> + walrcv_clear_result(res);
> + CommitTransactionCommand();
> +}
>
> Now that there is a common cleanup/return code this function be
> reduced further like below:
>
> SUGGESTION
>
> if (remote_in_recovery)
> {
>   /* No need to check further, return that we are cascading standby */
>   *am_cascading_standby = true;
> }
> else
> {
>   /* We are a normal standby. */
>
>   valid = DatumGetBool(slot_getattr(tupslot, 2, &isnull));
>   Assert(!isnull);
>
>   if (!valid)
>     ...
> }
>
> ExecClearTuple(tupslot);
> walrcv_clear_result(res);
> CommitTransactionCommand();
> }
>
> ~~~
>
> 2. ReplSlotSyncWorkerMain
>
> + /*
> + * One can promote the standby and we can no longer be a cascading
> + * standby. So recheck here.
> + */
> + if (am_cascading_standby)
> + check_primary_info(wrconn, &am_cascading_standby);
>
> Minor rewording of that new comment.
>
> SUGGESTION
> If the standby was promoted then what was previously a cascading
> standby might no longer be one, so recheck each time.
>
> ======
> src/test/recovery/t/050_verify_slot_order.pl
>
> 3.
> +##################################################
> +# Test that a synchronized slot can not be decoded, altered and
> dropped by the user
> +##################################################
>
> /and dropped/or dropped/
>
> ~~~
>
> 4.
> +
> +($result, $stdout, $stderr) = $standby1->psql(
> +    'postgres',
> +    qq[ALTER_REPLICATION_SLOT lsub1_slot (failover);],
> +    replication => 'database');
> +ok($stderr =~ /ERROR:  cannot alter replication slot "lsub1_slot"/,
> + "synced slot on standby cannot be altered");
> +
>
> Add a comment for this test  part
>
> SUGGESTION
> Attempting to alter a synced slot should result in an error
>
> ~~~
>
> 5.
> IMO it would be better if the tests were done in the same order
> mentioned in the comment. So either change the tests or change the
> comment.
>
> ======
> Kind Regards,
> Peter Smith.
> Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Pavel Stehule
Дата:
Сообщение: Re: Autonomous transactions 2023, WIP
Следующее
От: Bertrand Drouvot
Дата:
Сообщение: Re: Synchronizing slots from primary to standby