RE: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От Zhijie Hou (Fujitsu)
Тема RE: Synchronizing slots from primary to standby
Дата
Msg-id OS0PR01MB57164C4C576A2BB8FEB61D6F94AAA@OS0PR01MB5716.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby
Список pgsql-hackers
On Friday, November 3, 2023 7:32 PM Amit Kapila <amit.kapila16@gmail.com>
> 
> On Thu, Nov 2, 2023 at 2:35 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com>
> wrote:
> >
> > Here is the new version patch set(V29) which addressed Peter
> > comments[1][2] and fixed one doc compile error.
> >
> 
> Few comments:
> ==============
> 1.
> +       <varlistentry id="sql-createsubscription-params-with-failover">
> +        <term><literal>failover</literal> (<type>boolean</type>)</term>
> +        <listitem>
> +         <para>
> +          Specifies whether the replication slot assocaited with the
> subscription
> +          is enabled to be synced to the physical standbys so that logical
> +          replication can be resumed from the new primary after failover.
> +          The default is <literal>true</literal>.
> 
> Why do you think it is a good idea to keep the default value as true?
> I think the user needs to enable standby for syncing slots which is not a default
> feature, so by default, the failover property should also be false. AFAICS, it is
> false for create_slot SQL API as per the below change; so that way also keeping
> default true for a subscription doesn't make sense.
> @@ -479,6 +479,7 @@ CREATE OR REPLACE FUNCTION
> pg_create_logical_replication_slot(
>      IN slot_name name, IN plugin name,
>      IN temporary boolean DEFAULT false,
>      IN twophase boolean DEFAULT false,
> +    IN failover boolean DEFAULT false,
>      OUT slot_name name, OUT lsn pg_lsn)
> 
> BTW, the below change indicates that the code treats default as false; so, it
> seems to be a documentation error.

I think the document is wrong and fixed it.

> 
> 2.
> -
>  /*
>   * Common option parsing function for CREATE and ALTER SUBSCRIPTION
> commands.
>   *
> 
> Spurious line removal.
> 
> 3.
> + else if (opts.slot_name && failover_enabled) {
> + walrcv_alter_slot(wrconn, opts.slot_name, opts.failover);
> + ereport(NOTICE, (errmsg("altered replication slot \"%s\" on
> + publisher", opts.slot_name))); }
> 
> I think we can add a comment to describe why it makes sense to enable the
> failover property of the slot in this case. Can we change the notice message to:
> "enabled failover for replication slot \"%s\" on publisher"

Added.

> 
> 4.
>  libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname,
> - bool temporary, bool two_phase, CRSSnapshotAction snapshot_action,
> - XLogRecPtr *lsn)
> + bool temporary, bool two_phase, bool failover, CRSSnapshotAction
> + snapshot_action, XLogRecPtr *lsn)
>  {
>   PGresult   *res;
>   StringInfoData cmd;
> @@ -913,7 +917,14 @@ libpqrcv_create_slot(WalReceiverConn *conn, const
> char *slotname,
>   else
>   appendStringInfoChar(&cmd, ' ');
>   }
> -
> + if (failover)
> + {
> + appendStringInfoString(&cmd, "FAILOVER"); if (use_new_options_syntax)
> + appendStringInfoString(&cmd, ", "); else appendStringInfoChar(&cmd, '
> + '); }
> 
> I don't see a corresponding change in repl_gram.y. I think the following part of
> the code needs to be changed:
> /* CREATE_REPLICATION_SLOT slot [TEMPORARY] LOGICAL plugin [options] */
> | K_CREATE_REPLICATION_SLOT IDENT opt_temporary K_LOGICAL IDENT
> create_slot_options
> 

I think after 0266e98, we started to use the new syntax(see the
generic_option_list rule) and we can avoid changing the repl_gram.y when adding
new options. The new failover can be detected when parsing the generic option
list(in parseCreateReplSlotOptions).


> 
> 5.
> @@ -228,6 +230,28 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo
> fcinfo, bool confirm, bool bin
>   NameStr(MyReplicationSlot->data.plugin),
>   format_procedure(fcinfo->flinfo->fn_oid))));
> ..
> + if (XLogRecPtrIsInvalid(upto_lsn))
> + wal_to_wait = end_of_wal;
> + else
> + wal_to_wait = Min(upto_lsn, end_of_wal);
> +
> + /* Initialize standby_slot_names_list */ SlotSyncInitConfig();
> +
> + /*
> + * Wait for specified streaming replication standby servers (if any)
> + * to confirm receipt of WAL upto wal_to_wait.
> + */
> + WalSndWaitForStandbyConfirmation(wal_to_wait);
> +
> + /*
> + * The memory context used to allocate standby_slot_names_list will be
> + * freed at the end of this call. So free and nullify the list in
> + * order to avoid usage of freed list in the next call to this
> + * function.
> + */
> + SlotSyncFreeConfig();
> 
> What if there is an error in WalSndWaitForStandbyConfirmation() before calling
> SlotSyncFreeConfig()? I think the problem you are trying to avoid by freeing it
> here can occur. I think it is better to do this in a logical decoding context and
> free the list along with it as we are doing in commit c7256e6564(see PG15).

I will analyze more about this case and update in next version.

> Also,
> it is better to allocate this list somewhere in
> WalSndWaitForStandbyConfirmation(), probably in WalSndGetStandbySlots,
> that will make the code look neat and also avoid allocating this list when
> failover is not enabled for the slot.

Changed as suggested.


> 
> 6.
> +/* ALTER_REPLICATION_SLOT slot */
> +alter_replication_slot:
> + K_ALTER_REPLICATION_SLOT IDENT '(' generic_option_list ')'
> 
> I think you need to update the docs for this new command. See existing docs
> [1].
> 
> [1] - https://www.postgresql.org/docs/devel/protocol-replication.html

I think the doc for alter_replication_slot was added in V29.

Attach the V30 patch set which addressed above comments and fixed CFbot failures.

Best Regards,
Hou zj


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: jian he
Дата:
Сообщение: Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value
Следующее
От: Peter Smith
Дата:
Сообщение: Re: pg_upgrade and logical replication