RE: Synchronizing slots from primary to standby
От | Zhijie Hou (Fujitsu) |
---|---|
Тема | RE: Synchronizing slots from primary to standby |
Дата | |
Msg-id | OS0PR01MB57164C4C576A2BB8FEB61D6F94AAA@OS0PR01MB5716.jpnprd01.prod.outlook.com обсуждение исходный текст |
Ответ на | Re: Synchronizing slots from primary to standby (Amit Kapila <amit.kapila16@gmail.com>) |
Ответы |
Re: Synchronizing slots from primary to standby
Re: Synchronizing slots from primary to standby Re: Synchronizing slots from primary to standby |
Список | pgsql-hackers |
On Friday, November 3, 2023 7:32 PM Amit Kapila <amit.kapila16@gmail.com> > > On Thu, Nov 2, 2023 at 2:35 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> > wrote: > > > > Here is the new version patch set(V29) which addressed Peter > > comments[1][2] and fixed one doc compile error. > > > > Few comments: > ============== > 1. > + <varlistentry id="sql-createsubscription-params-with-failover"> > + <term><literal>failover</literal> (<type>boolean</type>)</term> > + <listitem> > + <para> > + Specifies whether the replication slot assocaited with the > subscription > + is enabled to be synced to the physical standbys so that logical > + replication can be resumed from the new primary after failover. > + The default is <literal>true</literal>. > > Why do you think it is a good idea to keep the default value as true? > I think the user needs to enable standby for syncing slots which is not a default > feature, so by default, the failover property should also be false. AFAICS, it is > false for create_slot SQL API as per the below change; so that way also keeping > default true for a subscription doesn't make sense. > @@ -479,6 +479,7 @@ CREATE OR REPLACE FUNCTION > pg_create_logical_replication_slot( > IN slot_name name, IN plugin name, > IN temporary boolean DEFAULT false, > IN twophase boolean DEFAULT false, > + IN failover boolean DEFAULT false, > OUT slot_name name, OUT lsn pg_lsn) > > BTW, the below change indicates that the code treats default as false; so, it > seems to be a documentation error. I think the document is wrong and fixed it. > > 2. > - > /* > * Common option parsing function for CREATE and ALTER SUBSCRIPTION > commands. > * > > Spurious line removal. > > 3. > + else if (opts.slot_name && failover_enabled) { > + walrcv_alter_slot(wrconn, opts.slot_name, opts.failover); > + ereport(NOTICE, (errmsg("altered replication slot \"%s\" on > + publisher", opts.slot_name))); } > > I think we can add a comment to describe why it makes sense to enable the > failover property of the slot in this case. Can we change the notice message to: > "enabled failover for replication slot \"%s\" on publisher" Added. > > 4. > libpqrcv_create_slot(WalReceiverConn *conn, const char *slotname, > - bool temporary, bool two_phase, CRSSnapshotAction snapshot_action, > - XLogRecPtr *lsn) > + bool temporary, bool two_phase, bool failover, CRSSnapshotAction > + snapshot_action, XLogRecPtr *lsn) > { > PGresult *res; > StringInfoData cmd; > @@ -913,7 +917,14 @@ libpqrcv_create_slot(WalReceiverConn *conn, const > char *slotname, > else > appendStringInfoChar(&cmd, ' '); > } > - > + if (failover) > + { > + appendStringInfoString(&cmd, "FAILOVER"); if (use_new_options_syntax) > + appendStringInfoString(&cmd, ", "); else appendStringInfoChar(&cmd, ' > + '); } > > I don't see a corresponding change in repl_gram.y. I think the following part of > the code needs to be changed: > /* CREATE_REPLICATION_SLOT slot [TEMPORARY] LOGICAL plugin [options] */ > | K_CREATE_REPLICATION_SLOT IDENT opt_temporary K_LOGICAL IDENT > create_slot_options > I think after 0266e98, we started to use the new syntax(see the generic_option_list rule) and we can avoid changing the repl_gram.y when adding new options. The new failover can be detected when parsing the generic option list(in parseCreateReplSlotOptions). > > 5. > @@ -228,6 +230,28 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo > fcinfo, bool confirm, bool bin > NameStr(MyReplicationSlot->data.plugin), > format_procedure(fcinfo->flinfo->fn_oid)))); > .. > + if (XLogRecPtrIsInvalid(upto_lsn)) > + wal_to_wait = end_of_wal; > + else > + wal_to_wait = Min(upto_lsn, end_of_wal); > + > + /* Initialize standby_slot_names_list */ SlotSyncInitConfig(); > + > + /* > + * Wait for specified streaming replication standby servers (if any) > + * to confirm receipt of WAL upto wal_to_wait. > + */ > + WalSndWaitForStandbyConfirmation(wal_to_wait); > + > + /* > + * The memory context used to allocate standby_slot_names_list will be > + * freed at the end of this call. So free and nullify the list in > + * order to avoid usage of freed list in the next call to this > + * function. > + */ > + SlotSyncFreeConfig(); > > What if there is an error in WalSndWaitForStandbyConfirmation() before calling > SlotSyncFreeConfig()? I think the problem you are trying to avoid by freeing it > here can occur. I think it is better to do this in a logical decoding context and > free the list along with it as we are doing in commit c7256e6564(see PG15). I will analyze more about this case and update in next version. > Also, > it is better to allocate this list somewhere in > WalSndWaitForStandbyConfirmation(), probably in WalSndGetStandbySlots, > that will make the code look neat and also avoid allocating this list when > failover is not enabled for the slot. Changed as suggested. > > 6. > +/* ALTER_REPLICATION_SLOT slot */ > +alter_replication_slot: > + K_ALTER_REPLICATION_SLOT IDENT '(' generic_option_list ')' > > I think you need to update the docs for this new command. See existing docs > [1]. > > [1] - https://www.postgresql.org/docs/devel/protocol-replication.html I think the doc for alter_replication_slot was added in V29. Attach the V30 patch set which addressed above comments and fixed CFbot failures. Best Regards, Hou zj
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: jian heДата:
Сообщение: Re: pg_column_toast_chunk_id: a function to get a chunk ID of a TOASTed value