Re: Synchronizing slots from primary to standby

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: Synchronizing slots from primary to standby
Дата
Msg-id CAHut+PvbbPz1=T4bzY0_GotUK460Eih41Twjt=czJ1z2J8SGEw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Synchronizing slots from primary to standby  (shveta malik <shveta.malik@gmail.com>)
Ответы RE: Synchronizing slots from primary to standby  ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>)
Список pgsql-hackers
Here are some review comments for patch v57-0001.

======
doc/src/sgml/protocol.sgml

1. CREATE_REPLICATION_SLOT ... FAILOVER

+       <varlistentry>
+        <term><literal>FAILOVER [ <replaceable
class="parameter">boolean</replaceable> ]</literal></term>
+        <listitem>
+         <para>
+          If true, the slot is enabled to be synced to the physical
+          standbys so that logical replication can be resumed after failover.
+         </para>
+        </listitem>
+       </varlistentry>

This syntax says passing the boolean value is optional. So the default
needs to the specified here in the docs (like what the TWO_PHASE
option does).

~~~

2. ALTER_REPLICATION_SLOT ... FAILOVER

+      <variablelist>
+       <varlistentry>
+        <term><literal>FAILOVER [ <replaceable
class="parameter">boolean</replaceable> ]</literal></term>
+        <listitem>
+         <para>
+          If true, the slot is enabled to be synced to the physical
+          standbys so that logical replication can be resumed after failover.
+         </para>
+        </listitem>
+       </varlistentry>
+      </variablelist>

This syntax says passing the boolean value is optional. So it needs to
be specified here in the docs that not passing a value would be the
same as passing the value true.

======
doc/src/sgml/ref/alter_subscription.sgml

3.
+   If <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>
+   is enabled, you can temporarily disable it in order to execute
these commands.

/in order to/to/

~~~

4.
+     <para>
+      When altering the
+      <link linkend="sql-createsubscription-params-with-slot-name"><literal>slot_name</literal></link>,
+      the <literal>failover</literal> property of the new slot may
differ from the
+      <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>
+      parameter specified in the subscription. When creating the slot,
+      ensure the slot failover property matches the
+      <link linkend="sql-createsubscription-params-with-failover"><literal>failover</literal></link>
+      parameter value of the subscription.
+     </para>

4a.
the <literal>failover</literal> property of the new slot may differ

Maybe it would be more clear if that said "the failover property value
of the named slot...".

~

4b.
In the "failover property matches" part should that failover also be
rendered as <literal> like before in the same paragraph?

======
doc/src/sgml/system-views.sgml

5.
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>failover</structfield> <type>bool</type>
+      </para>
+      <para>
+       True if this logical slot is enabled to be synced to the physical
+       standbys so that logical replication can be resumed from the new primary
+       after failover. Always false for physical slots.
+      </para></entry>
+     </row>

/True if this logical slot is enabled.../True if this is a logical
slot enabled.../

======
src/backend/commands/subscriptioncmds.c

6. CreateSubscription

+ /*
+ * Even if failover is set, don't create the slot with failover
+ * enabled. Will enable it once all the tables are synced and
+ * ready. The intention is that if failover happens at the time of
+ * table-sync, user should re-launch the subscription instead of
+ * relying on main slot (if synced) with no table-sync data
+ * present. When the subscription has no tables, leave failover as
+ * false to allow ALTER SUBSCRIPTION ... REFRESH PUBLICATION to
+ * work.
+ */
+ if (opts.failover && !opts.copy_data && tables != NIL)
+ failover_enabled = true;

AFAICT it might be possible for this to set failover_enabled = true if
copy_data is false. So failover_enabled would be true when later
calling:
  walrcv_create_slot(wrconn, opts.slot_name, false, twophase_enabled,
    failover_enabled, CRS_NOEXPORT_SNAPSHOT, NULL);

Isn't that contrary to what this comment said: "Even if failover is
set, don't create the slot with failover enabled"

~~~

7. AlterSubscription. case ALTER_SUBSCRIPTION_OPTIONS:

+ if (!sub->slotname)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot set failover for subscription that does not have slot name")));

/for subscription that does not have slot name/for a subscription that
does not have a slot name/

======
.../libpqwalreceiver/libpqwalreceiver.c

8.
+ if (PQresultStatus(res) != PGRES_COMMAND_OK)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROTOCOL_VIOLATION),
+ errmsg("could not alter replication slot \"%s\"",
+ slotname)));

This used to display the error message like
pchomp(PQerrorMessage(conn->streamConn)) but it was removed. Is it OK?

======
src/backend/replication/logical/tablesync.c

9.
+ if (MySubscription->twophasestate == LOGICALREP_TWOPHASE_STATE_PENDING)
+ ereport(LOG,
+ /* translator: %s is a subscription option */
+ (errmsg("logical replication apply worker for subscription \"%s\"
will restart so that %s can be enabled",
+ MySubscription->name, "two_phase")));
+
+ if (MySubscription->failoverstate == LOGICALREP_FAILOVER_STATE_PENDING)
+ ereport(LOG,
+ /* translator: %s is a subscription option */
+ (errmsg("logical replication apply worker for subscription \"%s\"
will restart so that %s can be enabled",
+ MySubscription->name, "failover")));

Those errors have multiple %s, so the translator's comment should say
"the 2nd %s is a..."

~~~

10.
 void
-UpdateTwoPhaseState(Oid suboid, char new_state)
+EnableTwoPhaseFailoverTriState(Oid suboid, bool enable_twophase,
+    bool enable_failover)

I felt the function name was a bit confusing. Maybe it is simpler to
call it like "EnableTriState" or "EnableSubTriState" -- the parameters
anyway specify what actual state(s) will be set.

======
src/backend/replication/logical/worker.c

11.
+ /* Update twophase and/or failover */
+ EnableTwoPhaseFailoverTriState(MySubscription->oid, twophase_pending,
+    failover_pending);
+ if (twophase_pending)
+ MySubscription->twophasestate = LOGICALREP_TWOPHASE_STATE_ENABLED;
+
+ if (failover_pending)
+ MySubscription->failoverstate = LOGICALREP_FAILOVER_STATE_ENABLED;

Can't you pass the MySubscription as a parameter and then the
EnableTwoPhaseFailoverTriState can also set these
LOGICALREP_TWOPHASE_STATE_ENABLED/LOGICALREP_FAILOVER_STATE_ENABLED
states within the Enable* function?

======
src/backend/replication/repl_gram.y

12.
 %token K_CREATE_REPLICATION_SLOT
 %token K_DROP_REPLICATION_SLOT
+%token K_ALTER_REPLICATION_SLOT

and

+ create_replication_slot drop_replication_slot
+ alter_replication_slot identify_system read_replication_slot
+ timeline_history show upload_manifest

and

  | create_replication_slot
  | drop_replication_slot
+ | alter_replication_slot

and

  | K_CREATE_REPLICATION_SLOT { $$ = "create_replication_slot"; }
  | K_DROP_REPLICATION_SLOT { $$ = "drop_replication_slot"; }
+ | K_ALTER_REPLICATION_SLOT { $$ = "alter_replication_slot"; }

etc.

~

Although it makes no difference IMO it is more natural to code
everything in the order: create, alter, drop.

======
src/backend/replication/repl_scanner.l

13.
 CREATE_REPLICATION_SLOT { return K_CREATE_REPLICATION_SLOT; }
 DROP_REPLICATION_SLOT { return K_DROP_REPLICATION_SLOT; }
+ALTER_REPLICATION_SLOT { return K_ALTER_REPLICATION_SLOT; }

and

  case K_CREATE_REPLICATION_SLOT:
  case K_DROP_REPLICATION_SLOT:
+ case K_ALTER_REPLICATION_SLOT:

Although it makes no difference IMO it is more natural to code
everything in the order: create, alter, drop.

======
src/backend/replication/slot.c

14.
+ if (SlotIsPhysical(MyReplicationSlot))
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot use %s with a physical replication slot",
+    "ALTER_REPLICATION_SLOT"));

/with a/for a/

======
src/backend/replication/walsender.c

15.
+static void
+ParseAlterReplSlotOptions(AlterReplicationSlotCmd *cmd, bool *failover)
+{
+ ListCell   *lc;
+ bool failover_given = false;
+
+ /* Parse options */
+ foreach(lc, cmd->options)
+ {
+ DefElem    *defel = (DefElem *) lfirst(lc);

AFAIK there are some new-style macros now you can use for this code.
e.g. foreach_ptr? See [1].

~~~

16.
+ if (strcmp(defel->defname, "failover") == 0)
+ {
+ if (failover_given)
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("conflicting or redundant options")));
+ failover_given = true;
+ *failover = defGetBoolean(defel);
+ }

The documented syntax showed that passing the boolean value for the
FAILOVER option is not mandatory. Does this code work if the boolean
value is not passed?

======
src/bin/psql/tab-complete.c

17.
I think "ALTER SUBSCRIPTION ... SET (failover)" is possible, but the
ALTER SUBSCRIPTION tab completion code is missing.

======
src/include/nodes/replnodes.h

18.
+/* ----------------------
+ * ALTER_REPLICATION_SLOT command
+ * ----------------------
+ */
+typedef struct AlterReplicationSlotCmd
+{
+ NodeTag type;
+ char    *slotname;
+ List    *options;
+} AlterReplicationSlotCmd;
+
+

Same as an earlier comment. Although it makes no difference IMO it is
more natural to define these structs in the order:
CreateReplicationSlotCmd, then AlterReplicationSlotCmd, then
DropReplicationSlotCmd.

======
.../t/050_standby_failover_slots_sync.pl

19.
+
+# Copyright (c) 2023, PostgreSQL Global Development Group
+

/2023/2024/

~~~

20.
+# Create another subscription (using the same slot created above) that enables
+# failover.
+$subscriber1->safe_psql(
+ 'postgres', qq[
+ CREATE TABLE tab_int (a int PRIMARY KEY);
+ CREATE SUBSCRIPTION regress_mysub1 CONNECTION '$publisher_connstr'
PUBLICATION regress_mypub WITH (slot_name = lsub1_slot,
copy_data=false, failover = true, create_slot = false);

The comment should not say "Create another subscription" because this
is the first subscription being created.

/another/a/

~~~

21.
+##################################################
+# Test if changing the failover property of a subscription updates the
+# corresponding failover property of the slot.
+##################################################

/Test if/Test that/

======
src/test/regress/sql/subscription.sql

22.
+CREATE SUBSCRIPTION regress_testsub CONNECTION
'dbname=regress_doesnotexist' PUBLICATION testpub WITH (connect =
false, failover = true);
+
+\dRs+

This is currently only testing the explicit "failover=true".

Maybe you can also test the other kinds work as expected:
- explicit "SET (failover=false)"
- explicit "SET (failover)" with no value specified

======
[1] https://github.com/postgres/postgres/commit/14dd0f27d7cd56ffae9ecdbe324965073d01a9ff

Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: Adding facility for injection points (or probe points?) for more advanced tests
Следующее
От: Jeremy Schneider
Дата:
Сообщение: Re: Built-in CTYPE provider