Re: Handle infinite recursion in logical replication setup

Поиск
Список
Период
Сортировка
От vignesh C
Тема Re: Handle infinite recursion in logical replication setup
Дата
Msg-id CALDaNm1ei=rRwCBKWtUu8b5OsS6FFcvaxg9h0oXcjgFn8GoZnQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Handle infinite recursion in logical replication setup  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
On Tue, Apr 5, 2022 at 6:21 AM Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are my comments for the latest patch v6-0001.
>
> (I will post my v6-0002 review comments separately)
>
> PATCH v6-0001 comments
> ======================
>
> 1.1 General - Option name
>
> I still feel like the option name is not ideal. Unfortunately, this is
> important because any name change would impact lots of these patch
> files and docs, struct members etc.
>
> It was originally called "local_only", but I thought that as a
> SUBSCRIPTION option that was confusing because "local" means local to
> what? Really it is local to the publisher, not local to the
> subscriber, so that name seemed misleading.
>
> Then I suggested "publish_local_only". Although that resolved the
> ambiguity problem, other people thought it seemed odd to have the
> "publish" prefix for a subscription-side option.
>
> So now it is changed again to "subscribe_local_only" -- It's getting
> better but still, it is implied that the "local" means local to the
> publisher except there is nothing in the option name really to convey
> that meaning. IMO we here all understand the meaning of this option
> mostly by familiarity with this discussion thread, but I think a user
> coming to this for the first time will still be confused by the name.
>
> Below are some other name ideas. Maybe they are not improvements, but
> it might help other people to come up with something better.
>
> subscribe_publocal_only = true/false
> origin = pub_only/any
> adjacent_data_only = true/false
> direct_subscriptions_only = true/false
> ...
>
>
> (FYI, the remainder of these review comments will assume the option is
> still called "subscribe_local_only")

Modified to local_only

> ~~~
>
> 1.2 General - inconsistent members and args
>
> IMO the struct members and args should also be named for close
> consistency with whatever the option name is.
>
> Currently the option is called "subscription_local_only".  So I think
> the members/args would be better to be called "local_only" instead of
> "only_local".

Modified

> ~~~
>
> 1.3 Commit message - wrong option name
>
> The commit message refers to the option name as "publish_local_only"
> instead of the option name that is currently implemented.

Modified

> ~~~
>
> 1.4 Commit message - wording
>
> The wording seems a bit off. Below is suggested simpler wording which
> I AFAIK conveys the same information.
>
> BEFORE
> Add an option publish_local_only which will subscribe only to the locally
> generated data in the publisher node. If subscriber is created with this
> option, publisher will skip publishing the data that was subscribed
> from other nodes. It can be created using following syntax:
> ex: CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=9999'
> PUBLICATION pub1 with (publish_local_only = on);
>
> SUGGESTION
> This patch adds a new SUBSCRIPTION boolean option
> "subscribe_local_only". The default is false. When a SUBSCRIPTION is
> created with this option enabled, the publisher will only publish data
> that originated at the publisher node.
> Usage:
> CREATE SUBSCRIPTION sub1 CONNECTION 'dbname =postgres port=9999'
> PUBLICATION pub1 with (subscribe_local_only = true);

Modified

> ~~~
>
> 1.5 doc/src/sgml/ref/create_subscription.sgml - "generated" changes.
>
> +         <para>
> +          Specifies whether the subscription will request the publisher to send
> +          locally generated changes or both the locally generated changes and
> +          the replicated changes that was generated from other nodes. The
> +          default is <literal>false</literal>.
> +         </para>
>
> For some reason, it seemed a bit strange to me to use the term
> "generated" changes. Maybe better to refer to the origin of changes?
>
> SUGGESTION
> Specifies whether the publisher should send only changes that
> originated locally at the publisher node, or send any publisher node
> changes regardless of their origin. The default is false.

Modified

> ~~~
>
> 1.6 src/backend/replication/pgoutput/pgoutput.c -
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM
>
> @@ -496,6 +509,12 @@ pgoutput_startup(LogicalDecodingContext *ctx,
> OutputPluginOptions *opt,
>   else
>   ctx->twophase_opt_given = true;
>
> + if (data->only_local && data->protocol_version <
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM)
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("requested proto_version=%d does not support
> subscribe_local_only, need %d or higher",
> + data->protocol_version, LOGICALREP_PROTO_TWOPHASE_VERSION_NUM)));
>
> I thought this code should not be using
> LOGICALREP_PROTO_TWOPHASE_VERSION_NUM. Shouldn't there be some newly
> introduced constant like LOGICALREP_PROTO_LOCALONLY_VERSION_NUM which
> you will use here?

Modified, I have set LOGICALREP_PROTO_LOCALONLY_VERSION_NUM to same
value as LOGICALREP_PROTO_TWOPHASE_VERSION_NUM, will increment this
once server version is changed.

> ~~~
>
> 1.7 src/bin/pg_dump/pg_dump.c - 150000
>
> @@ -4451,11 +4452,13 @@ getSubscriptions(Archive *fout)
>   if (fout->remoteVersion >= 150000)
>   appendPQExpBufferStr(query,
>   " s.subtwophasestate,\n"
> - " s.subdisableonerr\n");
> + " s.subdisableonerr,\n"
> + " s.sublocal\n");
>   else
>   appendPQExpBuffer(query,
>     " '%c' AS subtwophasestate,\n"
> -   " false AS subdisableonerr\n",
> +   " false AS subdisableonerr,\n"
> +   " false AS s.sublocal\n",
>     LOGICALREP_TWOPHASE_STATE_DISABLED);
>
> I think this local_only feature is unlikely to get into the PG15
> release, so this code should be split out into a separate condition
> because later will need to change to say >= 160000.

 I have split the condition and checked it with 150000, this will be
changed later to 160000 after the branch is created.

> ~~~
>
> 1.8 src/bin/pg_dump/pg_dump.c - dumpSubscription
>
> @@ -4585,6 +4591,9 @@ dumpSubscription(Archive *fout, const
> SubscriptionInfo *subinfo)
>   if (strcmp(subinfo->subdisableonerr, "t") == 0)
>   appendPQExpBufferStr(query, ", disable_on_error = true");
>
> + if (strcmp(subinfo->sublocal, "f") != 0)
> + appendPQExpBufferStr(query, ", subscribe_local_only = on");
> +
>
> I felt it is more natural to say "if it is true set to true", instead
> of "if it is not false set to on".
>
> SUGGESTION
> if (strcmp(subinfo->sublocal, "t") == 0)
> appendPQExpBufferStr(query, ", subscribe_local_only = true");

Modified

> ~~~
>
> 1.9 src/bin/psql/describe.c - 150000
>
> @@ -6318,9 +6318,11 @@ describeSubscriptions(const char *pattern, bool verbose)
>   if (pset.sversion >= 150000)
>   appendPQExpBuffer(&buf,
>     ", subtwophasestate AS \"%s\"\n"
> -   ", subdisableonerr AS \"%s\"\n",
> +   ", subdisableonerr AS \"%s\"\n"
> +   ", sublocal AS \"%s\"\n",
>     gettext_noop("Two phase commit"),
> -   gettext_noop("Disable on error"));
> +   gettext_noop("Disable on error"),
> +   gettext_noop("Only local"));
>
> I think this local_only feature is unlikely to get into the PG15
> release, so this code should be split out into a separate condition
> because later will need to change to say >= 160000.

I have split the condition and checked it with 150000, this will be
changed later to 160000 after the branch is created.

> ~~
>
> 1.10 src/bin/psql/describe.c - describeSubscriptions column
>
> @@ -6318,9 +6318,11 @@ describeSubscriptions(const char *pattern, bool verbose)
>   if (pset.sversion >= 150000)
>   appendPQExpBuffer(&buf,
>     ", subtwophasestate AS \"%s\"\n"
> -   ", subdisableonerr AS \"%s\"\n",
> +   ", subdisableonerr AS \"%s\"\n"
> +   ", sublocal AS \"%s\"\n",
>     gettext_noop("Two phase commit"),
> -   gettext_noop("Disable on error"));
> +   gettext_noop("Disable on error"),
> +   gettext_noop("Only local"));
>
> I think the column name here should be more consistent with the option
> name. e.g. it should be "Local only", not "Only local".

Modified

> ~~~
>
> 1.11 src/bin/psql/tab-complete.c - whitespace
>
> @@ -3167,7 +3167,7 @@ psql_completion(const char *text, int start, int end)
>   /* Complete "CREATE SUBSCRIPTION <name> ...  WITH ( <opt>" */
>   else if (HeadMatches("CREATE", "SUBSCRIPTION") && TailMatches("WITH", "("))
>   COMPLETE_WITH("binary", "connect", "copy_data", "create_slot",
> -   "enabled", "slot_name", "streaming",
> +   "enabled",  "slot_name", "streaming", "subscribe_local_only",
>
> The patch accidentally added a space char before the "slot_name".

Modified

> ~~~
>
> 1.12 src/include/replication/walreceiver.h - "generated"
>
> @@ -183,6 +183,7 @@ typedef struct
>   bool streaming; /* Streaming of large transactions */
>   bool twophase; /* Streaming of two-phase transactions at
>   * prepare time */
> + bool only_local; /* publish only locally generated data */
>
> This is a similar review comment as #1.5 about saying the word "generated".
> Maybe there is another way to word this?

Modified

> ~~~
>
> 1.13 src/test/regress/sql/subscription.sql - missing test case
>
> Isn't there a missing test case for ensuring that the new option is boolean?

Added test

Thanks for the comments, the attached v7 patch has the changes for the same.

Regards,
Vignesh

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: API stability [was: pgsql: Fix possible recovery trouble if TRUNCATE overlaps a checkpoint.]
Следующее
От: vignesh C
Дата:
Сообщение: Re: Handle infinite recursion in logical replication setup