Re: pg_upgrade and logical replication

Поиск
Список
Период
Сортировка
От vignesh C
Тема Re: pg_upgrade and logical replication
Дата
Msg-id CALDaNm0ogWfN234ybxZvY4+DHJpOAc_Kac51Oc0PQPG0Vh4ZOA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: pg_upgrade and logical replication  (Peter Smith <smithpb2250@gmail.com>)
Ответы Re: pg_upgrade and logical replication  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
On Mon, 6 Nov 2023 at 07:51, Peter Smith <smithpb2250@gmail.com> wrote:
>
> Here are some review comments for patch v11-0001
>
> ======
> Commit message
>
> 1.
> The subscription's replication origin are needed to ensure
> that we don't replicate anything twice.
>
> ~
>
> /are needed/is needed/

Modified

>
> 2.
> Author: Julien Rouhaud
> Reviewed-by: FIXME
> Discussion: https://postgr.es/m/20230217075433.u5mjly4d5cr4hcfe@jrouhaud
>
> ~
>
> Include Vignesh as another author.

Modified

> ======
> doc/src/sgml/ref/pgupgrade.sgml
>
> 3.
> +     <application>pg_upgrade</application> attempts to migrate subscription
> +     dependencies which includes the subscription tables information present in
> +     <link linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>
> +     system table and the subscription replication origin which
> +     will help in continuing logical replication from where the old subscriber
> +     was replicating. This helps in avoiding the need for setting up the
>
> I became a bit lost reading paragraph due to the multiple 'which'...
>
> SUGGESTION
> pg_upgrade attempts to migrate subscription dependencies which
> includes the subscription table information present in
> pg_subscription_rel system
> catalog and also the subscription replication origin. This allows
> logical replication on the new subscriber to continue from where the
> old subscriber was up to.

Modified

> ~~~
>
> 4.
> +     was replicating. This helps in avoiding the need for setting up the
> +     subscription objects manually which requires truncating all the
> +     subscription tables and setting the logical replication slots. Migration
>
> SUGGESTION
> Having the ability to migrate subscription objects avoids the need to
> set them up manually, which would require truncating all the
> subscription tables and setting the logical replication slots.

I have removed this

> ~
>
> TBH, I am wondering what is the purpose of this sentence. It seems
> more like a justification for the patch, but does the user need to
> know all this?
>
> ~~~
>
> 5.
> +      <para>
> +       All the subscription tables in the old subscriber should be in
> +       <literal>i</literal> (initialize), <literal>r</literal> (ready) or
> +       <literal>s</literal> (synchronized). This can be verified by checking
> +       <link
linkend="catalog-pg-subscription-rel">pg_subscription_rel</link>.<structfield>srsubstate</structfield>.
> +      </para>
>
> /should be in/should be in state/

Modified

> ~~~
>
> 6.
> +      <para>
> +       The replication origin entry corresponding to each of the subscriptions
> +       should exist in the old cluster. This can be checking
> +       <link linkend="catalog-pg-subscription">pg_subscription</link> and
> +       <link linkend="catalog-pg-replication-origin">pg_replication_origin</link>
> +       system tables.
> +      </para>
>
> missing words?
>
> /This can be checking/This can be found by checking/

Modified

> ~~~
>
> 7.
> +    <para>
> +     The subscriptions will be migrated to new cluster in disabled state, they
> +     can be enabled after upgrade by following the steps:
> +    </para>
>
> The first bullet also says "Enable the subscription..." so I think
> this paragraph should be worded like the below.
>
> SUGGESTION
> The subscriptions will be migrated to the new cluster in a disabled
> state. After migration, do this:

Modified

> ======
> src/backend/catalog/pg_subscription.c
>
> 8.
>  #include "nodes/makefuncs.h"
> +#include "replication/origin.h"
> +#include "replication/worker_internal.h"
>  #include "storage/lmgr.h"
>
> Why does this change need to be in the patch when there are no other
> code changes in this file?

Modified

> ======
> src/backend/utils/adt/pg_upgrade_support.c
>
> 9. binary_upgrade_create_sub_rel_state
>
> IMO a better name for this function would be
> 'binary_upgrade_add_sub_rel_state' (because it delegates to
> AddSubscriptionRelState).
>
> Then it would obey the same name pattern as the other function
> 'binary_upgrade_replorigin_advance' (which delegates to
> replorigin_advance).

Modified

> ~~~
>
> 10.
> +/*
> + * binary_upgrade_create_sub_rel_state
> + *
> + * Add the relation with the specified relation state to pg_subscription_rel
> + * table.
> + */
> +Datum
> +binary_upgrade_create_sub_rel_state(PG_FUNCTION_ARGS)
> +{
> + Relation rel;
> + HeapTuple tup;
> + Oid subid;
> + Form_pg_subscription form;
> + char    *subname;
> + Oid relid;
> + char relstate;
> + XLogRecPtr sublsn;
>
> 10a.
> /to pg_subscription_rel table./to pg_subscription_rel catalog./

Modified

> ~
>
> 10b.
> Maybe it would be helpful if the function argument were documented
> up-front in the function-comment, or in the variable declarations.
>
> SUGGESTION
> char      *subname;  /* ARG0 = subscription name */
> Oid        relid;    /* ARG1 = relation Oid */
> char       relstate; /* ARG2 = subrel state */
> XLogRecPtr sublsn;   /* ARG3 (optional) = subscription lsn */

I felt the variables are self explainatory in this case and also
consistent with other functions.

> ~~~
>
> 11.
> if (PG_ARGISNULL(3))
> sublsn = InvalidXLogRecPtr;
> else
> sublsn = PG_GETARG_LSN(3);
> FWIW, I'd write that as a one-line ternary assignment allowing all the
> args to be grouped nicely together.
>
> SUGGESTION
> sublsn = PG_ARGISNULL(3) ? InvalidXLogRecPtr : PG_GETARG_LSN(3);

Modified

> ~~~
>
> 12. binary_upgrade_replorigin_advance
>
> /*
>  * binary_upgrade_replorigin_advance
>  *
>  * Update the remote_lsn for the subscriber's replication origin.
>  */
> Datum
> binary_upgrade_replorigin_advance(PG_FUNCTION_ARGS)
> {
> Relation rel;
> HeapTuple tup;
> Oid subid;
> Form_pg_subscription form;
> char    *subname;
> XLogRecPtr sublsn;
> char originname[NAMEDATALEN];
> RepOriginId originid;
> ~
>
> Similar to previous comment #10b. Maybe it would be helpful if the
> function argument were documented up-front in the function-comment, or
> in the variable declarations.
>
> SUGGESTION
> char         originname[NAMEDATALEN];
> RepOriginId  originid;
> char        *subname; /* ARG0 = subscription name */
> XLogRecPtr   sublsn;  /* ARG1 = subscription lsn */

I felt the variables are self explainatory in this case and also
consistent with other functions.

> ~~~
>
> 13.
> + subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
> +
> + if (PG_ARGISNULL(1))
> + sublsn = InvalidXLogRecPtr;
> + else
> + sublsn = PG_GETARG_LSN(1);
>
> Similar to previous comment #11. FWIW, I'd write that as a one-line
> ternary assignment allowing all the args to be grouped nicely
> together.
>
> SUGGESTION
> subname = text_to_cstring(PG_GETARG_TEXT_PP(0));
> sublsn = PG_ARGISNULL(1) ? InvalidXLogRecPtr : PG_GETARG_LSN(1);

Modified

> ======
> src/bin/pg_dump/pg_dump.c
>
> 14. getSubscriptionTables
>
> +/*
> + * getSubscriptionTables
> + *   get information about subscription membership for dumpable tables, this
> + *    will be used only in binary-upgrade mode.
> + */
>
> Should use multiple sentences.
>
> SUGGESTION
> Get information about subscription membership for dumpable tables.
> This will be used only in binary-upgrade mode.

Modified

> ~~~
>
> 15.
> + /* Get subscription relation fields */
> + i_srsubid = PQfnumber(res, "srsubid");
> + i_srrelid = PQfnumber(res, "srrelid");
> + i_srsubstate = PQfnumber(res, "srsubstate");
> + i_srsublsn = PQfnumber(res, "srsublsn");
>
> Might it be better to say "Get pg_subscription_rel attributes"?

Modified

> ~~~
>
> 16. getSubscriptions
>
> + appendPQExpBufferStr(query, "o.remote_lsn\n");
>   appendPQExpBufferStr(query,
>   "FROM pg_subscription s\n"
> + "LEFT JOIN pg_replication_origin_status o \n"
> + "    ON o.external_id = 'pg_' || s.oid::text \n"
>   "WHERE s.subdbid = (SELECT oid FROM pg_database\n"
>   "                   WHERE datname = current_database())");
>
> ~
>
> 16a.
> Should that "remote_lsn" have an alias like "suboriginremotelsn" so
> that it matches the later field assignment better?

Modified

> ~
>
> 16b.
> Probably these catalogs should be qualified using "pg_catalog.".

Modified

> ~~~
>
> 17. dumpSubscriptionTable
>
> +/*
> + * dumpSubscriptionTable
> + *   dump the definition of the given subscription table mapping, this will be
> + *    used only for upgrade operation.
> + */
>
> Make this comment consistent with the other one for getSubscriptionTables:
> - split into multiple sentences
> - use the same terminology "binary-upgrade mode" versus "upgrade operation'.

Modified

> ~~~
>
> 18.
> + /*
> + * binary_upgrade_create_sub_rel_state will add the subscription
> + * relation to pg_subscripion_rel table, this is supported only for
> + * upgrade operation.
> + */
>
> Split into multiple sentences.

Modified

> ======
> src/bin/pg_dump/pg_dump_sort.c
>
> 19.
> + case DO_SUBSCRIPTION_REL:
> + snprintf(buf, bufsize,
> + "SUBSCRIPTION TABLE (ID %d)",
> + obj->dumpId);
> + return;
>
> Should it include the OID (like for DO PUBLICATION_TABLE)?

Modified

> ======
> src/bin/pg_upgrade/check.c
>
> 20.
>   check_for_reg_data_type_usage(&old_cluster);
>   check_for_isn_and_int8_passing_mismatch(&old_cluster);
>
> + check_for_subscription_state(&old_cluster);
> +
>
> There seems no reason anymore for this check to be separated from all
> the other checks. Just remove the blank line.

Modified

> ~~~
>
> 21. check_for_subscription_state
>
> +/*
> + * check_for_subscription_state()
> + *
> + * Verify that each of the subscriptions have all their corresponding tables in
> + * ready state.
> + */
> +static void
> +check_for_subscription_state(ClusterInfo *cluster)
>
> /have/has/
>
> This comment only refers to 'ready' state, but perhaps it is
> misleading (or not entirely correct) because later the SQL is testing
> for more than just the READY state:
>
> + "WHERE srsubstate NOT IN ('i', 's', 'r') "

Modified

> ~~~
>
> 22.
> + res = executeQueryOrDie(conn,
> + "SELECT s.subname, c.relname, n.nspname "
> + "FROM pg_catalog.pg_subscription_rel r "
> + "LEFT JOIN pg_catalog.pg_subscription s"
> + " ON r.srsubid = s.oid "
> + "LEFT JOIN pg_catalog.pg_class c"
> + " ON r.srrelid = c.oid "
> + "LEFT JOIN pg_catalog.pg_namespace n"
> + " ON c.relnamespace = n.oid "
> + "WHERE srsubstate NOT IN ('i', 's', 'r') "
> + "ORDER BY s.subname");
>
> If you are going to check 'i', 's', and 'r' then I thought this
> statement should maybe have some comment about why those states.

Modified

> ~~~
>
> 23.
> + pg_fatal("Your installation contains subscription(s) with\n"
> + "Subscription not having origin and/or subscription relation(s) not
> in ready state.\n"
> + "A list of subscription not having origin and/or\n"
> + "subscription relation(s) not in ready state is in the file: %s",
> + output_path);
>
> 23a.
> This message seems to just be saying the same thing 2 times.
>
> Is also should use newlines and spaces more like the other similar
> pg_patals in this file (e.g. the %s is on next line etc).
>
> SUGGESTION
> Your installation contains subscriptions without origin or having
> relations not in a ready state.\n
> A list of the problem subscriptions is in the file:\n
>     %s

Modified

> ~
>
> 23b.
> Same question about 'not in ready state'. Is that entirely correct?

Modified

> ======
> src/bin/pg_upgrade/t/004_subscription.pl
>
> 24.
> +sub insert_line
> +{
> + my $payload = shift;
> +
> + foreach ("t1", "t2")
> + {
> + $publisher->safe_psql('postgres',
> + "INSERT INTO " . $_ . " (val) VALUES('$payload')");
> + }
> +}
>
> For clarity, maybe call this function 'insert_line_at_pub'

Modified

> ~~~
>
> 25.
> +# ------------------------------------------------------
> +# Check that pg_upgrade is succesful when all tables are in ready state.
> +# ------------------------------------------------------
>
> /succesful/successful/

Modified

> ~~~
>
> 26.
> +command_ok(
> + [
> + 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
> + '-D',         $new_sub->data_dir, '-b', $bindir,
> + '-B',         $bindir,            '-s', $new_sub->host,
> + '-p',         $old_sub->port,     '-P', $new_sub->port,
> + $mode,        '--check',
> + ],
> + 'run of pg_upgrade --check for old instance with invalid remote_lsn');
>
> This is the command for the "success" case. Why is the message part
> referring to "invalid remote_lsn"?

Modified

> ~~~
>
> 27.
> +$publisher->safe_psql('postgres',
> + "CREATE TABLE tab_primary_key(id serial, val text);");
> +$old_sub->safe_psql('postgres',
> + "CREATE TABLE tab_primary_key(id serial PRIMARY KEY, val text);");
> +$publisher->safe_psql('postgres',
>
>
> Maybe it is not necessary, but won't it be better if the publisher
> table also has a primary key (so DDL matches its table name)?

Modified

> ~~~
>
> 28.
> +# Add a row in subscriber so that the table sync will fail.
> +$old_sub->safe_psql('postgres',
> + "INSERT INTO tab_primary_key values(1, 'before initial sync')");
>
> The comment should be slightly more descriptive by saying the reason
> it will fail is that you deliberately inserted the same PK value
> again.

Modified

> ~~~
>
> 29.
> +my $started_query =
> +  "SELECT count(1) = 1 FROM pg_subscription_rel WHERE srsubstate = 'd';";
> +$old_sub->poll_query_until('postgres', $started_query)
> +  or die "Timed out while waiting for subscriber to synchronize data";
>
> Since this cannot synchronize the table data, maybe the message should
> be more like "Timed out while waiting for the table state to become
> 'd' (datasync)"

Modified

> ~~~
>
> 30.
> +command_fails(
> + [
> + 'pg_upgrade', '--no-sync',        '-d', $old_sub->data_dir,
> + '-D',         $new_sub->data_dir, '-b', $bindir,
> + '-B',         $bindir,            '-s', $new_sub->host,
> + '-p',         $old_sub->port,     '-P', $new_sub->port,
> + $mode,        '--check',
> + ],
> + 'run of pg_upgrade --check for old instance with incorrect sub rel');
>
> /with incorrect sub rel/with incorrect sub rel state/ (??)

Modified

> ~~~
>
> 31.
> +# ------------------------------------------------------
> +# Check that pg_upgrade doesn't detect any problem once all the subscription's
> +# relation are in 'r' (ready) state.
> +# ------------------------------------------------------
>
>
> 31a.
> /relation/relations/
>

I have removed this comment

>
> 31b.
> Do you think that comment is correct? All you are doing here is
> allowing the old_sub to proceed because there is no longer any
> conflict -- but isn't that just normal pub/sub behaviour that has
> nothing to do with pg_upgrade?

I have removed this comment

> ~~~
>
> 32.
> +# Stop the old subscriber, insert a row in each table while it's down and add
> +# t2 to the publication
>
> /in each table/in each publisher table/
>
> Also, it is not each table -- it's only t1 and t2; not tab_primary_key.

Modified

> ~~~
>
> 33.
> +  $new_sub->safe_psql('postgres', "SELECT count(*) FROM pg_subscription_rel");
> +is($result, qq(2), "There should be 2 rows in pg_subscription_rel");
>
> /2 rows in pg_subscription_rel/2 rows in pg_subscription_rel
> (representing t1 and tab_primary_key)/

Modified

> ======
>
> 34. binary_upgrade_create_sub_rel_state
>
> +{ oid => '8404', descr => 'for use by pg_upgrade (relation for
> pg_subscription_rel)',
> +  proname => 'binary_upgrade_create_sub_rel_state', proisstrict => 'f',
> +  provolatile => 'v', proparallel => 'u', prorettype => 'void',
> +  proargtypes => 'text oid char pg_lsn',
> +  prosrc => 'binary_upgrade_create_sub_rel_state' },
>
> As mentioned in a previous review comment #9, I felt this function
> should have a different name: binary_upgrade_add_sub_rel_state.

Modified

Thanks for the comments, the attached v12 version patch has the
changes for the same.

Regards,
Vignesh

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Shlok Kyal
Дата:
Сообщение: Re: [PoC] Implementation of distinct in Window Aggregates: take two
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Syncrep and improving latency due to WAL throttling