RE: [PoC] pg_upgrade: allow to upgrade publisher node

Поиск
Список
Период
Сортировка
От Hayato Kuroda (Fujitsu)
Тема RE: [PoC] pg_upgrade: allow to upgrade publisher node
Дата
Msg-id TYAPR01MB5866AC50DCFCBC989D67E6BDF5F1A@TYAPR01MB5866.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Peter Smith <smithpb2250@gmail.com>)
Ответы Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
Dear Peter,

Thank you for reviewing! PSA new version.

> src/backend/replication/slot.c
> 
> 3. InvalidatePossiblyObsoleteSlot
> 
> + /*
> + * Raise an ERROR if the logical replication slot is invalidating. It
> + * would not happen because max_slot_wal_keep_size is set to -1 during
> + * the upgrade, but it stays safe.
> + */
> + if (*invalidated && SlotIsLogical(s) && IsBinaryUpgrade)
> + elog(ERROR, "Replication slots must not be invalidated during the upgrade.");
> 
> 3a.
> That comment didn't seem good. I think you mean like in the suggestion below.
> 
> SUGGESTION
> It should not be possible for logical replication slots to be
> invalidated because max_slot_wal_keep_size is set to -1 during the
> upgrade. The following is just for sanity-checking.

This part was updated in v35. Please tell me if current version is still bad...

> 3b.
> I wasn't sure if 'max_slot_wal_keep_size' GUC is accessible in this
> scope, but if it is available then maybe
> Assert(max_slot_wal_keep_size_mb == -1); should also be included in
> this sanity check.

IIUC, guc parameters are visible from all the postgres processes.
Added.

> src/bin/pg_upgrade/check.c
> 
> 4. check_new_cluster_logical_replication_slots
> 
> + conn = connectToServer(&new_cluster, "template1");
> +
> + prep_status("Checking for logical replication slots");
> 
> There is some inconsistency with all the subsequent pg_fatals within
> this function -- some of them mention "New cluster" but most of them
> do not.
> 
> Meanwhile, Kuroda-san showed me sample output like:
> 
> Checking for presence of required libraries                   ok
> Checking database user is the install user                    ok
> Checking for prepared transactions                            ok
> Checking for new cluster tablespace directories               ok
> Checking for logical replication slots
> New cluster must not have logical replication slots but found 1 slot.
> Failure, exiting
> 
> So, I felt the log message title ("Checking...") should be changed to
> include the words "new cluster" just like the log preceding it:
> 
> "Checking for logical replication slots" ==> "Checking for new cluster
> logical replication slots"
> 
> Now all the subsequent pg_fatals clearly are for "new cluster"

Changed.

> 5. check_new_cluster_logical_replication_slots
> 
> + if (nslots_on_new)
> + pg_fatal(ngettext("New cluster must not have logical replication
> slots but found %d slot.",
> +   "New cluster must not have logical replication slots but found %d slots.",
> +   nslots_on_new),
> + nslots_on_new);
> 
> 5a.
> TBH, I didn't see why you go to unnecessary trouble to have a plural
> message here. The message could just be like:
> "New cluster must have 0 logical replication slots but found %d."
> 
> ~
> 
> 5b.
> However, now (from the previous review comment #4) if "New cluster" is
> already explicit in the log, the pg_fatal message can become just:
> "New cluster must have ..." ==> "Expected 0 logical replication slots
> but found %d."

Basically it's better. But the initial character should be lower case and period
is not needed. Modified like that.

> 9. get_old_cluster_logical_slot_infos
> 
> + i_slotname = PQfnumber(res, "slot_name");
> + i_plugin = PQfnumber(res, "plugin");
> + i_twophase = PQfnumber(res, "two_phase");
> + i_caught_up = PQfnumber(res, "caught_up");
> + i_invalid = PQfnumber(res, "conflicting");
> 
> IMO SQL should be using an alias for this column, so you can say:
> i_invalid = PQfnumber(res, "invalid")
> 
> which seems better than switching the wording in code.

Modified. The argument of PQfnumber() must be same as the column name, so the
word "as invalid" was added to SQL.

> src/bin/pg_upgrade/pg_upgrade.h
> 
> 10. LogicalSlotInfo
> 
> +typedef struct
> +{
> + char    *slotname; /* slot name */
> + char    *plugin; /* plugin */
> + bool two_phase; /* can the slot decode 2PC? */
> + bool caught_up; /* Is confirmed_flush_lsn the same as latest
> + * checkpoint LSN? */
> + bool invalid; /* Is the slot usable? */
> +} LogicalSlotInfo;
> 
> ~
> 
> + bool invalid; /* Is the slot usable? */
> This field name and comment have opposite meanings. Invalid means NOT usable.
> 
> SUGGESTION
> /* If true, the slot is unusable. */

Fixed.

> src/bin/pg_upgrade/server.c
> 
> 11. start_postmaster
> 
>   * we only modify the new cluster, so only use it there.  If there is a
>   * crash, the new cluster has to be recreated anyway.  fsync=off is a big
>   * win on ext4.
> + *
> + * Also, the max_slot_wal_keep_size is set to -1 to prevent the WAL removal
> + * required by logical slots. The setting could avoid the invalidation of
> + * slots during the upgrade.
>   */
> ~
> 
> IMO this comment "to prevent the WAL removal required by logical
> slots" is ambiguous about how it could be interpreted.  Needs
> rearranging for clarity.

The description was changed. How do you think?

> 12. start_postmaster
> 
>   (cluster == &new_cluster) ?
> - " -c synchronous_commit=off -c fsync=off -c full_page_writes=off" : "",
> + " -c synchronous_commit=off -c fsync=off -c full_page_writes=off -c
> max_slot_wal_keep_size=-1 " :
> + " -c max_slot_wal_keep_size=-1",
> 
> Instead of putting the same option on both sides of the ternary, I was
> wondering if it might be better to hardwire the max_slot_wal_keep_size
> just 1 time in the format string?

Fixed.

> .../pg_upgrade/t/003_logical_replication_slots.pl
> 
> 13.
> # Remove the remained slot
> 
> /remained/remaining/

Fixed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrey Lepikhov
Дата:
Сообщение: Re: Removing unneeded self joins
Следующее
От: "Hayato Kuroda (Fujitsu)"
Дата:
Сообщение: RE: [PoC] pg_upgrade: allow to upgrade publisher node