Re: restarting logical replication after upgrading standby

Поиск
Список
Период
Сортировка
От Vijaykumar Jain
Тема Re: restarting logical replication after upgrading standby
Дата
Msg-id CAM+6J96j4qErattg-DiqRKFE80GFTn6ag7-Q2A7DKQX1UWUOGQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: restarting logical replication after upgrading standby  (Axel Rau <Axel.Rau@chaos1.de>)
Ответы Re: restarting logical replication after upgrading standby  (Vijaykumar Jain <vijaykumarjain.github@gmail.com>)
Re: restarting logical replication after upgrading standby  (Axel Rau <Axel.Rau@chaos1.de>)
Список pgsql-admin


On Sun, 6 Feb 2022 at 17:18, Axel Rau <Axel.Rau@chaos1.de> wrote:
Thanks for your answer, Vijay,

Perhaps I should mention, that logical replication was already running
prior to upgrade.
I just did a shutdown of the server at the subscriber (being upgraded).
I did nothing at the publisher.
The slot just went to inactive.


I just simulated an upgrade of subscriber (mine was 13 to 14 though).
after the upgrade, subscription just stops and is disabled.
looks like it gets disabled in the upgrade.
so i enable it first.
even though subscription is active, the relation which was in subscription was removed from pg_subscription_rel, 
so ofcourse it was not rcving updates from the publisher.
below is the initial walkthrough.



--------- post upgrade of subscriber
postgres=# select * from pg_stat_subscription;
 subid | subname | pid | relid | received_lsn | last_msg_send_time | last_msg_receipt_time | latest_end_lsn | latest_end_time
-------+---------+-----+-------+--------------+--------------------+-----------------------+----------------+-----------------
 16403 | mysub   |     |       |              |                    |                       |                |
(1 row)

postgres=# select * from pg_subscription;
  oid  | subdbid | subname | subowner | subenabled | subbinary | substream | subconninfo | subslotname | subsynccommit | subpublications
-------+---------+---------+----------+------------+-----------+-----------+-------------+-------------+---------------+-----------------
 16403 |   16401 | mysub   |       10 | f          | f         | f         | port=8001   | mysub       | off           | {mypub}
(1 row)

postgres=# select * from pg_subscription_rel;
 srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+----------
(0 rows)

# so i enable subscription ( but still relation not subscribing to changes )
postgres=# alter subscription mysub enable;
ALTER SUBSCRIPTION
postgres=# select * from pg_subscription;
  oid  | subdbid | subname | subowner | subenabled | subbinary | substream | subconninfo | subslotname | subsynccommit | subpublications
-------+---------+---------+----------+------------+-----------+-----------+-------------+-------------+---------------+-----------------
 16403 |   16401 | mysub   |       10 | t          | f         | f         | port=8001   | mysub       | off           | {mypub}
(1 row)

postgres=# select * from pg_stat_subscription;
postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_subscription;
-[ RECORD 1 ]---------+---------------------------------
subid                 | 16403
subname               | mysub
pid                   | 1726
relid                 |
received_lsn          | 0/15E37F0
last_msg_send_time    | 2022-02-07 00:11:31.623328+05:30
last_msg_receipt_time | 2022-02-07 00:11:31.623353+05:30
latest_end_lsn        | 0/15E37F0
latest_end_time       | 2022-02-07 00:11:31.623328+05:30

#but still getting no changes as relation not in subscription 
postgres=#  select * from pg_subscription_rel;
(0 rows)

postgres=# \dt
List of relations
-[ RECORD 1 ]----
Schema | public
Name   | t
Type   | table
Owner  | postgres

# so i try refresh publication, but that is like a re -sync, given the table already has data, it errors due to pkey conflict and aborts.
postgres=# alter subscription mysub refresh publication ;
ALTER SUBSCRIPTION
postgres=#  select * from pg_subscription_rel;
-[ RECORD 1 ]-----
srsubid    | 16403
srrelid    | 16384
srsubstate | d  ---- this is COPY
srsublsn   |

# so finally, i truncate the table and re-sync :(   and everything works, but i think i am missing something here. i am pretty sure we 
# can play with pg_replication_origin_advance or pg_replication_slot_advance to move the lsn to continue subscription without a sync, i'll have to spend some time to understand that
# but for now, in summary, subscription breaks in upgrade. After an upgrade, we need to enable it back and possibly refresh publication (re-sync) from scratch.
# but i am hoping i can be corrected by more experienced people here.

postgres=# truncate table t;
TRUNCATE TABLE
postgres=# alter subscription mysub refresh publication ;
ALTER SUBSCRIPTION
postgres=# table t;
 id
----
  1
  2
  3
(3 rows)

#table re-enabled and is ready
postgres=#  select * from pg_subscription_rel;
 srsubid | srrelid | srsubstate | srsublsn
---------+---------+------------+-----------
   16403 |   16384 | r          | 0/15E3E28
(1 row)



 
Should I have deleted the slot prior to upgrading?

Axel

Am 05.02.22 um 23:41 schrieb Vijaykumar Jain:
> On Sun, 6 Feb 2022 at 02:20, Axel Rau <Axel.Rau@chaos1.de> wrote:
>
>> Am 03.02.22 um 17:06 schrieb Axel Rau:
>>> Replication to another standby is working, but the slot of the upgraded
>>> standby is inactive.
>> Also the subscription has subenabled set to false.
>> Enabling it, turns on the active bit at the publisher side, but does not
>> start replication.
>> Recreating the subscription does not help either.
>>
>> There should be a solution as logical replication is documented to
>> support upgrade of the standby.
>>
>
> i just performed an upgrade using pg_upgrade link mode from pg13 to pg14.
>
> the only additional step as mentioned in the docs was to create the
> logical replication slot manually.
> subscription started working just fine.
>
>
> postgres@controller:/var/tmp/log$ pwd
> /var/tmp/log
> postgres@controller:/var/tmp/log$ /usr/lib/postgresql/14/bin/initdb -D
> /var/tmp/log/testdb
> postgres@controller:/var/tmp/log$ tail postgresql.conf >>
> testdb/postgresql.conf
> postgres@controller:/var/tmp/log$ tail testdb/postgresql.conf
> # Add settings for extensions here
> wal_level=logical
> archive_mode = on
> archive_command = '/bin/true'
> max_wal_size = 48MB
> min_wal_size = 32MB
> shared_buffers = 32MB
> port = 8001
> max_logical_replication_workers = 10
>
> postgres@controller:/var/tmp/log$  /usr/lib/postgresql/14/bin/pg_ctl -D
> testdb -l logfile start
> waiting for server to start.... done
> server started
> postgres@controller:/var/tmp/log$ psql -p 8001
> psql (14.1 (Ubuntu 14.1-2.pgdg20.04+1))
> Type "help" for help.
>
> postgres=# create table t(id int primary key);
> CREATE TABLE
> postgres=# create subscription mysub connection 'port=5001' publication
> mypub;
> NOTICE:  created replication slot "mysub" on publisher
> CREATE SUBSCRIPTION
> postgres=# table t;
>   id
> ----
>    1
>    2
> (2 rows)
>
> *-- after i do an upgrade it stop getting updates from publisher as no slot
> on pg14*
> *postgres=# select * from pg_stat_subscription;*
> * subid | subname | pid | relid | received_lsn | last_msg_send_time |
> last_msg_receipt_time | latest_end_lsn | latest_end_time*
> *-------+---------+-----+-------+--------------+--------------------+-----------------------+----------------+-----------------*
> * 16389 | mysub   |     |       |              |                    |
>                   |                |*
> *(1 row)*
>
> *-- but when i create the logical replication slot on pg14 (see below), it
> starts running just fine*
> *postgres=# select * from pg_stat_subscription;*
> *postgres=# \x*
> *Expanded display is on.*
> *postgres=# select * from pg_stat_subscription;*
> *-[ RECORD 1 ]---------+---------------------------------*
> *subid                 | 16389*
> *subname               | mysub*
> *pid                   | 6099*
> *relid                 |*
> *received_lsn          | 0/C1BC8E0*
> *last_msg_send_time    | 2022-02-06 03:59:07.272526+05:30*
> *last_msg_receipt_time | 2022-02-06 03:59:07.272627+05:30*
> *latest_end_lsn        | 0/C1BC8E0*
> *latest_end_time       | 2022-02-06 03:59:07.272526+05:30*
>
> *postgres=# table t;*
> *-[ RECORD 1 ]*
> *id | 1*
> *-[ RECORD 2 ]*
> *id | 2*
> *-[ RECORD 3 ]*
> *id | 3*
>
> -------------------------------------------------------------------------------------------------
> upgrade from pg13 to pg14
> postgres@controller:~/temp/13$ tail postgresql.conf
> # Add settings for extensions here
> wal_level=logical
> archive_mode = on
> archive_command = '/bin/true'
> max_wal_size = 48MB
> min_wal_size = 32MB
> shared_buffers = 32MB
> port = 5001
> max_logical_replication_workers = 10
>
> postgres@controller:~/temp/13$ initdb -D pg13
> postgres@controller:~/temp/13$ cp postgresql.conf pg13/postgresql.conf
> postgres@controller:~/temp/13$ pg_ctl -D pg13 -l logfile start
> waiting for server to start.... done
> server started
> postgres@controller:~/temp/13$ psql -p 5001
> psql (13.5 (Ubuntu 13.5-2.pgdg20.04+1))
> Type "help" for help.
>
> postgres=# create table t(id int primary key);
> CREATE TABLE
> postgres=# insert into t values (1);
> INSERT 0 1
> postgres=# insert into t values (2);
> INSERT 0 1
> postgres=# checkpoint;
> CHECKPOINT
> postgres=# create publication mypub for table t;
> CREATE PUBLICATION
> postgres=# \x
> Expanded display is on.
> postgres=# select * from pg_replication_slots;  -- note the details as we
> need to create on pg14 after upgrade
> -[ RECORD 1 ]-------+----------
> slot_name           | mysub
> plugin              | pgoutput
> slot_type           | logical
> datoid              | 13414
> database            | postgres
> temporary           | f
> active              | t
> active_pid          | 5332
> xmin                |
> catalog_xmin        | 489
> restart_lsn         | 0/15E7098
> confirmed_flush_lsn | 0/15E70D0
> wal_status          | reserved
> safe_wal_size       |
>
> postgres=# \q
> postgres@controller:~/temp/13$  pg_ctl -D pg13 -l logfile stop
> waiting for server to shut down.... done
> server stopped
>
>
> ---------------------------------------
> pg14
>
> postgres@controller:~/temp/14$ initdb -D /var/lib/postgresql/temp/14/pg14
> #perform upgrade
> postgres@controller:~/temp/14$
> /usr/lib/postgresql/${NEW_PSQL}/bin/pg_upgrade -b
> /usr/lib/postgresql/${OLD_PSQL}/bin/ -B
> /usr/lib/postgresql/${NEW_PSQL}/bin/  -d
> /var/lib/postgresql/temp/${OLD_PSQL}/pg${OLD_PSQL} -D
> /var/lib/postgresql/temp//${NEW_PSQL}/pg${NEW_PSQL} -o "-c
> config-file=/var/lib/postgresql/temp/${OLD_PSQL}/pg${OLD_PSQL}/postgresql.conf"
> -O "-c
> config-file=/var/lib/postgresql/temp/${NEW_PSQL}/pg${NEW_PSQL}/postgresql.conf"
> --link
> Performing Consistency Checks
> -----------------------------
> Checking cluster versions                                   ok
> Checking database user is the install user                  ok
> Checking database connection settings                       ok
> Checking for prepared transactions                          ok
> Checking for system-defined composite types in user tables  ok
> Checking for reg* data types in user tables                 ok
> Checking for contrib/isn with bigint-passing mismatch       ok
> Checking for user-defined encoding conversions              ok
> Checking for user-defined postfix operators                 ok
> Creating dump of global objects                             ok
> Creating dump of database schemas
>                                                              ok
> Checking for presence of required libraries                 ok
> Checking database user is the install user                  ok
> Checking for prepared transactions                          ok
> Checking for new cluster tablespace directories             ok
>
> If pg_upgrade fails after this point, you must re-initdb the
> new cluster before continuing.
>
> Performing Upgrade
> ------------------
> Analyzing all rows in the new cluster                       ok
> Freezing all rows in the new cluster                        ok
> Deleting files from new pg_xact                             ok
> Copying old pg_xact to new server                           ok
> Setting oldest XID for new cluster                          ok
> Setting next transaction ID and epoch for new cluster       ok
> Deleting files from new pg_multixact/offsets                ok
> Copying old pg_multixact/offsets to new server              ok
> Deleting files from new pg_multixact/members                ok
> Copying old pg_multixact/members to new server              ok
> Setting next multixact ID and offset for new cluster        ok
> Resetting WAL archives                                      ok
> Setting frozenxid and minmxid counters in new cluster       ok
> Restoring global objects in the new cluster                 ok
> Restoring database schemas in the new cluster
>                                                              ok
> Adding ".old" suffix to old global/pg_control               ok
>
> If you want to start the old cluster, you will need to remove
> the ".old" suffix from
> /var/lib/postgresql/temp/13/pg13/global/pg_control.old.
> Because "link" mode was used, the old cluster cannot be safely
> started once the new cluster has been started.
>
> Linking user relation files
>                                                              ok
> Setting next OID for new cluster                            ok
> Sync data directory to disk                                 ok
> Creating script to delete old cluster                       ok
> Checking for extension updates                              ok
>
> Upgrade Complete
> ----------------
> Optimizer statistics are not transferred by pg_upgrade.
> Once you start the new server, consider running:
>      /usr/lib/postgresql/14/bin/vacuumdb --all --analyze-in-stages
>
> Running this script will delete the old cluster's data files:
>      ./delete_old_cluster.sh
>
> #copy the conf file
> postgres@controller:~/temp/14$ cp postgresql.conf pg14/
> postgres@controller:~/temp/14$ pg_ctl -D pg14 -l logfile start
> waiting for server to start.... done
> server started
>
> postgres@controller:~/temp/14$  /usr/lib/postgresql/14/bin/vacuumdb -p 5001
> --all --analyze-in-stages
> vacuumdb: processing database "postgres": Generating minimal optimizer
> statistics (1 target)
> vacuumdb: processing database "template1": Generating minimal optimizer
> statistics (1 target)
> vacuumdb: processing database "postgres": Generating medium optimizer
> statistics (10 targets)
> vacuumdb: processing database "template1": Generating medium optimizer
> statistics (10 targets)
> vacuumdb: processing database "postgres": Generating default (full)
> optimizer statistics
> vacuumdb: processing database "template1": Generating default (full)
> optimizer statistics
> postgres@controller:~/temp/14$ psql -p 5001
>
>
> ### this is the only post action after upgrade i had to do (since you would
> use ansible, you can automate this my creating slots manually or via if
> exists and not via create subscription
> psql (14.1 (Ubuntu 14.1-2.pgdg20.04+1))
> Type "help" for help.
>
> postgres=# SELECT * FROM pg_create_logical_replication_slot('mysub',
> 'pgoutput', false, true);
>   slot_name |    lsn
> -----------+-----------
>   mysub     | 0/C19E358
> (1 row)
>
> postgres=# select * from pg_replication_slots;
> postgres=# \x
> Expanded display is on.
> postgres=# select * from pg_replication_slots;
> -[ RECORD 1 ]-------+----------
> slot_name           | mysub
> plugin              | pgoutput
> slot_type           | logical
> datoid              | 16401
> database            | postgres
> temporary           | f
> active              | t
> active_pid          | 6100
> xmin                |
> catalog_xmin        | 760
> restart_lsn         | 0/C19E320
> confirmed_flush_lsn | 0/C19E358
> wal_status          | reserved
> safe_wal_size       |
> two_phase           | t
>
>
> --- the below changes get pushed to subscriber just fine. i did not create
> publication.
> postgres=# insert into t values (3);
> INSERT 0 1
> postgres=# \q
> postgres@controller:~/temp/14$ pg_ctl -D pg14  -l logfile stop
> waiting for server to shut down.... done
> server stopped
>
>


--
PGP-Key: CDE74120  ☀  computing @ chaos claudius


--

В списке pgsql-admin по дате отправления:

Предыдущее
От: "David G. Johnston"
Дата:
Сообщение: Re: regarding remote connection
Следующее
От: Vijaykumar Jain
Дата:
Сообщение: Re: restarting logical replication after upgrading standby