Re: persist logical slots to disk during shutdown checkpoint
От | vignesh C |
---|---|
Тема | Re: persist logical slots to disk during shutdown checkpoint |
Дата | |
Msg-id | CALDaNm3jY-xiDhEoJwTYHiBWiY+4DpNjZbddNozE7ZyOXKLvFg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: persist logical slots to disk during shutdown checkpoint (vignesh C <vignesh21@gmail.com>) |
Список | pgsql-hackers |
On Fri, 25 Aug 2023 at 17:40, vignesh C <vignesh21@gmail.com> wrote: > > On Sat, 19 Aug 2023 at 11:53, Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > It's entirely possible for a logical slot to have a confirmed_flush > > LSN higher than the last value saved on disk while not being marked as > > dirty. It's currently not a problem to lose that value during a clean > > shutdown / restart cycle but to support the upgrade of logical slots > > [1] (see latest patch at [2]), we seem to rely on that value being > > properly persisted to disk. During the upgrade, we need to verify that > > all the data prior to shudown_checkpoint for the logical slots has > > been consumed, otherwise, the downstream may miss some data. Now, to > > ensure the same, we are planning to compare the confirm_flush LSN > > location with the latest shudown_checkpoint location which means that > > the confirm_flush LSN should be updated after restart. > > > > I think this is inefficient even without an upgrade because, after the > > restart, this may lead to decoding some data again. Say, we process > > some transactions for which we didn't send anything downstream (the > > changes got filtered) but the confirm_flush LSN is updated due to > > keepalives. As we don't flush the latest value of confirm_flush LSN, > > it may lead to processing the same changes again. > > I was able to test and verify that we were not processing the same > changes again. > Note: The 0001-Add-logs-to-skip-transaction-filter-insert-operation.patch > has logs to print if a decode transaction is skipped and also a log to > mention if any operation is filtered. > The test.sh script has the steps for a) setting up logical replication > for a table b) perform insert on table that need to be published (this > will be replicated to the subscriber) c) perform insert on a table > that will not be published (this insert will be filtered, it will not > be replicated) d) sleep for 5 seconds e) stop the server f) start the > server > I used the following steps, do the following in HEAD: > a) Apply 0001-Add-logs-to-skip-transaction-filter-insert-operation.patch > patch in Head and build the binaries b) execute test.sh c) view N1.log > file to see that the insert operations were filtered again by seeing > the following logs: > LOG: Filter insert for table tbl2 > ... > ===restart=== > ... > LOG: Skipping transaction 0/156AD10 as start decode at is greater 0/156AE40 > ... > LOG: Filter insert for table tbl2 > > We can see that the insert operations on tbl2 which was filtered > before server was stopped is again filtered after restart too in HEAD. > > Lets see that the same changes were not processed again with patch: > a) Apply v4-0001-Persist-to-disk-logical-slots-during-a-shutdown-c.patch > from [1] also apply > 0001-Add-logs-to-skip-transaction-filter-insert-operation.patch patch > and build the binaries b) execute test.sh c) view N1.log file to see > that the insert operations were skipped after restart of server by > seeing the following logs: > LOG: Filter insert for table tbl2 > ... > ===restart=== > ... > Skipping transaction 0/156AD10 as start decode at is greater 0/156AFB0 > ... > Skipping transaction 0/156AE80 as start decode at is greater 0/156AFB0 > > We can see that the insert operations on tbl2 are not processed again > after restart with the patch. Here is another way to test using pg_replslotdata approach that was proposed earlier at [1]. I have rebased this on top of HEAD and the v5 version for the same is attached. We can use the same test as test.sh shared at [2]. When executed with HEAD, it was noticed that confirmed_flush points to WAL location before both the transaction: slot_name slot_type datoid persistency xmin catalog_xmin restart_lsn confirmed_flush two_phase_at two_phase plugin --------- --------- ------ ---------- ---- ----------- ----------- --------------- ------------ --------- ------ sub logical 5 persistent 0 735 0/1531E28 0/1531E60 0/0 0 pgoutput WAL record information generated using pg_walinspect for various records at and after confirmed_flush WAL 0/1531E60: row_number | start_lsn | end_lsn | prev_lsn | xid | resource_manager | record_type | record_length | main_data_length | fpi_length | description | block_ref ------------+-----------+-----------+-----------+-----+------------------+---------------------+---------------+------------------+------------+------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------- 1 | 0/1531E60 | 0/1531EA0 | 0/1531E28 | 0 | Heap2 | PRUNE | 57 | 9 | 0 | snapshotConflictHorizon: 0, nredirected: 0, ndead: 1, nunused: 0, redirected: [], dead: [1], unused: [] | blkref #0: rel 1663/5/1255 fork main blk 58 2 | 0/1531EA0 | 0/1531EE0 | 0/1531E60 | 735 | Heap | INSERT+INIT | 59 | 3 | 0 | off: 1, flags: 0x08 | blkref #0: rel 1663/5/16384 fork main blk 0 3 | 0/1531EE0 | 0/1531F20 | 0/1531EA0 | 735 | Heap | INSERT | 59 | 3 | 0 | off: 2, flags: 0x08 | blkref #0: rel 1663/5/16384 fork main blk 0 4 | 0/1531F20 | 0/1531F60 | 0/1531EE0 | 735 | Heap | INSERT | 59 | 3 | 0 | off: 3, flags: 0x08 | blkref #0: rel 1663/5/16384 fork main blk 0 5 | 0/1531F60 | 0/1531FA0 | 0/1531F20 | 735 | Heap | INSERT | 59 | 3 | 0 | off: 4, flags: 0x08 | blkref #0: rel 1663/5/16384 fork main blk 0 6 | 0/1531FA0 | 0/1531FE0 | 0/1531F60 | 735 | Heap | INSERT | 59 | 3 | 0 | off: 5, flags: 0x08 | blkref #0: rel 1663/5/16384 fork main blk 0 7 | 0/1531FE0 | 0/1532028 | 0/1531FA0 | 735 | Transaction | COMMIT | 46 | 20 | 0 | 2023-08-27 23:22:17.161215+05:30 | 8 | 0/1532028 | 0/1532068 | 0/1531FE0 | 736 | Heap | INSERT+INIT | 59 | 3 | 0 | off: 1, flags: 0x08 | blkref #0: rel 1663/5/16387 fork main blk 0 9 | 0/1532068 | 0/15320A8 | 0/1532028 | 736 | Heap | INSERT | 59 | 3 | 0 | off: 2, flags: 0x08 | blkref #0: rel 1663/5/16387 fork main blk 0 10 | 0/15320A8 | 0/15320E8 | 0/1532068 | 736 | Heap | INSERT | 59 | 3 | 0 | off: 3, flags: 0x08 | blkref #0: rel 1663/5/16387 fork main blk 0 11 | 0/15320E8 | 0/1532128 | 0/15320A8 | 736 | Heap | INSERT | 59 | 3 | 0 | off: 4, flags: 0x08 | blkref #0: rel 1663/5/16387 fork main blk 0 12 | 0/1532128 | 0/1532168 | 0/15320E8 | 736 | Heap | INSERT | 59 | 3 | 0 | off: 5, flags: 0x08 | blkref #0: rel 1663/5/16387 fork main blk 0 13 | 0/1532168 | 0/1532198 | 0/1532128 | 736 | Transaction | COMMIT | 46 | 20 | 0 | 2023-08-27 23:22:17.174756+05:30 | 14 | 0/1532198 | 0/1532210 | 0/1532168 | 0 | XLOG | CHECKPOINT_SHUTDOWN | 114 | 88 | 0 | redo 0/1532198; tli 1; prev tli 1; fpw true; xid 0:737; oid 16399; multi 1; offset 0; oldest xid 723 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown | Whereas the same test executed with the patch applied shows that confirmed_flush points to CHECKPOINT_SHUTDOWN record: slot_name slot_type datoid persistency xmin catalog_xmin restart_lsn confirmed_flush two_phase_at two_phase plugin --------- --------- ------ ----------- --- ----------- ----------- --------------- ----------- --------- ------ sub logical 5 persistent 0 735 0/1531E28 0/1532198 0/0 0 pgoutput WAL record information generated using pg_walinspect for various records at and after confirmed_flush WAL 0/1532198: row_number | start_lsn | end_lsn | prev_lsn | xid | resource_manager | record_type | record_length | main_data_length | fpi_length | description | block_ref ------------+-----------+-----------+-----------+-----+------------------+---------------------+---------------+------------------+------------+------------------------------------------------------------------- --------------------------------------------------------------------------------------------------------------------------------------------+----------- 1 | 0/1532198 | 0/1532210 | 0/1532168 | 0 | XLOG | CHECKPOINT_SHUTDOWN | 114 | 88 | 0 | redo 0/1532198; tli 1; prev tli 1; fpw true; xid 0:737; oid 16399; multi 1; offset 0; oldest xid 723 in DB 1; oldest multi 1 in DB 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0; shutdown | (1 row) [1] - https://www.postgresql.org/message-id/flat/CALj2ACW0rV5gWK8A3m6_X62qH%2BVfaq5hznC%3Di0R5Wojt5%2Byhyw%40mail.gmail.com [2] - https://www.postgresql.org/message-id/CALDaNm2BboFuFVYxyzP4wkv7%3D8%2B_TwsD%2BugyGhtibTSF4m4XRg%40mail.gmail.com Regards, Vignesh
Вложения
В списке pgsql-hackers по дате отправления: