Re: Fix slot synchronization with two_phase decoding enabled
From | Amit Kapila |
---|---|
Subject | Re: Fix slot synchronization with two_phase decoding enabled |
Date | |
Msg-id | CAA4eK1+Row5XWDbOCTgd4_s=eaqXAL7iXDFQkAinuJFqOTt46A@mail.gmail.com |
In response to | Fix slot synchronization with two_phase decoding enabled ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
Responses | Re: Fix slot synchronization with two_phase decoding enabled |
List | pgsql-hackers |
On Tue, Mar 25, 2025 at 11:05 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote:
>
> Hi,
>
> When testing the slot synchronization with logical replication slots that
> enabled two_phase decoding, I found that transactions prepared before two-phase
> decoding is enabled may fail to replicate to the subscriber after being
> committed on a promoted standby following a failover.
>
> To reproduce this issue, please follow these steps (also detailed in the
> attached TAP test, v1-0001):
>
> 1. sub: create a subscription with (two_phase = false)
> 2. primary (pub): prepare a txn A.
> 3. sub: alter subscription set (two_phase = true) and wait for the logical slot to
> be synced to standby.
> 4. primary (pub): stop primary, promote the standby and let the subscriber use
> the promoted standby as publisher.
> 5. promoted standby (pub): COMMIT PREPARED A;
> 6. sub: the apply worker will report the following ERROR because it didn't
> receive the PREPARE.
> ERROR: prepared transaction with identifier "pg_gid_16387_752" does not exist
>
> I think the root cause of this issue is that the two_phase_at field of the
> slot, which indicates the LSN from which two-phase decoding is enabled (used to
> prevent duplicate data transmission for prepared transactions), is not
> synchronized to the standby server.
>
> In step 3, transaction A is not immediately replicated because it occurred
> before enabling two-phase decoding. Thus, the prepared transaction should only
> be replicated after decoding the final COMMIT PREPARED, as referenced in
> ReorderBufferFinishPrepared(). However, due to the invalid two_phase_at on the
> standby, the prepared transaction fails to send at that time.
>
> This problem arises after the support for altering the two-phase option
> (1462aad).
>

Thanks for the report and patch. I'll look into it.

--
With Regards,
Amit Kapila.
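
For readers following the root-cause paragraph quoted above, here is a toy Python model of the decision it describes. It is only a sketch, not PostgreSQL code: the function name, the integer LSNs, and the idea that the check is a plain "prepare LSN is before two_phase_at" comparison are assumptions made for illustration, and `INVALID_LSN = 0` merely stands in for an unset two_phase_at on the standby.

```python
# Toy model (hypothetical, not the PostgreSQL implementation) of how an
# unsynchronized two_phase_at can cause a prepared transaction to be skipped.

INVALID_LSN = 0  # assumed stand-in for an unset/invalid two_phase_at


def must_replay_at_commit_prepared(prepare_lsn: int, two_phase_at: int) -> bool:
    """Decide whether the transaction's changes must be sent when
    COMMIT PREPARED is decoded.

    Assumption: a transaction prepared before two_phase_at was never sent
    at PREPARE time, so it has to be sent in full at COMMIT PREPARED.
    """
    return prepare_lsn < two_phase_at


if __name__ == "__main__":
    prepare_lsn = 100        # step 2: txn A is prepared here
    enabled_at = 200         # step 3: two_phase is enabled later

    # Old primary: the slot knows two_phase_at, so txn A is replayed in full.
    print("primary replays txn A:",
          must_replay_at_commit_prepared(prepare_lsn, enabled_at))

    # Promoted standby: two_phase_at was not synced, so the comparison fails
    # and only the COMMIT PREPARED would be sent, without the PREPARE.
    print("promoted standby replays txn A:",
          must_replay_at_commit_prepared(prepare_lsn, INVALID_LSN))
```

Under these assumptions, the second call is the case that leaves the subscriber with a COMMIT PREPARED for a transaction it never received, which matches the ERROR in step 6.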