Re: Got FATAL in lock_twophase_recover() during recovery
| From | Michael Paquier |
|---|---|
| Subject | Re: Got FATAL in lock_twophase_recover() during recovery |
| Date | |
| Msg-id | ZK3oueit2kXFZujm@paquier.xyz |
| List | pgsql-bugs |
On Tue, Jul 11, 2023 at 10:35:15AM +0800, suyu.cmj wrote:
> I want to report a bug about the recovery of two-phase transaction,
> in current implementation of crash recovery, there are two ways to
> recover 2pc data:
> 1、before redo, func restoreTwoPhaseData() will restore 2pc data
> those xid < ShmemVariableCache->nextXid, which is initialized from
> checkPoint.nextXid;
> 2、during redo, func xact_redo() will add 2pc from wal;
> The following scenario may cause the same 2pc transaction to be
> added repeatedly, I have attached a patch for pg11 that reproduces
> the error:
> 1、start creating checkpoint_1, checkpoint_1.redo is set as
> curInsert;
> 2、before set checkPoint_1.nextXid, a new 2pc is prepared, suppose
> the xid of this 2pc is 100, and then ShmemVariableCache->nextXid
> will be advanced as 101;
> 3、checkPoint_1.nextXid is set as 101;
> 4、in CheckPointTwoPhase() of checkpoint_1, 2pc_100 won't be copied
> to disk because its prepare_end_lsn > checkpoint_1.redo;
> 5、checkPoint_1 is finished, after checkpoint_timeout, start
> creating checkpoint_2;
> 6、during checkpoint_2, data of 2pc_100 will be copied to disk;
> 7、before UpdateControlFile() of checkpoint_2, crash happened;
> 8、during crash recovery, redo will start from checkpoint_1, and
> 2pc_100 will be restored first by restoreTwoPhaseData() because
> xid_100 < checkPoint_1.nextXid, which is 101;
> 9、because prepare_start_lsn of 2pc_100 > checkpoint_1.redo, 2pc_100
> will be added again by xact_redo() during wal replay, resulting in
> the same 2pc data being added twice;

It looks like you have something here. I'll try to look at it.

This is a bug, so I have removed pgsql-hackers from the CC list, keeping only pgsql-bugs, as cross-list posts are not encouraged.
--
Michael
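
For readers following the quoted scenario, steps 8 and 9 can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not PostgreSQL code: `PreparedXact`, `DuplicatePreparedXact`, `recover()` and `reported_scenario()` are hypothetical names, the LSN values are invented, and the real server tracks prepared transactions in shared memory rather than in a dictionary. Only the two conditions (`xid < nextXid` for restoreTwoPhaseData() and `prepare_start_lsn > checkpoint_1.redo` for xact_redo()) come from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreparedXact:
    """Hypothetical stand-in for one prepared (2pc) transaction."""
    xid: int
    prepare_start_lsn: int
    prepare_end_lsn: int


class DuplicatePreparedXact(Exception):
    """Raised when the same prepared transaction is registered twice."""


def recover(next_xid, redo_lsn, state_files, wal_records):
    """Toy crash recovery: restore 2pc state files, then replay WAL."""
    in_memory = {}

    def add(prepared, source):
        if prepared.xid in in_memory:
            raise DuplicatePreparedXact(
                f"xid {prepared.xid} added again by {source}")
        in_memory[prepared.xid] = prepared

    # Step 8 of the report: before redo, every on-disk 2pc file whose
    # xid is below the checkpoint's nextXid is restored.
    for prepared in state_files:
        if prepared.xid < next_xid:
            add(prepared, "restoreTwoPhaseData")

    # Step 9 of the report: during redo, every PREPARE record located
    # after the checkpoint's redo pointer is added from WAL.
    for prepared in wal_records:
        if prepared.prepare_start_lsn > redo_lsn:
            add(prepared, "xact_redo")

    return in_memory


def reported_scenario():
    """Replay the reported sequence; LSN values are invented."""
    checkpoint_1_redo = 1000      # step 1: redo pointer fixed first
    checkpoint_1_next_xid = 101   # steps 2-3: xid 100 prepared before nextXid is read
    twopc_100 = PreparedXact(xid=100, prepare_start_lsn=1010,
                             prepare_end_lsn=1050)
    # Step 6: checkpoint_2 copied 2pc_100 to disk, but (step 7) the crash
    # happened before the control file was updated, so recovery still
    # starts from checkpoint_1.
    try:
        recover(checkpoint_1_next_xid, checkpoint_1_redo,
                state_files=[twopc_100], wal_records=[twopc_100])
    except DuplicatePreparedXact as exc:
        return str(exc)
    return None
```

In this model, calling `reported_scenario()` yields the duplicate-registration message, while the same recovery without the state file written by checkpoint_2 registers xid 100 exactly once.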