RE: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages
From | Hayato Kuroda (Fujitsu) |
---|---|
Subject | RE: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages |
Date | |
Msg-id | OSCPR01MB1496661B939CC826215F370D0F546A@OSCPR01MB14966.jpnprd01.prod.outlook.com |
In response to | Re: pg_logical_slot_get_changes waits continously for a partial WAL record spanning across 2 pages (vignesh C <vignesh21@gmail.com>) |
List | pgsql-hackers |
Dear Vignesh,

> I was unable to reproduce the same test failure on the PG17 branch,
> even after running the test around 500 times. However, on the master
> branch, the failure consistently reproduces approximately once in
> every 50 runs. I also noticed that while the buildfarm has reported
> multiple failures for this test for the master branch, none of them
> appear to be on the PG17 branch. I'm not yet sure why this discrepancy
> exists.

I was also not able to reproduce it as-is. After analyzing a bit more, I found that on PG17 the workload cannot generate an FPI_FOR_HINT record. That type of WAL record is longer than a page, so on HEAD there was a possibility that the WAL record could be flushed partially. On PG17 this could not happen, so OVERWRITE_CONTRECORD would not appear.

I modified the test code as in [1] and confirmed that the same hang can happen on PG17. The change generates a long record which can span pages and can be flushed partially.

[1]:
```
--- a/src/test/recovery/t/046_checkpoint_logical_slot.pl
+++ b/src/test/recovery/t/046_checkpoint_logical_slot.pl
@@ -123,6 +123,10 @@ $node->safe_psql('postgres',
 $node->safe_psql('postgres',
 	q{select injection_points_wakeup('checkpoint-before-old-wal-removal')});
+# Generate a long WAL record
+$node->safe_psql('postgres',
+	q{select pg_logical_emit_message(false, '', repeat('123456789', 1000))});
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED
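As a rough check on why the message in [1] yields a record that cannot fit in a single WAL page, the payload size can be compared with the WAL block size. This is only a sketch: it assumes the default WAL block size of 8192 bytes (a build-time setting that may differ on a given installation), it ignores record and page headers (which only make the record larger and the usable page space smaller), and the helper name `must_span_pages` is hypothetical.

```python
# Assumption: default WAL block size (XLOG_BLCKSZ) of 8192 bytes.
# A custom build may use a different value.
XLOG_BLCKSZ = 8192

# Same payload as repeat('123456789', 1000) in the modified test.
payload = "123456789" * 1000
payload_len = len(payload.encode("ascii"))


def must_span_pages(record_len: int, page_size: int = XLOG_BLCKSZ) -> bool:
    """Return True if a record of this length cannot fit in one WAL page.

    Hypothetical helper for illustration only. Headers are ignored, so a
    True result is a conservative lower bound: the real record is larger.
    """
    return record_len > page_size


print(payload_len, must_span_pages(payload_len))
```

Since the payload alone is larger than one page under this assumption, the record has to continue onto a following page wherever it starts, which is the precondition for the partial flush described above.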