Re: A failure in t/038_save_logical_slots_shutdown.pl

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: A failure in t/038_save_logical_slots_shutdown.pl
Дата
Msg-id CAA4eK1KABouS5FjKQsRB+OscWePqNyjPfD9YLY=GgHq4PdS_ig@mail.gmail.com
обсуждение исходный текст
Ответ на Re: A failure in t/038_save_logical_slots_shutdown.pl  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Ответы Re: A failure in t/038_save_logical_slots_shutdown.pl  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
On Thu, Jan 11, 2024 at 10:03 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Jan 11, 2024 at 4:35 PM vignesh C <vignesh21@gmail.com> wrote:
> >
> > On further analysis, it was found that in the failing test,
> > CHECKPOINT_SHUTDOWN was started in a new page, so there was the WAL
> > page header present just before the CHECKPOINT_SHUTDOWN which was
> > causing the failure. We could alternatively reproduce the issue by
> > switching the WAL file before restarting the server like in the
> > attached test change patch.
> > There are a couple of ways to fix this issue a) one by switching the
> > WAL before the insertion of records so that the CHECKPOINT_SHUTDOWN
> > does not get inserted in a new page as in the attached test_fix.patch
> > b) by using pg_walinspect to check that the next WAL record is
> > CHECKPOINT_SHUTDOWN. I have to try this approach.
> >
> > Thanks to Bharath and Kuroda-san for offline discussions and helping
> > in getting to the root cause.
>
> IIUC, the problem the commit e0b2eed tries to solve is to ensure there
> are no left-over decodable WAL records between confirmed_flush LSN and
> a shutdown checkpoint, which is what it is expected from the
> t/038_save_logical_slots_shutdown.pl. How about we have a PG function
> returning true if there are any decodable WAL records between the
> given start_lsn and end_lsn?
>

But, we already test this in 003_logical_slot during a successful
upgrade. Having an explicit test to do the same thing has some merits
but not sure if it is worth it. The current test tries to ensure that
during shutdown after we shutdown walsender and ensures that it sends
all the wal records and receipts an ack for the same, there is no
other WAL except shutdown_checkpoint. Vignesh's suggestion (a) makes
the test robust enough that it shouldn't show spurious failures like
the current one reported by you, so shall we try to proceed with that?

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Xiaoran Wang
Дата:
Сообщение: Re: Recovering from detoast-related catcache invalidations
Следующее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Adding facility for injection points (or probe points?) for more advanced tests