Re: BUG #17928: Standby fails to decode WAL on termination of primary

From	Michael Paquier
Subject	Re: BUG #17928: Standby fails to decode WAL on termination of primary
Date	2 September 2023, 04:29:39
Msg-id	ZPKQA+HHqLVRFBSs@paquier.xyz
In reply to	Re: BUG #17928: Standby fails to decode WAL on termination of primary (Michael Paquier <michael@paquier.xyz>)
Responses	Re: BUG #17928: Standby fails to decode WAL on termination of primary
List	pgsql-bugs

On Mon, Aug 21, 2023 at 08:32:39AM +0900, Michael Paquier wrote:
> I am not sure that I'll be able to do more on this topic this week, at
> least that's some progress.

Some time later..

I have spent more time and thought on the whole test suite, finishing
with the attached 0003 that applies on top of your own patches.  I am
really looking forward to making this whole logic more robust, so that
WAL replay itself can be made more robust for the OOM/end-of-WAL
detection on HEAD for standbys and crash recovery.

While looking at the whole thing, I have considered a few things that
may make the test cleaner, like:
- Calculating the segment name and its offset from the end_lsn of a
record directly from the backend, but it felt inelegant to pass the
couple ($segment, offset) through more subroutine layers rather than
just an LSN, so guessing the segment number and the offset while the
cluster is offline is OK by me.
- The TLI can be queried from the server rather than hardcoded.
- I've been thinking about bundling the tests of each sub-section in
their own subroutine, but that felt a bit awkward, particularly for
the part where we need a correct $prev_lsn in the record header
written to enforce other code paths.
- The test needs better documentation.  One of the things I kept
staring at is cross-checking pack() and its dependency on the C
structures, so I have added more details there, explaining more of
the whys and the hows.
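
To illustrate the first and last points, here is a minimal Python sketch
(not the Perl code of the patch) of how a segment name and offset can be
derived from an LSN and a TLI, and of how a record header maps to a
packed byte layout.  It assumes the default 16MB wal_segment_size, a
little-endian platform, and a 24-byte XLogRecord header with two padding
bytes before xl_crc.  The function names are hypothetical, and the LSNs
and field values are made up for the example.

```python
import struct

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default wal_segment_size (16MB)

# Assumed XLogRecord header layout: xl_tot_len (uint32), xl_xid (uint32),
# xl_prev (uint64), xl_info (uint8), xl_rmid (uint8), 2 padding bytes,
# xl_crc (uint32).  "<" means little-endian with no implicit alignment,
# so the padding is spelled out with "xx".
XLOG_RECORD_FORMAT = "<IIQBBxxI"


def parse_lsn(text):
    """Convert an LSN in 'hi/lo' hexadecimal text form to a 64-bit integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def locate(lsn, tli, segment_size=WAL_SEGMENT_SIZE):
    """Return (segment file name, byte offset in that segment) for an LSN."""
    segments_per_xlogid = 0x100000000 // segment_size
    segno = lsn // segment_size
    name = "%08X%08X%08X" % (
        tli,
        segno // segments_per_xlogid,
        segno % segments_per_xlogid,
    )
    return name, lsn % segment_size


def build_record_header(tot_len, xid, prev_lsn, info, rmid, crc):
    """Pack the header fields into raw bytes, in the assumed C struct order."""
    return struct.pack(XLOG_RECORD_FORMAT, tot_len, xid, prev_lsn, info, rmid, crc)


# Made-up values, for illustration only.
segment, offset = locate(parse_lsn("0/3000148"), tli=1)
header = build_record_header(
    tot_len=42, xid=0, prev_lsn=parse_lsn("0/3000110"), info=0, rmid=0, crc=0
)
```

Note that this maps an LSN to the segment that contains that byte.  An
end_lsn sitting exactly on a segment boundary therefore resolves to
offset 0 of the next segment, which may or may not be what a test wants.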

I have also looked again at the C code for a few hours, and still got
the impression that this is rather solid.  There are two things that
may be better:
- Document at the top of allocate_recordbuf() that this should never
be called with a length coming from a header until it is validated.
- Removing AllocSizeIsValid() for the non-FRONTEND path should be OK.

What do you think?
--
Michael
