Re: Fix crash during recovery when redo segment is missing

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Fix crash during recovery when redo segment is missing
Дата
Msg-id aT_LXeNULe_vClyg@paquier.xyz
обсуждение исходный текст
Ответ на Re: Fix crash during recovery when redo segment is missing  (Nitin Jadhav <nitinjadhavpostgres@gmail.com>)
Ответы Re: Fix crash during recovery when redo segment is missing
Список pgsql-hackers
On Wed, Dec 10, 2025 at 04:01:28PM +0530, Nitin Jadhav wrote:
> I have incorporated all the feedback discussed above and attached the v3 patch.
> Please take a look and let me know if you have any additional feedback.

I have been looking at this patch, and as far as I can see this is an
old problem, with consequences that vary depending on the branches
dealt with (looked at the code and tested down to 9.2, did not go
further down).

HEAD is actually not a bad student: if the REDO record is missing, we
crash on a pointer dereference.  Now, older branches have a different
story to offer, leading to a worse behavior.  For example on v14, our
oldest stable branch still supported: if the REDO record is missing
the startup process thinks that no REDO is required at all because
there is no record, and skips everything until the end of recovery.  I
have been lacking time today to look at the state of the branches
between v15 and v17, but I am pretty sure that we are fighting between
the pointer dereference and the v14 behavior for these as well.  v18
should be the pointer issue.

FWIW, this new ReadRecord() call has given me a pause for a few hours,
where I was wondering what is the effect of fetching_ckpt = false when
recovery or standby mode is requested.  At the end I think that we are
OK: if there is a restore_command and we don't have a backup_label, we
would be able to fetch the redo record from an archive, finishing
recovery even in this case where the backup_label is missing.  For
example, allowing recovery for the node in the test (recovery.signal +
restore_command) while copying the missing segment to archive
completes recovery, with consistency reached at the end of the
checkpoint and all the records between the redo point and the end of
checkpoint record, then a TLI jump happens.

Another set of things that were wrong in the patch, found during
review:
- The TAP test with injection points was failing for two memory
allocations attempted as the point is added in a critical section,
after the REDO record is generated.  There is one allocation for the
point loaded, due to the library path.  There was a second one due to
the wait machinery in the library injection_points for the DSM
registry and the shmem state we need.  Both of these problems can be
avoided with two techniques by using two points based on a wait: one
wait happens before the critical section, and is used to initialize
the shmem state.  The second is loaded outside the critical section,
run inside the critical section inside from the cache.
- The patch was actually wrong: we need to rely on the redo LSN as
found in the checkpoint record obtained a bit earlier.  There was at
least one regression due to that: recovery test 030 for the standby
node.

As a whole, this has to be backpatched, because we are just ignoring
recovery if the REDO record is simply missing while the checkpoint
record is found.  For the back branches, the PANIC is actually what
I'm planning to go with, to match with what is happening when the
checkpoint record is missing.  On HEAD, let's use a softer FATAL to
give a way to test this driver error moving forward.  There is a
secondary argument for softening the PANIC when the checkpoint record
is missing to a FATAL, but let's discuss that separately.  Hence that
would make two patches:
- Something simpler than the attached, without the test with a PANIC
for the redo record missing, for all the branches.
- A second patch lowering this PANIC to a FATAL, with the test
included, only for HEAD.

I have done a couple of tests in the CI and locally, and that was
looking stable.  Attached is the result of what would happen on HEAD,
where the change in xlogrecovery.c would include the back-branch
versions.

Thoughts or comments are welcome.
--
Michael

Вложения

В списке pgsql-hackers по дате отправления: