Re: VM corruption on standby

From	Thomas Munro
Subject	Re: VM corruption on standby
Date	19 August 08:31:36
Msg-id	CA+hUKGJfOGBf55oLsgvv1PZSuJm1+R8yFbVHsP3VnEu=dOqayQ@mail.gmail.com
In reply to	Re: VM corruption on standby (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: VM corruption on standby Re: VM corruption on standby Re: VM corruption on standby
List	pgsql-hackers

On Tue, Aug 19, 2025 at 4:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> But I'm of the opinion that proc_exit
> is the wrong thing to use after seeing postmaster death, critical
> section or no.  We should assume that system integrity is already
> compromised, and get out as fast as we can with as few side-effects
> as possible.  It'll be up to the next generation of postmaster to
> try to clean up.

Then wouldn't backends blocked in LWLockAcquire(x) hang forever, after
someone who holds x calls _exit()?

I don't know if there are other ways that LWLockReleaseAll() can lead
to persistent corruption that won't be corrected by crash recovery,
but this one is probably new since the following commit, explaining
the failure to reproduce on v17:

commit bc22dc0e0ddc2dcb6043a732415019cc6b6bf683
Author: Alexander Korotkov <akorotkov@postgresql.org>
Date:   Wed Apr 2 12:44:24 2025 +0300

    Get rid of WALBufMappingLock

Any idea involving deferring the handling of PM death from here
doesn't seem right: you'd keep waiting for the CV, but the backend
that would wake you might have exited.

Hmm, I wonder if there could be a solution in between where we don't
release the locks on PM exit, but we still wake the waiters so they
can observe a new dead state in the lock word (or perhaps a shared
postmaster_is_dead flag), and exit themselves.
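
A minimal sketch of that in-between idea, assuming a toy lock word with hypothetical TOY_LOCK_HELD and TOY_LOCK_DEAD bits. None of these names exist in PostgreSQL, and this is not the real LWLock layout or API. The holder that notices postmaster death leaves the lock held but sets the dead bit, so anyone who retries sees the dead state instead of waiting forever:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; NOT PostgreSQL's actual LWLock state layout. */
#define TOY_LOCK_HELD ((uint32_t) 1u << 0)
#define TOY_LOCK_DEAD ((uint32_t) 1u << 1)

typedef struct ToyLock
{
    _Atomic uint32_t state;
} ToyLock;

typedef enum ToyAcquireResult
{
    TOY_ACQUIRED,       /* caller now holds the lock */
    TOY_WOULD_BLOCK,    /* held by someone else; a real waiter would sleep */
    TOY_OWNER_DIED      /* dead bit observed; caller should exit */
} ToyAcquireResult;

static void
toy_lock_init(ToyLock *lock)
{
    assert(lock != NULL);
    atomic_init(&lock->state, 0);
}

/* One acquisition attempt; a real waiter would call this after each wakeup. */
static ToyAcquireResult
toy_lock_try_acquire(ToyLock *lock)
{
    uint32_t old = atomic_load(&lock->state);

    for (;;)
    {
        if (old & TOY_LOCK_DEAD)
            return TOY_OWNER_DIED;
        if (old & TOY_LOCK_HELD)
            return TOY_WOULD_BLOCK;
        /* On failure, 'old' is refreshed and the checks above run again. */
        if (atomic_compare_exchange_weak(&lock->state, &old,
                                         old | TOY_LOCK_HELD))
            return TOY_ACQUIRED;
    }
}

/*
 * Called by a holder that has seen postmaster death.  The HELD bit is left
 * set on purpose, so nobody can enter the protected section; only the DEAD
 * bit is added.  A real implementation would also wake all waiters here.
 */
static void
toy_lock_mark_dead(ToyLock *lock)
{
    atomic_fetch_or(&lock->state, TOY_LOCK_DEAD);
}
```

The sketch leaves out the wait queue and wakeup entirely; it only shows that the lock is never released, while the dead state is still visible to anyone who looks at the lock word.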

Nice detective work, Andrey and others!  That's a complicated and rare
interaction.
