Re: VM corruption on standby

From	Tom Lane
Subject	Re: VM corruption on standby
Date	19 August 21:08:19
Msg-id	599759.1755626899@sss.pgh.pa.us
In reply to	Re: VM corruption on standby (Kirill Reshke <reshkekirill@gmail.com>)
Responses	Re: VM corruption on standby
List	pgsql-hackers

Kirill Reshke <reshkekirill@gmail.com> writes:
> On Tue, 19 Aug 2025 at 21:16, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
>> `if (CritSectionCount != 0) _exit(2) else proc_exit(1)` in
>> WaitEventSetWaitBlock () solves the issue of inconsistency IF POSTMASTER IS
>> SIGKILLED, and doesn't lead to any problem, if postmaster is not SIGKILL-ed
>> (since postmaster will SIGKILL its children).

> This fix was proposed in this thread. It fixes inconsistency but it
> replaces one set of problems with another set, namely systems that
> fail to shut down.

I think a bigger objection is that it'd result in two separate
shutdown behaviors in what's already an extremely under-tested
(and hard to test) scenario.  I don't want to have to deal with
the ensuing state-space explosion.
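
To make the "two separate shutdown behaviors" concrete, here is a minimal sketch of the branch being proposed. It is not PostgreSQL source: the enum, its values, and the function name are hypothetical stand-ins, and the two return values merely stand for the `_exit(2)` and `proc_exit(1)` paths quoted above.

```c
/*
 * Sketch only. All names below are hypothetical and do not exist in
 * PostgreSQL; they model the decision, not the real exit calls.
 */
typedef enum PostmasterDeathAction
{
    ACTION_PROC_EXIT,       /* stands for proc_exit(1): normal exit path */
    ACTION_IMMEDIATE_EXIT   /* stands for _exit(2): exit with no cleanup */
} PostmasterDeathAction;

/*
 * Pick the exit path from the critical-section nesting depth, mirroring
 * `if (CritSectionCount != 0) _exit(2) else proc_exit(1)`.
 */
PostmasterDeathAction
choose_postmaster_death_action(unsigned int crit_section_count)
{
    if (crit_section_count != 0)
        return ACTION_IMMEDIATE_EXIT;   /* inside a critical section */

    return ACTION_PROC_EXIT;            /* outside any critical section */
}
```

With a branch like this, the same postmaster-death event ends in one of two different ways depending on the critical-section depth at that instant, and each of them would need its own test coverage.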

I still think that proc_exit(1) is fundamentally the wrong thing
to do if the postmaster is gone: that code path assumes that
the cluster is still functional, which is at best shaky.
I concur though that we'd have to do some more engineering work
before _exit(2) would be a practical solution.

In the meantime, it seems like this discussion point arises
only because the presented test case is doing something that
seems pretty unsafe, namely invoking WaitEventSet inside a
critical section.

We'd probably be best off to get back to the actual bug the
thread started with, namely whether we aren't doing the wrong
thing with VM-update order of operations.

            regards, tom lane
