Re: VM corruption on standby
From | Tom Lane |
---|---|
Subject | Re: VM corruption on standby |
Date | |
Msg-id | 599759.1755626899@sss.pgh.pa.us |
In response to | Re: VM corruption on standby (Kirill Reshke <reshkekirill@gmail.com>) |
Responses |
Re: VM corruption on standby
|
List | pgsql-hackers |
Kirill Reshke <reshkekirill@gmail.com> writes:
> On Tue, 19 Aug 2025 at 21:16, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
>> `if (CritSectionCount != 0) _exit(2) else proc_exit(1)` in
>> WaitEventSetWaitBlock () solves the issue of inconsistency IF POSTMASTER IS
>> SIGKILLED, and doesn't lead to any problem, if postmaster is not SIGKILL-ed
>> (since postmaster will SIGKILL its children).

> This fix was proposed in this thread. It fixes inconsistency but it
> replaces one set of problems with another set, namely systems that
> fail to shut down.

I think a bigger objection is that it'd result in two separate
shutdown behaviors in what's already an extremely under-tested
(and hard to test) scenario. I don't want to have to deal with
the ensuing state-space explosion.

I still think that proc_exit(1) is fundamentally the wrong thing
to do if the postmaster is gone: that code path assumes that
the cluster is still functional, which is at best shaky. I concur
though that we'd have to do some more engineering work before
_exit(2) would be a practical solution.

In the meantime, it seems like this discussion point arises
only because the presented test case is doing something that
seems pretty unsafe, namely invoking WaitEventSet inside a
critical section.

We'd probably be best off to get back to the actual bug the
thread started with, namely whether we aren't doing the wrong
thing with VM-update order of operations.

regards, tom lane
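
For readers following the `_exit(2)` versus `proc_exit(1)` distinction, here is a minimal standalone sketch of the two shutdown behaviors. It is not PostgreSQL code. It assumes plain POSIX, uses `exit(1)` with an `atexit` handler as a stand-in for `proc_exit(1)` and its exit callbacks, and the names `simulate_postmaster_death` and `cleanup_handler` are hypothetical. The `crit_section_count` parameter only models `CritSectionCount`.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe through which the child's cleanup handler reports. */
static int report_fd = -1;

/* Hypothetical stand-in for the exit callbacks that proc_exit() would run. */
static void cleanup_handler(void)
{
    const char mark = 'c';
    ssize_t rc = write(report_fd, &mark, 1);
    (void) rc;
}

/*
 * Fork a child that models a backend noticing postmaster death.
 * crit_section_count models CritSectionCount (an assumption of this sketch).
 * Returns the child's exit status, or -1 on failure; *cleanup_ran is set
 * to 1 if the cleanup handler ran in the child, else 0.
 */
int simulate_postmaster_death(int crit_section_count, int *cleanup_ran)
{
    int pipefd[2];
    if (pipe(pipefd) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    if (pid == 0)
    {
        /* Child: register cleanup, then take the branch under discussion. */
        close(pipefd[0]);
        report_fd = pipefd[1];
        atexit(cleanup_handler);
        if (crit_section_count != 0)
            _exit(2);           /* immediate exit, atexit handlers skipped */
        exit(1);                /* orderly exit, atexit handlers run */
    }

    /* Parent: read() returns 0 (EOF) if the handler never wrote anything. */
    close(pipefd[1]);
    char buf;
    ssize_t n = read(pipefd[0], &buf, 1);
    close(pipefd[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    *cleanup_ran = (n == 1);
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `simulate_postmaster_death(0, &ran)` takes the orderly path and `simulate_postmaster_death(1, &ran)` takes the immediate one. The sketch shows only that the two branches leave the process through different code paths, which is the "two separate shutdown behaviors" concern raised above. It says nothing about which is correct for a real backend.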