Re: VM corruption on standby
From | Kirill Reshke |
---|---|
Subject | Re: VM corruption on standby |
Date | |
Msg-id | CALdSSPisWpkL+-_vS7B7vonX1XTC8aVkPhj3BBc2wtmuZ_a7cQ@mail.gmail.com |
In reply to | Re: VM corruption on standby (Yura Sokolov <y.sokolov@postgrespro.ru>) |
List | pgsql-hackers |
On Tue, 19 Aug 2025 at 21:16, Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
>
> That is not true.
> elog(PANIC) doesn't clear LWLocks. And XLogWrite, which is could be called
> from AdvanceXLInsertBuffer, may call elog(PANIC) from several places.
>
> It doesn't lead to any error, because usually postmaster is alive and it
> will kill -9 all its children if any one is died in critical section.
>
> So the problem is postmaster is already killed with SIGKILL by definition
> of the issue.
>
> Documentation says [0]:
> > If at all possible, do not use SIGKILL to kill the main postgres server.
> > Doing so will prevent postgres from freeing the system resources (e.g.,
> > shared memory and semaphores) that it holds before terminating.
>
> Therefore if postmaster SIGKILL-ed, administrator already have to do some
> actions.
>

There are surely many cases where a system reaches a state that can
only be fixed by admin action. An elog(PANIC) in a critical section is
very rare (and very probably indicates corruption already). A simpler
example is to kill -9 the postmaster and then immediately kill -9 a
process that holds an LWLock. The problem in PG 18 is that the
probability of reaching this state is much higher due to the
aforementioned commit. It can happen with almost any OOM on highly
loaded systems.

--
Best regards,
Kirill Reshke