Re: VM corruption on standby
From | Thomas Munro |
---|---|
Subject | Re: VM corruption on standby |
Date | |
Msg-id | CA+hUKGJfOGBf55oLsgvv1PZSuJm1+R8yFbVHsP3VnEu=dOqayQ@mail.gmail.com |
In response to | Re: VM corruption on standby (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: VM corruption on standby
Re: VM corruption on standby Re: VM corruption on standby |
List | pgsql-hackers |
On Tue, Aug 19, 2025 at 4:52 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> But I'm of the opinion that proc_exit
> is the wrong thing to use after seeing postmaster death, critical
> section or no. We should assume that system integrity is already
> compromised, and get out as fast as we can with as few side-effects
> as possible. It'll be up to the next generation of postmaster to
> try to clean up.

Then wouldn't backends blocked in LWLockAcquire(x) hang forever, after
someone who holds x calls _exit()?

I don't know if there are other ways that LWLockReleaseAll() can lead
to persistent corruption that won't be corrected by crash recovery,
but this one is probably new since the following commit, explaining
the failure to reproduce on v17:

commit bc22dc0e0ddc2dcb6043a732415019cc6b6bf683
Author: Alexander Korotkov <akorotkov@postgresql.org>
Date:   Wed Apr 2 12:44:24 2025 +0300

    Get rid of WALBufMappingLock

Any idea involving deferring the handling of PM death from here
doesn't seem right: you'd keep waiting for the CV, but the backend
that would wake you might have exited.

Hmm, I wonder if there could be a solution in between where we don't
release the locks on PM exit, but we still wake the waiters so they
can observe a new dead state in the lock word (or perhaps a shared
postmaster_is_dead flag), and exit themselves.

Nice detective work Andrey and others! That's a complicated and rare
interaction.
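
To make the "wake the waiters without releasing the lock" idea a bit more concrete, here is a toy sketch using POSIX threads. It is not PostgreSQL code and does not use the LWLock API: ToyLock, toy_lock_acquire, toy_lock_mark_dead and run_demo are hypothetical names invented for illustration, and a mutex plus condition variable stands in for the real lock word and wait queue. The only point it tries to show is that a waiter which rechecks a "dead" flag after every wakeup can give up instead of sleeping forever, even though the holder never released.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy lock; an assumption for illustration, not LWLock. */
typedef struct ToyLock
{
    pthread_mutex_t mutex;   /* protects the fields below */
    pthread_cond_t  wakeup;  /* waiters sleep here */
    bool            held;    /* true while some "backend" owns the lock */
    bool            dead;    /* stand-in for a dead state in the lock word */
} ToyLock;

typedef enum ToyResult { TOY_ACQUIRED, TOY_SAW_DEAD } ToyResult;

void toy_lock_init(ToyLock *lock)
{
    pthread_mutex_init(&lock->mutex, NULL);
    pthread_cond_init(&lock->wakeup, NULL);
    lock->held = false;
    lock->dead = false;
}

/* Block until the lock is free OR the dead flag is observed. */
ToyResult toy_lock_acquire(ToyLock *lock)
{
    ToyResult result;

    pthread_mutex_lock(&lock->mutex);
    while (lock->held && !lock->dead)
        pthread_cond_wait(&lock->wakeup, &lock->mutex);
    if (lock->dead)
        result = TOY_SAW_DEAD;      /* caller should exit, not proceed */
    else
    {
        lock->held = true;
        result = TOY_ACQUIRED;
    }
    pthread_mutex_unlock(&lock->mutex);
    return result;
}

void toy_lock_release(ToyLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    lock->held = false;
    pthread_cond_signal(&lock->wakeup);
    pthread_mutex_unlock(&lock->mutex);
}

/* Wake every waiter WITHOUT releasing: "held" deliberately stays true. */
void toy_lock_mark_dead(ToyLock *lock)
{
    pthread_mutex_lock(&lock->mutex);
    lock->dead = true;
    pthread_cond_broadcast(&lock->wakeup);
    pthread_mutex_unlock(&lock->mutex);
}

typedef struct WaiterArgs
{
    ToyLock  *lock;
    ToyResult result;
} WaiterArgs;

static void *waiter_main(void *arg)
{
    WaiterArgs *args = arg;

    args->result = toy_lock_acquire(args->lock);
    return NULL;
}

/* The holder "dies" while holding the lock; report what the waiter saw. */
ToyResult run_demo(void)
{
    ToyLock     lock;
    WaiterArgs  args;
    pthread_t   waiter;
    ToyResult   first;
    int         rc;

    toy_lock_init(&lock);
    first = toy_lock_acquire(&lock);    /* this thread plays the holder */
    assert(first == TOY_ACQUIRED);
    (void) first;

    args.lock = &lock;
    args.result = TOY_ACQUIRED;
    rc = pthread_create(&waiter, NULL, waiter_main, &args);
    assert(rc == 0);
    (void) rc;

    toy_lock_mark_dead(&lock);          /* no release before "exiting" */
    pthread_join(waiter, NULL);
    return args.result;
}
```

Calling run_demo() should return TOY_SAW_DEAD whether the waiter went to sleep before or after the flag was set, because the flag is checked under the same mutex that guards the wait. Whether a real dead state would live in the lock word or in a separate shared flag is exactly the open question above; the toy takes no position on that.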