Re: Startup PANIC on standby promotion due to zero-filled WAL segment
| From | Michael Paquier |
|---|---|
| Subject | Re: Startup PANIC on standby promotion due to zero-filled WAL segment |
| Date | |
| Msg-id | aUuAbs_j2ifwvkky@paquier.xyz |
| In response to | Re: Startup PANIC on standby promotion due to zero-filled WAL segment (Alena Vinter <dlaaren8@gmail.com>) |
| Responses | Re: Startup PANIC on standby promotion due to zero-filled WAL segment |
| List | pgsql-hackers |
On Tue, Dec 23, 2025 at 08:49:20PM +0700, Alena Vinter wrote:
> Michael, I left my pipeline running the TAP test until it failed — and
> after some time, it did fail. I then changed the test slightly, and simply
> by adding a short sleep, I was able to reproduce the same failure more
> reliably. Moreover, attempting to restart the standby server after a failed
> promotion triggers startup PANIC again.

This is a better argument, yes. ProcessInterrupts() is just a way to force the WAL receiver to do nothing. We could see the same if a WAL receiver repeatedly fails a palloc() or another allocation, shutting it down before it is able to stream any changes. We could also have a test with an injection point that forces an error based on a specific timeline number, or something like that.

Hmm. As in the case where the WAL receiver is not able to connect to a primary, shouldn't we prevent the promotion request from being processed at all? So while you have your finger on something here, I don't think that your suggested solution is a good or correct one. It sounds to me that the startup process assumes that the WAL receiver is doing some work, then the promotion request comes in and we assume that it is OK to go through with the promotion, while we should obviously not do so, because the WAL receiver has streamed zero contents from TLI 2.

It sounds to me that we should let the startup process know that something is wrong with the WAL receiver, meaning that it may be up to the WAL receiver to save some information in shared memory so that the startup process does not allow the promotion to go through at all.

--
Michael
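As a rough illustration of the last idea only, here is a minimal standalone C sketch of a WAL receiver publishing its progress so that a promotion check can refuse to proceed. It is not PostgreSQL code: the struct `WalRcvProgressSketch` and all `sketch_*` functions are hypothetical names invented for this example, the "shared memory" is a plain struct, and the refusal rule (no bytes streamed on the target timeline plus a reported failure) is an assumption about what such a check might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for state a WAL receiver could publish in shared
 * memory.  A real implementation would need locking around every access;
 * this sketch is single-threaded and omits it.
 */
typedef struct WalRcvProgressSketch
{
    uint32_t tli;            /* timeline the receiver was asked to stream */
    uint64_t bytes_streamed; /* bytes received on that timeline so far */
    bool     failed;         /* receiver reported an error and stopped */
} WalRcvProgressSketch;

/* Receiver side: reset the state when streaming starts on a timeline. */
void
sketch_receiver_start(WalRcvProgressSketch *st, uint32_t tli)
{
    assert(st != NULL);
    st->tli = tli;
    st->bytes_streamed = 0;
    st->failed = false;
}

/* Receiver side: account for data that actually arrived. */
void
sketch_receiver_received(WalRcvProgressSketch *st, uint64_t nbytes)
{
    assert(st != NULL);
    st->bytes_streamed += nbytes;
}

/* Receiver side: record that the receiver is going away after an error. */
void
sketch_receiver_failed(WalRcvProgressSketch *st)
{
    assert(st != NULL);
    st->failed = true;
}

/*
 * Startup side: assumed rule for this sketch.  Refuse promotion when the
 * receiver failed on the given timeline without streaming anything from it.
 */
bool
sketch_promotion_allowed(const WalRcvProgressSketch *st, uint32_t tli)
{
    assert(st != NULL);
    if (st->tli == tli && st->failed && st->bytes_streamed == 0)
        return false;
    return true;
}
```

With this rule, a receiver that starts on TLI 2 and fails before receiving anything makes `sketch_promotion_allowed(&st, 2)` return false, while one that received some data before failing does not block the promotion. Where the real boundary should lie is exactly the open question in this thread.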