Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module

Поиск
Список
Период
Сортировка
От Craig Ringer
Тема Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module
Дата
Msg-id CAGRY4nxsAe_1k_9g5b47orA0S011iBoHsXHFMH7cg7HV0O1bwQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Ответы Re: Use standard SIGHUP and SIGTERM handlers in autoprewarm module  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers


On Wed, Oct 7, 2020 at 8:39 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Wed, Oct 7, 2020 at 8:00 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Oct 6, 2020 at 11:41 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Tue, Oct 6, 2020 at 11:20 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > >
> > > +1 Or it's also ok to make each patch separately.
> > > Anyway, thanks!
> > >
> >
> > Thanks. +1 to have separate patches. I will post two separate patches
> > for autoprewarm and WalRcvShutdownHandler for further review. The
> > other two patches can be for startup.c and syslogger.c.


[ CC'd Robert Haas since he's the main name in interrupt.c, test_shm_mq/worker.c,  ]

src/test/modules/test_shm_mq/worker.c appears to do the right thing the wrong way - it has its own custom handler instead of using die() or SignalHandlerForShutdownRequest().

In contrast  src/test/modules/worker_spi/worker_spi.c looks plain wrong. Especially since it's quoted as an example of how to do things right. It won't respond to SIGTERM at all while it's executing a query from its queue, no matter how long that query takes or whether it blocks. It can inhibit even postmaster shutdown as a result.

I was going to lob off a quick patch to fix this by making both use quickdie() for SIGQUIT and die() for SIGTERM, but after reading for a bit I'm no longer sure what the right choice even is. I'd welcome some opinions.


The problem is that but interrupt.c and interrupt.h actually define and recommend different and simpler handlers for these jobs - ones that don't actually work properly for code that calls into Pg core and might rely on CHECK_FOR_INTERRUPTS(), InterruptsPending and ProcDiePending to properly respect SIGTERM.

And to add to the confusion the bgworker infra adds its own different default SIGTERM handler bgworker_die() that's weirdly in-between the interrupt.c and postgres.c signal handling.

So I'm no longer sure how the example code should even be fixed. I'm not convinced everyone using die() and quickdie() is good given they currently seem to be assumed to be mainly for user backends. Maybe wwe should move them to interrupt.c along with CHECK_FOR_INTERRUPTS(), ProcessInterrupts, etc and document them as for all database-connected or shmem-connected backends to use.

So in the medium term, interrupt.c's SignalHandlerForShutdownRequest() and SignalHandlerForCrashExit() should be combined with die() and quickdie(), integrating properly with CHECK_FOR_INTERRUPTS(), ProcessInterrupts(), etc. We can add a hook we call before we proc_exit() in response to ProcDiePending so backends can choose to mask it if there's a period during which they wish to defer responding to SIGTERM, but by default everything will respect SIGTERM -> die() sets ProcDiePending -> CHECK_FOR_INTERRUPTS() -> ProcessInterrupts() -> proc_exit() . interrupt.c's SignalHandlerForCrashExit() and SignalHandlerForShutdownRequest() become deprecated/legacy. We add a separate temporary handler that's installed by init.c for early SIGQUIT handling but document it as to be replaced after backends start properly. We'd delete the bgw-specific signal handlers and install die() and procdie() instead during StartBackgroundWorker - at least if the bgw is connecting to shmem or a database. interrupt.c's HandleMainLoopInterrupts() could be static inlined, and adopted in the bgworker examples and all the other places that currently do ConfigReloadPending / ProcessConfigFile() etc themselves.

It wouldn't be a clean sweep of consistent signal handling, given all the funky stuff we have in the checkpointer, walsender, etc. But I think it might help...

(And maybe I could even combine the various am_foo and is_bar globals we use to identify different sorts of backend, while doing such cleanups).

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: "unix_socket_directories" should be GUC_LIST_INPUT?
Следующее
От: Ian Lawrence Barwick
Дата:
Сообщение: Re: "unix_socket_directories" should be GUC_LIST_INPUT?