Re: rare avl shutdown slowness (related to signal handling)

From: Tom Lane
Subject: Re: rare avl shutdown slowness (related to signal handling)
Msg-id: 12815.1428442357@sss.pgh.pa.us
In reply to: Re: rare avl shutdown slowness (related to signal handling)  (Qingqing Zhou <zhouqq.postgres@gmail.com>)
Responses: Re: rare avl shutdown slowness (related to signal handling)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
           Re: rare avl shutdown slowness (related to signal handling)  (Qingqing Zhou <zhouqq.postgres@gmail.com>)
List: pgsql-hackers

Qingqing Zhou <zhouqq.postgres@gmail.com> writes:
> I got another repro of the shutdown slowness (DEBUG5 output with
> verbose logs is attached).

> It gives a finer picture of what's going on:
> 1. Avl calls ereport("autovacuum launcher shutting down");
> 2. At the end of errfinish(), it honors a pending SIGINT;
> 3. The SIGINT handler longjmps to the start of the avl error handling;
> 4. The error handling continues and runs rebuild_database_list()
> (that's why we see a begin/commit pair);
> 5. In the main loop, it calls WaitLatch() (60 seconds);
> 6. Finally it calls ereport() again and then proc_exit().

> This looks like a general pattern - don't think *nix is immune. Notice
> that this ereport() is special as there is way to go back. So we can
> insert HOLD_INTERRUPTS() just before it.

> Thoughts?

That seems like (a) a hack, and (b) not likely to solve the problem
completely, unless you leave interrupts held throughout proc_exit(),
which would create all sorts of opportunities for corner case bugs
during on_proc_exit hooks.

I think changing the outer "for(;;)" to "while (!got_SIGTERM)" would
be a much safer fix.
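
To make the difference between the two loop forms concrete, here is a minimal sketch that models the sequence described above with plain setjmp/longjmp. It is not PostgreSQL code: simulate_launcher(), report_shutdown(), and every variable except the name got_SIGTERM are hypothetical stand-ins, and the "wait" is only a counter standing in for a WaitLatch() nap.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-ins for launcher state; this is a model, not PostgreSQL code. */
static jmp_buf error_recovery_point;
static volatile bool got_SIGTERM;        /* set when shutdown is requested */
static volatile bool interrupt_pending;  /* a SIGINT queued behind the log call */
static volatile int shutdown_reports;    /* times the shutdown message was emitted */
static volatile int wasted_waits;        /* waits entered after the first report */

/* Models the shutdown ereport(): emit the message, then service a pending interrupt. */
static void report_shutdown(void)
{
    shutdown_reports++;
    if (interrupt_pending)
    {
        interrupt_pending = false;
        longjmp(error_recovery_point, 1);   /* jump back to error handling */
    }
}

/*
 * Runs one simulated launcher lifetime and returns how many waits were
 * entered after the shutdown message had already been emitted.
 * loop_tests_flag = false models "for (;;)";
 * loop_tests_flag = true  models "while (!got_SIGTERM)".
 */
int simulate_launcher(bool loop_tests_flag)
{
    got_SIGTERM = false;
    interrupt_pending = false;
    shutdown_reports = 0;
    wasted_waits = 0;

    if (setjmp(error_recovery_point) != 0)
    {
        /* error recovery would run here, then fall through to the main loop */
    }

    while (loop_tests_flag ? !got_SIGTERM : true)
    {
        if (shutdown_reports > 0)
            wasted_waits++;             /* stands in for a nap of up to 60 s */

        if (!got_SIGTERM)
        {
            got_SIGTERM = true;         /* SIGTERM arrives during the wait */
            interrupt_pending = true;   /* and a SIGINT is left pending */
        }

        if (got_SIGTERM)
            break;                      /* the normal shutdown case */
    }

    report_shutdown();

    /* In this model both variants emit the message twice; only the wait differs. */
    assert(shutdown_reports == 2);
    return wasted_waits;
}
```

In this model, simulate_launcher(false) re-enters the loop after the jump and counts one wasted wait, while simulate_launcher(true) skips the loop and goes straight to the second shutdown report.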

It looks like there's a related risk associated with this bit:
    /* in emergency mode, just start a worker and go away */
    if (!AutoVacuumingActive())
    {
        do_start_worker();
        proc_exit(0);           /* done */
    }
 

If we get SIGHUP and see that autovacuum has been turned off,
we exit the main loop, but we don't set got_SIGTERM.  So if we
then get a similar error at the shutdown report, we'd not merely
waste some time, but actually incorrectly launch a child.
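
The SIGHUP scenario can be modelled the same way. The sketch below is again a simplified stand-in rather than the launcher source: simulate_config_disable(), the *_model helpers, and the flags are hypothetical names, and a plain return takes the place of proc_exit(0). It only illustrates the risk; it does not propose a fix.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-ins; this models the control flow only. */
static jmp_buf recovery_point;
static volatile bool autovacuum_active;       /* models AutoVacuumingActive() */
static volatile bool late_interrupt_pending;  /* interrupt queued behind the log call */
static volatile int workers_started;

/* Models do_start_worker(): just count the launch. */
static void do_start_worker_model(void)
{
    workers_started++;
}

/* Models the shutdown report: a pending interrupt jumps back to error handling. */
static void report_shutdown_model(void)
{
    if (late_interrupt_pending)
    {
        late_interrupt_pending = false;
        longjmp(recovery_point, 1);
    }
}

/*
 * Simulates a launcher that leaves its main loop because a config reload
 * turned autovacuum off. Returns the number of workers started.
 */
int simulate_config_disable(bool late_interrupt)
{
    autovacuum_active = true;
    late_interrupt_pending = false;
    workers_started = 0;

    if (setjmp(recovery_point) != 0)
    {
        /* error recovery would run here, then fall through */
    }

    /* emergency-mode check, mirroring the snippet quoted above */
    if (!autovacuum_active)
    {
        do_start_worker_model();
        return workers_started;     /* stands in for proc_exit(0) */
    }

    for (;;)
    {
        /* SIGHUP: config reloaded, autovacuum found disabled, leave the loop */
        autovacuum_active = false;
        late_interrupt_pending = late_interrupt;
        break;
    }

    /* The loop was left because of the config change, not because of SIGTERM. */
    assert(!autovacuum_active);

    report_shutdown_model();
    return workers_started;
}
```

With no late interrupt the model starts no worker. With one, the jump lands back above the emergency-mode check, which now sees autovacuum disabled and starts a worker that nobody asked for.
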
        regards, tom lane


