Re: [HACKERS] ERROR: infinite recursion in proc_exit

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] ERROR: infinite recursion in proc_exit
Дата
Msg-id 2486.941911693@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] ERROR: infinite recursion in proc_exit  (Bruce Momjian <maillist@candle.pha.pa.us>)
Ответы Re: [HACKERS] ERROR: infinite recursion in proc_exit  (Bruce Momjian <maillist@candle.pha.pa.us>)
Список pgsql-hackers
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> Fix applied:

>     /*
>      * If proc_exit is called too many times something bad is happening, so
>      * exit immediately.  This is crafted in two if's for a reason.
>      */
>     if (proc_exit_inprogress == 9)
>         elog(ERROR, "infinite recursion in proc_exit");
>     if (proc_exit_inprogress >= 9)
>         goto exit;

That isn't going to make things any better, because it's still laboring
under the same basic oversight: elog(ERROR) does not return to the
caller, it returns control to the main loop.  Thus, having proc_exit
call elog(ERROR) is simply *guaranteed* to create a failure.  proc_exit
must not allow control to return anywhere, under any circumstances,
because it is used as the final recourse when things are too screwed up
to consider continuing.

Although it's not hard to remove this silliness from proc_exit itself,
I suspect that the real source of the problem is probably elog(ERROR)
being called by one of the on_shmem_exit or on_proc_exit routines that
proc_exit is supposed to call.  I don't think we can hunt down and
eliminate all possible paths where that could happen (even if we could
do it today, it's too likely that future code changes would make it
possible again).

I think what we must do instead is to fix things so that if an
on_shmem_exit/on_proc_exit routine elog's, control comes back to
proc_exit and we carry on exiting with the remaining
on_shmem_exit/on_proc_exit routines (*not* calling the one that
failed again, of course).

A sketch of the necessary changes is:

1. proc_exit_inprogress becomes a global, and we change the
setjmp-catching code in postgres.c so that if proc_exit_inprogress is
nonzero, it just calls proc_exit() immediately.  This allows proc_exit
to recover control after an on_shmem_exit/on_proc_exit routine fails via
elog().  (Note: for elog(FATAL), it's already true that elog passes
control straight off to proc_exit.)

2. proc_exit and shmem_exit should decrement on_proc_exit_index and
on_shmem_exit_index as they work through the lists of registered
routines --- in effect, each routine is removed from the list just
before it is called.  That prevents the same routine from being called
again if it fails.

3. Ensure that the final exit() code is nonzero if we cycled through
proc_exit more than once --- errors during proc_exit should always be
treated as the equivalent of an outright crash, forcing a postmaster
cleanup cycle, I think.

Before I try to implement this scheme, can anyone spot a problem with
it?
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: [HACKERS] Re: New version of psql
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: [HACKERS] ERROR: infinite recursion in proc_exit