Re: BUG #16199: pg_restore stuck on interrupts

From Tom Lane
Subject Re: BUG #16199: pg_restore stuck on interrupts
Date
Msg-id 9845.1578503615@sss.pgh.pa.us
In reply to BUG #16199: pg_restore stuck on interrupts  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #16199: pg_restore stuck on interrupts  (Raúl Marín <admin@rmr.ninja>)
List pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> We are seeing stuck pg_restore processes in several of our CI servers, both
> with PG10 (10.2) and PG11 (11.5).

You didn't actually say, but you must be interrupting parallel restores
with SIGINT or the like?

> I have some extra processes with the same issue  (7 full stacks out of 20,
> the others are garbage) and, from what I see, they all have in common that
> the process has received a signal while it was doing a memory operation,
> either a malloc or a free:

Yeah.  Ugh :-(

> I think it would be safer to use a similar approach to other processes, that
> is use the handler to only enable a global flag and check that in the main
> loop, but I'm having a hard time locating what the proper place to check the
> flag would be.

I think the odds of that being an improvement are minimal --- you'd be
trading a risk of failure during exit for a risk of not exiting (in any
timely fashion) in the first place.

sigTermHandler tries to be safe to run in a signal context, but I'm
afraid we didn't think hard about what exit() might call.  The way
I'd be inclined to fix this is to call _exit() instead of exit(),
and the heck with what any atexit handlers think.  Can you try that
and see if it improves matters for you?

            regards, tom lane


