On 8/1/20 18:13, Tom Lane wrote:
> You didn't actually say, but you must be interrupting parallel restores
> with SIGINT or the like?
Yes. Our CI (Jenkins) aborts running jobs automatically on new pushes,
which is supposed to send a SIGTERM to the process group, and that group
includes the pg_restore process.
> sigTermHandler tries to be safe to run in a signal context, but I'm
> afraid we didn't think hard about what exit() might call. The way
> I'd be inclined to fix this is to call _exit() instead of exit(),
> and the heck with what any atexit handlers think. Can you try that
> and see if it improves matters for you?
Initially I didn't like this idea, since it means skipping the gnutls
cleanup, and modifying anything related to crypto is always scary. But
following the same reasoning, I trust that any good crypto library
shouldn't leak anything important because of a fast exit.
I'll set up some of the servers to use _exit() for a few days and see
whether that fixes it.
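A minimal sketch of the idea being tested, not the actual pg_restore
source: the names sigterm_handler, cleanup_at_exit and
child_exit_status_after_sigterm are hypothetical, and the atexit hook
only stands in for whatever cleanup a library might register. The
handler calls _exit(), which is async-signal-safe and skips atexit
handlers, where exit() would run them from signal context.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for cleanup that a library registers via atexit().
 * If it ever runs, the process ends with status 42 so the parent can tell.
 */
static void cleanup_at_exit(void)
{
    _exit(42);
}

/*
 * SIGTERM handler: leave immediately with _exit(), which is
 * async-signal-safe and does not run atexit handlers.
 */
static void sigterm_handler(int signo)
{
    (void) signo;
    _exit(1);
}

/*
 * Fork a child that installs the handler and the atexit hook, then
 * signals itself. Returns the child's exit status, or -1 on error.
 */
int child_exit_status_after_sigterm(void)
{
    pid_t pid = fork();
    int status;

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        struct sigaction sa;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sigterm_handler;
        sigemptyset(&sa.sa_mask);

        atexit(cleanup_at_exit);
        sigaction(SIGTERM, &sa, NULL);

        raise(SIGTERM);     /* the handler runs and does not return */
        _exit(99);          /* not reached if the handler was invoked */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In this sketch child_exit_status_after_sigterm() returns 1, meaning the
atexit hook was skipped. Replacing _exit(1) with exit(1) in the handler
would let the hook run, and the status would be 42 instead.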
Thanks!
Raúl Marín.