Re: [HACKERS] parallel.c oblivion of worker-startup failures

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: [HACKERS] parallel.c oblivion of worker-startup failures
Дата
Msg-id CA+TgmoYGDtSNE_Csw7v=fzeMKyX33ohFvVgH=i3-mCS+=EAOhg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [HACKERS] parallel.c oblivion of worker-startup failures  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: [HACKERS] parallel.c oblivion of worker-startup failures
Re: [HACKERS] parallel.c oblivion of worker-startup failures
Список pgsql-hackers
On Tue, Jan 23, 2018 at 11:15 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Jan 22, 2018 at 10:56 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> It does turn such cases into error but only at the end when someone
>> calls WaitForParallelWorkersToFinish.  If one tries to rely on it at
>> any other time, it won't work as I think is the case for Parallel
>> Create Index where Peter G. is trying to use it differently.  I am not
>> 100% sure whether Parallel Create index has this symptom as I have not
>> tried it with that patch, but I and Peter are trying to figure that
>> out.
>
> OK.  I've committed this patch and back-patched it to 9.6.
> Back-patching to 9.5 didn't looks simple because nworkers_launched
> doesn't exist on that branch, and it doesn't matter very much anyway
> since the infrastructure has no in-core users in 9.5.

I was just talking to Thomas, and one or the other of us realized that
this doesn't completely fix the problem.  I think that if a worker
fails, even by some unorthodox mechanism like an abrupt proc_exit(1),
after the point where it attached to the error queue, this patch is
sufficient to make that reliably turn into an error.  But what if it
fails before that - e.g. fork() failure?  In that case, this process
will catch the problem if the calling process calls
WaitForParallelWorkersToFinish, but does that actually happen in any
interesting cases?  Maybe not.

In the case of Gather, for example, we won't
WaitForParallelWorkersToFinish() until we've read all the tuples.  If
there's a tuple queue that does not yet have a sender, then we'll just
wait for it to get one, even if the sender fell victim to a fork
failure and is never going to show up.

We could *almost* fix this by having execParallel.c include a Barrier
in the DSM, similar to what I proposed for parallel CREATE INDEX.
When a worker starts, it attaches to the barrier; when it exits, it
detaches.  Instead of looping until the leader is done and all the
tuple queues are detached, Gather or Gather Merge could loop until the
leader is done and everyone detaches from the barrier.  But that would
require changes to the Barrier API, which isn't prepared to have the
caller alternate waiting with other activity like reading the tuple
queues, which would clearly be necessary in this case.  Moreover, it
doesn't work at all when parallel_leader_participation=off, because in
that case we'll again just wait forever for some worker to show up,
and if none of them do, then we're hosed.

One idea to fix this is to add a new function in parallel.c with a
name like CheckForFailedWorkers() that Gather and Gather Merge call
just before they WaitLatch().  If a worker fails to start, the
postmaster will send SIGUSR1 to the leader, which will set the latch,
so we can count on the function being run after every worker death.
The function would go through and check each worker for which
pcxt->worker[i].error_mqh != NULL && !pcxt->any_message_received[i] to
see whether GetBackgroundWorkerPid(pcxt->worker[i].bgwhandle, &pid) ==
BGWH_STOPPED.  If so, ERROR.

This lets us *run* for arbitrary periods of time without detecting a
fork failure, but before we *wait* we will notice.  Getting more
aggressive than that sounds expensive, but I'm open to ideas.

If we adopt this approach, then Peter's patch could probably make use
of it, too.  It's a little tricky because ConditionVariableSleep()
tries not to return spuriously, so we'd either have to change that or
stick CheckForFailedWorkers() into that function's loop.  I suspect
the former is a better approach.  Or maybe that patch should still use
the Barrier approach and avoid all of this annoyance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: [HACKERS] Support for Secure Transport SSL library on macOS asOpenSSL alternative
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)