Re: [HACKERS] parallel.c oblivion of worker-startup failures
From | Peter Geoghegan
---|---
Subject | Re: [HACKERS] parallel.c oblivion of worker-startup failures
Date |
Msg-id | CAH2-Wz=TnL4xP4cuEqQy-bCzQJoyKR3Ec3dsMbHPvagroY=isw@mail.gmail.com
In response to | Re: [HACKERS] parallel.c oblivion of worker-startup failures (Robert Haas <robertmhaas@gmail.com>)
Responses | Re: [HACKERS] parallel.c oblivion of worker-startup failures
List | pgsql-hackers
On Wed, Jan 24, 2018 at 12:05 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> In Thomas's test case, he's using 4 workers, and if even one of those
> manages to start, then it'll probably do so after any fork failures
> have already occurred, and the error will be noticed when that process
> sends its first message to the leader through the error queue, because
> it'll send PROCSIG_PARALLEL_MESSAGE via all the queues. If all of the
> workers fail to start, then that doesn't help. But it still manages
> to detect the failure in that case because it reaches
> WaitForParallelWorkersToFinish, which we just patched.
>
> But how does that happen, anyway? Shouldn't it get stuck waiting for
> the tuple queues to which nobody's attached yet? The reason it
> doesn't is because ExecParallelCreateReaders() calls shm_mq_set_handle()
> for each queue, which causes the tuple queues to be regarded as detached
> if the worker fails to start. A detached tuple queue, in general, is
> not an error condition: it just means that worker has no more tuples.

This explains all the confusion. Amit told me that using a tuple queue made all the difference here. Even still, it seemed surprising that we'd rely on that from such a long distance away, from within nodeGather.c.

> I guess that works, but it seems more like blind luck than good
> design. Parallel CREATE INDEX fails to be as "lucky" as Gather
> because there's nothing in parallel CREATE INDEX that lets it skip
> waiting for a worker which doesn't start -- and in general it seems
> unacceptable to impose a coding requirement that all future parallel
> operations must fail to notice the difference between a worker that
> completed successfully and one that never ran at all.

+1.

> If we made the Gather node wait only for workers that actually reached
> the Gather -- either by using a Barrier or by some other technique --
> then this would be a lot less fragile. And the same kind of technique
> would work for parallel CREATE INDEX.
The use of a barrier has problems of its own [1], though, one of which is unique to the parallel_leader_participation=off case. I thought that you yourself agreed with this [2] -- do you?

Another argument against using a barrier in this way is that it seems like way too much mechanism to solve a simple problem. Why should a client of parallel.h not be able to rely on nworkers_launched (perhaps only after verifying that it can be trusted)? That seems like a pretty reasonable requirement for clients to have of any framework for parallel imperative programming.

I think that we should implement "some other technique" instead of using a barrier. As I've said, Amit's WaitForParallelWorkersToAttach() idea seems like a promising "other technique".

[1] https://www.postgresql.org/message-id/CAA4eK1+a0OF4M231vBgPr_0Ygg_BNmRGZLiB7WQDE-FYBSyrGg@mail.gmail.com
[2] https://www.postgresql.org/message-id/CA+TgmoaF8UA8v8hP=CcoqUc50pucPC8ABj-_yyC++yGggjWFsw@mail.gmail.com

--
Peter Geoghegan