Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests


From	Robert Haas
Subject	Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests
Date	June 16, 2017, 00:09:48
Msg-id	CA+TgmoYx6mynFL5aDs7+xjZ01QrY8smp+Zr=5BxAseODZdZPWA@mail.gmail.com
In reply to	Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests
List	pgsql-hackers


On Thu, Jun 15, 2017 at 5:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Robert Haas <robertmhaas@gmail.com> writes:
>>> I think you're right.  So here's a theory:
>
>>> 1. The ERROR mapping the DSM segment is just a case of the worker
>>> losing a race, and isn't a bug.
>
>> I concur that this is a possibility,
>
> Actually, no, it isn't.  I tried to reproduce the problem by inserting
> a sleep into ParallelWorkerMain, and could not.  After digging around
> in the code, I realize that the leader process *can not* exit the
> parallel query before the workers start, at least not without hitting
> an error first, which is not happening in these examples.  The reason
> is that nodeGather cannot deem the query done until it's seen EOF on
> each tuple queue, which it cannot see until each worker has attached
> to and then detached from the associated shm_mq.

Oh.  That's sad.  It definitely has to wait for any tuple queues that
have been attached to be detached, but it would be better if it didn't
have to wait for processes that haven't even attached yet.
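
To make the distinction concrete, here is a toy model of the two
completion rules, for illustration only.  It is not PostgreSQL code:
QueueState, gather_done and gather_done_skipping_unattached are
hypothetical names, and the second rule ignores the question of what
happens to a worker that attaches after the leader has moved on.

```python
from enum import Enum, auto


class QueueState(Enum):
    """Simplified, assumed lifecycle of one worker's tuple queue."""
    NOT_ATTACHED = auto()  # worker has not attached to the queue yet
    ATTACHED = auto()      # worker attached and may still send tuples
    DETACHED = auto()      # worker detached, so the reader sees EOF


def gather_done(queues):
    """Rule described above: EOF must have been seen on every queue."""
    return all(state is QueueState.DETACHED for state in queues)


def gather_done_skipping_unattached(queues):
    """Hypothetical rule: wait only for queues that were attached."""
    return all(state is not QueueState.ATTACHED for state in queues)


# One worker has finished, the other has not even attached yet.
queues = [QueueState.DETACHED, QueueState.NOT_ATTACHED]
print(gather_done(queues))
print(gather_done_skipping_unattached(queues))
```

Under the first rule the leader stays blocked on the second worker;
under the second it would be free to finish.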

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
