Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests

From	Tom Lane
Subject	Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests
Date	16 June 2017, 03:06:41
Msg-id	9641.1497560801@sss.pgh.pa.us
In reply to	Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

I wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I think you're right.  So here's a theory:

>> 1. The ERROR mapping the DSM segment is just a case of the worker
>> losing a race, and isn't a bug.

> I concur that this is a possibility,

Actually, no, it isn't.  I tried to reproduce the problem by inserting
a sleep into ParallelWorkerMain, and could not.  After digging around
in the code, I realize that the leader process *cannot* exit the
parallel query before the workers start, at least not without hitting
an error first, which is not happening in these examples.  The reason
is that nodeGather cannot deem the query done until it's seen EOF on
each tuple queue, which it cannot see until each worker has attached
to and then detached from the associated shm_mq.

(BTW, this also means that the leader is frozen solid if a worker
process fails to start, but we knew that already.)
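
The argument above can be illustrated with a toy model. This is only a sketch of the reasoning, not PostgreSQL code: the names TupleQueue, QueueState and gather_is_done are hypothetical, and the real shm_mq and nodeGather logic is considerably more involved. It assumes, as described, that a queue reports EOF only after its worker has attached and then detached.

```python
from enum import Enum, auto


class QueueState(Enum):
    UNATTACHED = auto()  # worker has not attached (or never started)
    ATTACHED = auto()    # worker attached and may still send tuples
    DETACHED = auto()    # worker attached, then detached: reader sees EOF


class TupleQueue:
    """Toy stand-in for one per-worker tuple queue (hypothetical)."""

    def __init__(self):
        self.state = QueueState.UNATTACHED

    def attach(self):
        if self.state is not QueueState.UNATTACHED:
            raise RuntimeError("queue already attached")
        self.state = QueueState.ATTACHED

    def detach(self):
        # EOF is only reachable by passing through ATTACHED first.
        if self.state is not QueueState.ATTACHED:
            raise RuntimeError("cannot detach before attaching")
        self.state = QueueState.DETACHED

    def at_eof(self):
        return self.state is QueueState.DETACHED


def gather_is_done(queues):
    """The leader may finish only once every queue has reported EOF."""
    return all(q.at_eof() for q in queues)


queues = [TupleQueue(), TupleQueue()]
queues[0].attach()
queues[0].detach()
print(gather_is_done(queues))  # False: the second worker never attached

queues[1].attach()
queues[1].detach()
print(gather_is_done(queues))  # True
```

In this model a worker that never starts leaves its queue UNATTACHED forever, so gather_is_done never becomes true, which corresponds to the frozen-leader case mentioned above.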

So we still don't know why lorikeet is sometimes reporting "could not map
dynamic shared memory segment".  It's clear though that once that happens,
the current code has no prayer of recovering cleanly.  It looks from
lorikeet's logs like there is something that is forcing a timeout via
crash after ~150 seconds, but I do not know what that is.

        regards, tom lane
