Re: Parallel query vs smart shutdown and Postmaster death

From: Thomas Munro
Subject: Re: Parallel query vs smart shutdown and Postmaster death
Msg-id: CA+hUKG+MF0G7f8UKvTWiGs4iFng5bA_jL8RT4X2WdhP+oE8gkg@mail.gmail.com
In response to: Parallel query vs smart shutdown and Postmaster death  (Thomas Munro <thomas.munro@gmail.com>)
Responses: Re: Parallel query vs smart shutdown and Postmaster death  (Robert Haas <robertmhaas@gmail.com>)
           Re: Parallel query vs smart shutdown and Postmaster death  (Arseny Sher <a.sher@postgrespro.ru>)
List: pgsql-hackers

On Mon, Feb 25, 2019 at 2:13 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> 1.  In a nearby thread, I misdiagnosed a problem reported[1] by Justin
> Pryzby (though my misdiagnosis is probably still a thing to be fixed;
> see next).  I think I just spotted the real problem he saw: if you
> execute a parallel query after a smart shutdown has been initiated,
> you wait forever in gather_readnext()!  Maybe parallel workers can't
> be launched in this state, but we lack code to detect this case?  I
> haven't dug into the exact mechanism or figured out what to do about
> it yet, and I'm tied up with something else for a bit, but I will come
> back to this later if nobody beats me to it.

Given smart shutdown's stated goal, namely that it "lets existing
sessions end their work normally", my questions are:

1.  Why does pmdie()'s SIGTERM case terminate parallel workers
immediately?  That aborts running parallel queries, so they
don't get to end their work normally.
2.  Why are new parallel workers not allowed to be started while in
this state?  That hangs future parallel queries forever, so they don't
get to end their work normally.
3.  Suppose we fix the above cases; should we do it for parallel
workers only (somehow), or for all bgworkers?  It's hard to say since
I don't know what all bgworkers do.

In the meantime, perhaps we should teach the postmaster to report this
case as a failure to launch in back-branches, so that at least
parallel queries don't hang forever?  Here's an initial sketch of a
patch like that: it gives you "ERROR:  parallel worker failed to
initialize" and "HINT:  More details may be available in the server
log." if you try to run a parallel query.  The HINT is right: the
server log says that a smart shutdown is in progress.  If that seems a
bit hostile, consider that any parallel queries that were running at
the moment the smart shutdown was requested have already been ordered
to quit; why should new queries started after that get a better deal?
Then perhaps we could do some more involved surgery on master that
achieves smart shutdown's stated goal here, and lets parallel queries
actually run?  Better ideas welcome.
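
To make the back-branch idea concrete, here is a toy model in C.  It is
not the patch referred to above and uses none of PostgreSQL's real
data structures: every type and function name here (ToyPmState,
ToyWorkerSlot, toy_postmaster_service_request, toy_leader_wait) is
hypothetical, and the bounded poll count merely stands in for "waits
forever".  It only illustrates the difference between silently ignoring
a launch request during smart shutdown and reporting it as a failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified postmaster states (not the real PMState). */
typedef enum { TOY_PM_RUN, TOY_PM_SMART_SHUTDOWN } ToyPmState;

/* Hypothetical worker slot states as seen by the leader process. */
typedef enum {
    TOY_WORKER_PENDING,   /* launch requested, nothing has happened yet */
    TOY_WORKER_STARTED,   /* worker is running */
    TOY_WORKER_FAILED     /* postmaster reported it will not launch */
} ToyWorkerStatus;

typedef struct { ToyWorkerStatus status; } ToyWorkerSlot;

/*
 * Postmaster side: service one pending launch request.  With
 * report_failure == false, a request made during smart shutdown is
 * silently ignored and the slot stays pending.  With
 * report_failure == true, the slot is marked failed so the leader can
 * raise an error instead of waiting.
 */
void
toy_postmaster_service_request(ToyPmState pm, ToyWorkerSlot *slot,
                               bool report_failure)
{
    assert(slot != NULL);

    if (slot->status != TOY_WORKER_PENDING)
        return;

    if (pm == TOY_PM_RUN)
        slot->status = TOY_WORKER_STARTED;
    else if (report_failure)
        slot->status = TOY_WORKER_FAILED;
    /* else: request ignored, slot remains TOY_WORKER_PENDING */
}

/*
 * Leader side: wait for the slot to leave the pending state, giving up
 * after max_polls rounds.  Returns the last status observed; a result
 * of TOY_WORKER_PENDING models the hang.
 */
ToyWorkerStatus
toy_leader_wait(ToyPmState pm, ToyWorkerSlot *slot,
                bool report_failure, int max_polls)
{
    assert(slot != NULL);

    for (int i = 0; i < max_polls && slot->status == TOY_WORKER_PENDING; i++)
        toy_postmaster_service_request(pm, slot, report_failure);

    return slot->status;
}
```

In this model, calling toy_leader_wait() with TOY_PM_SMART_SHUTDOWN and
report_failure set to false leaves the slot pending no matter how many
polls are allowed, while setting report_failure to true ends the wait
on the first poll with TOY_WORKER_FAILED, which is the point at which a
leader could raise an error.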

-- 
Thomas Munro
https://enterprisedb.com
