Re: Missed check for too-many-children in bgworker spawning

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: Missed check for too-many-children in bgworker spawning
Дата
Msg-id CA+TgmoZ4ha+jv7o_NzvzecXLkZpxPEbswVB=wZAp9rLmBnqG6A@mail.gmail.com
обсуждение исходный текст
Ответ на Missed check for too-many-children in bgworker spawning  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Missed check for too-many-children in bgworker spawning  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-hackers
On Sun, Oct 6, 2019 at 1:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Over in [1] we have a report of a postmaster shutdown that seems to
> have occurred because some client logic was overaggressively spawning
> connection requests, causing the postmaster's child-process arrays to
> be temporarily full, and then some parallel query tried to launch a
> new bgworker process.  The postmaster's bgworker-spawning logic lacks
> any check for the arrays being full, so when AssignPostmasterChildSlot
> failed to find a free slot, kaboom!
>
> The attached proposed patch fixes this by making bgworker spawning
> include a canAcceptConnections() test.  That's perhaps overkill, since
> we really just need to check the CountChildren() total; but I judged
> that going through that function and having it decide what to test or
> not test was a better design than duplicating the CountChildren() test
> elsewhere.
>
> I'd first imagined also replacing the one-size-fits-all check
>
>     if (CountChildren(BACKEND_TYPE_ALL) >= MaxLivePostmasterChildren())
>         result = CAC_TOOMANY;
>
> with something like
>
>     switch (backend_type)
>     {
>         case BACKEND_TYPE_NORMAL:
>             if (CountChildren(backend_type) >= 2 * MaxConnections)
>                 result = CAC_TOOMANY;
>             break;
>         case BACKEND_TYPE_AUTOVAC:
>             if (CountChildren(backend_type) >= 2 * autovacuum_max_workers)
>                 result = CAC_TOOMANY;
>             break;
>         ...
>     }
>
> so as to subdivide the pool of child-process slots and prevent client
> requests from consuming slots meant for background processes.  But on
> closer examination that's not really worth the trouble, because this
> pool is already considerably bigger than MaxBackends; so even if we
> prevented a failure here we could still have bgworker startup failure
> later on when it tries to acquire a PGPROC.
>
> Barring objections, I'll apply and back-patch this soon.

I think it used to work this way -- not sure if it was ever committed
this way, but it at least did during development -- and we ripped it
out because somebody (Magnus?) pointed out that if you got close to
the connection limit, you could see parallel queries start failing,
and that would suck. Falling back to non-parallel seems more OK in
that situation than actually failing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: PATCH: Add uri percent-encoding for binary data
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: expressive test macros (was: Report test_atomic_ops() failuresconsistently, via macros)