Re: [HACKERS] Unportable implementation of background worker start

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] Unportable implementation of background worker start
Дата
Msg-id 17795.1492663813@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] Unportable implementation of background worker start  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: [HACKERS] Unportable implementation of background worker start
Re: [HACKERS] Unportable implementation of background worker start
Список pgsql-hackers
I wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> Tom Lane wrote:
>>> Hm.  Do you have a more-portable alternative?

>> I was thinking in a WaitEventSet from latch.c.

My first reaction was that that sounded like a lot more work than removing
two lines from maybe_start_bgworker and adjusting some comments.  But on
closer inspection, the slow-bgworker-start issue isn't the only problem
here.  On a machine that restarts select()'s timeout after an interrupt,
as (at least) HPUX does, the postmaster will actually never iterate
ServerLoop's loop except immediately after receiving a new connection
request.  The stream of interrupts from the autovac launcher is alone
sufficient to prevent the initial 60-second timeout from ever elapsing.
So if there are no new connection requests for awhile, none of the
housekeeping actions in ServerLoop get done.

Most of those actions are concerned with restarting failed background
tasks, which is something we could get by without --- it's unlikely
that those tasks would fail without causing a database-wide restart,
and then there really isn't going to be much need for them until at least
one new connection request has arrived.  But the last step in that loop is
concerned with touching the sockets and lock files to prevent aggressive
/tmp cleaners from removing them, and that's something that can't be let
slide, or we might as well not have the logic at all.

I've confirmed experimentally that on my HPUX box, a postmaster not
receiving new connections for an hour or more in fact fails to update
the mod times on the sockets and lock files.  So that problem isn't
hypothetical.

So it's looking to me like we do need to do something about this, and
ideally back-patch it all the way.  But WaitEventSet doesn't exist
before 9.6.  Do we have the stomach for back-patching that into
stable branches?
        regards, tom lane



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Masahiko Sawada
Дата:
Сообщение: Re: [HACKERS] Interval for launching the table sync worker
Следующее
От: Robert Haas
Дата:
Сообщение: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables