Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process

From: Thomas Munro
Subject: Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process
Date:
Msg-id: CAEepm=0-kK3CLhX4zPRLBzvotTvxeYO_Eiwcz0R=4HZnpNugvw@mail.gmail.com
In reply to: Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: [HACKERS] PATCH: Keep one postmaster monitoring pipe per process  (Thomas Munro <thomas.munro@enterprisedb.com>)
List: pgsql-hackers
On Sat, Nov 17, 2018 at 5:56 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
> > [ 0001-Add-WL_EXIT_ON_PM_DEATH-pseudo-event-v4.patch ]
>
> I took a quick look through this.  I have no objection to the idea of
> letting the latch infrastructure do the proc_exit(1), but I'm wondering
> why this is in the thread that it's in.  Is there any remaining connection
> to the original complaint about BSDen not coping well with lots of
> processes waiting on the same pipe?

It turned into a general clean-up of inconsistent postmaster death
handling after the one-pipe-per-backend proposal was rejected.
Perhaps I should have started a new thread, but it does in fact
address most cases of processes calling PostmasterIsAlive()
frequently, which was part of the original complaint.  That falls into
the last-mentioned category from this paragraph of the commit message:

  Repair all code that was previously ignoring postmaster death completely,
  or requesting the event but ignoring it, or requesting the event but then
  doing an unconditional PostmasterIsAlive() call every time through its
  event loop (which is an expensive syscall on platforms for which we don't
  have USE_POSTMASTER_DEATH_SIGNAL support).
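
To make the idiom change concrete, here is a minimal sketch of a
worker's event loop before and after.  The names naptime_ms,
WAIT_EVENT_MY_WORKER_MAIN and do_work() are made-up placeholders, and
the usual backend headers (postgres.h, miscadmin.h, storage/latch.h,
pgstat.h) are assumed:

    /* Before: request WL_POSTMASTER_DEATH, then check the result flag
     * and/or make an extra PostmasterIsAlive() syscall every time. */
    for (;;)
    {
        int         rc;

        rc = WaitLatch(MyLatch,
                       WL_LATCH_SET | WL_TIMEOUT | WL_POSTMASTER_DEATH,
                       naptime_ms, WAIT_EVENT_MY_WORKER_MAIN);
        ResetLatch(MyLatch);
        if ((rc & WL_POSTMASTER_DEATH) || !PostmasterIsAlive())
            proc_exit(1);
        do_work();
    }

    /* After: WL_EXIT_ON_PM_DEATH makes the latch code call
     * proc_exit(1) itself, so the loop no longer has to think about
     * postmaster death at all. */
    for (;;)
    {
        (void) WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                         naptime_ms, WAIT_EVENT_MY_WORKER_MAIN);
        ResetLatch(MyLatch);
        do_work();
    }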

Other potential improvements that could be made, on both our side and
the kernel side:

* We should probably be smarter about how often we call
PostmasterIsAlive() in the recovery loop, which is the only remaining
case like that (because it's a busy loop, given enough WAL to chew on,
so there is no waiting primitive that would report pipe readiness).
One possible throttling approach is sketched after this list.

* It would be nice if more kernels supported parent death signals; a
Linux-style sketch also follows below.

* Based on this report, those kernels really should sort out their
wakeup logic for duplicated pipe fds.  I wonder if other multi-process
applications have the same problem.  Our logger pipe, and maybe the
proposed checkpointer fsync pipe, could be affected too; I'm not sure.
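
On the first point, a cheap throttling scheme would be something like
this (purely a sketch; the counter and the interval of 1024 are
arbitrary):

    /*
     * In a busy loop with no wait primitive, only pay for the
     * PostmasterIsAlive() syscall occasionally instead of on every
     * iteration.
     */
    static uint32   loops = 0;

    if (++loops % 1024 == 0 && !PostmasterIsAlive())
        proc_exit(1);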
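
On the second point, the Linux facility we already use under
USE_POSTMASTER_DEATH_SIGNAL looks roughly like this standalone demo
(the signal choice and error handling are illustrative only; FreeBSD
offers a similar procctl(PROC_PDEATHSIG_CTL) interface):

    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/prctl.h>
    #include <unistd.h>

    int
    main(void)
    {
        pid_t       parent = getpid();

        if (fork() == 0)
        {
            /* Child: ask the kernel for SIGTERM when the parent exits. */
            if (prctl(PR_SET_PDEATHSIG, SIGTERM) < 0)
            {
                perror("prctl");
                _exit(1);
            }
            /* Close the race: parent may have exited before the prctl. */
            if (getppid() != parent)
                raise(SIGTERM);
            pause();            /* killed by SIGTERM when parent exits */
            _exit(0);
        }
        sleep(1);               /* parent exits here; child is signalled */
        return 0;
    }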

> I'd advise making the code in places like this look like
>
>                 (void) WaitLatch(MyLatch, ...);
>
> Otherwise, you are likely to draw complaints about "return value is
> sometimes ignored" from Coverity and other static analyzers.  The
> (void) cast makes it explicit that you're intentionally ignoring
> the result value in this one place.

Done.

> Since exit_on_postmaster_death = true is going to be the normal case,
> it seems a bit unfortunate that we have to incur this looping every time
> we go through WaitEventSetWait.  Can't we put the handling of this in some
> spot where it only gets executed when we've detected WL_POSTMASTER_DEATH,
> like right after the PostmasterIsAliveInternal calls?
>
>             if (!PostmasterIsAliveInternal())
>             {
> +               if (set->exit_on_postmaster_death)
> +                   proc_exit(1);
>                 occurred_events->fd = PGINVALID_SOCKET;
>                 occurred_events->events = WL_POSTMASTER_DEATH;
>                 occurred_events++;
>                 returned_events++;
>             }

I was probably trying to avoid repeating the code in each of the 3
implementations of that function (poll, epoll, win32).  But there is
already other similar duplication there, and I agree your way is
better.  Done that way.
Done that way.
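
Just to illustrate what I meant, a hypothetical shared helper (the
name ReportPostmasterDeath is made up; the patch as posted simply
repeats those few lines in each implementation) might have looked
like:

    /*
     * Hypothetical: called by each WaitEventSetWaitBlock()
     * implementation once PostmasterIsAliveInternal() has returned
     * false.  Returns the number of events filled in.
     */
    static inline int
    ReportPostmasterDeath(WaitEventSet *set, WaitEvent *occurred_events)
    {
        if (set->exit_on_postmaster_death)
            proc_exit(1);
        occurred_events->fd = PGINVALID_SOCKET;
        occurred_events->events = WL_POSTMASTER_DEATH;
        return 1;
    }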

Thanks for the review.

-- 
Thomas Munro
http://www.enterprisedb.com
