Re: Race conditions with checkpointer and shutdown

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Race conditions with checkpointer and shutdown
Дата
Msg-id 20190417000536.vxc6psciek4tntsx@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: Race conditions with checkpointer and shutdown  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Race conditions with checkpointer and shutdown  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi,

On 2019-04-16 18:59:37 -0400, Robert Haas wrote:
> On Tue, Apr 16, 2019 at 6:45 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Do we need to think harder about establishing rules for multiplexed
> > use of the process latch?  I'm imagining some rule like "if you are
> > not the outermost event loop of a process, you do not get to
> > summarily clear MyLatch.  Make sure to leave it set after waiting,
> > if there was any possibility that it was set by something other than
> > the specific event you're concerned with".
>
> Hmm, yeah.  If the latch is left set, then the outer loop will just go
> through an extra and unnecessary iteration, which seems fine.  If the
> latch is left clear, then the outer loop might miss a wakeup intended
> for it and hang forever.

Arguably that's a sign that the latch using code in the outer loop(s) isn't
written correctly? If you do it as:

while (true)
{
    CHECK_FOR_INTERRUPTS();

    ResetLatch(MyLatch);

    if (work_needed)
    {
        Plenty();
        Code();
        Using(MyLatch);
    }
    else
    {
        WaitLatch(MyLatch);
    }
}

I think that's not a danger? I think the problem really is that we
suggest doing that WaitLatch() unconditionally:

 * The correct pattern to wait for event(s) is:
 *
 * for (;;)
 * {
 *       ResetLatch();
 *       if (work to do)
 *           Do Stuff();
 *       WaitLatch();
 * }
 *
 * It's important to reset the latch *before* checking if there's work to
 * do. Otherwise, if someone sets the latch between the check and the
 * ResetLatch call, you will miss it and Wait will incorrectly block.
 *
 * Another valid coding pattern looks like:
 *
 * for (;;)
 * {
 *       if (work to do)
 *           Do Stuff(); // in particular, exit loop if some condition satisfied
 *       WaitLatch();
 *       ResetLatch();
 * }

Obviously there's the issue that a lot of latch using code isn't written
that way - but I also don't think there's that many latch using code
that then also uses latch. Seems like we could fix that.  While it has
obviously dangers of not being followed, so does the
'always-set-latch-unless-outermost-loop' approach.

I'm not sure I like the idea of incurring another unnecessary SetLatch()
call for most latch using places.

I guess there's a bit bigger danger of taking longer to notice
postmaster-death. But I'm not sure I can quite see that being
problematic - seems like all we should incur is another cycle through
the loop, as the latch shouldn't be set anymore.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Race conditions with checkpointer and shutdown
Следующее
От: Andres Freund
Дата:
Сообщение: Re: Race conditions with checkpointer and shutdown