Re: sinval synchronization considered harmful

Поиск
Список
Период
Сортировка
От Noah Misch
Тема Re: sinval synchronization considered harmful
Дата
Msg-id 20110726203804.GB17857@tornado.leadboat.com
обсуждение исходный текст
Ответ на Re: sinval synchronization considered harmful  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: sinval synchronization considered harmful
Список pgsql-hackers
On Tue, Jul 26, 2011 at 01:36:26PM -0400, Robert Haas wrote:
> On Mon, Jul 25, 2011 at 6:24 PM, Noah Misch <noah@2ndquadrant.com> wrote:
> > On Fri, Jul 22, 2011 at 03:54:03PM -0400, Robert Haas wrote:
> >> On Fri, Jul 22, 2011 at 3:28 PM, Noah Misch <noah@2ndquadrant.com> wrote:
> >> > This is attractive, and I don't see any problems with it.  (In theory, you could
> >> > hit a case where the load of resetState gives an ancient "false" just as the
> >> > counters wrap to match.  Given that the wrap interval is 1000000x as long as the
> >> > reset interval, I'm not worried about problems on actual silicon.)
> >>
> >> It's actually 262,144 times as long - see MSGNUMWRAPAROUND.
> >
> > Ah, so it is.
> >
> >> It would be pretty easy to eliminate even the theoretical possibility
> >> of a race by getting rid of resetState altogether and using nextMsgNum
> >> = -1 to mean that.  Maybe I should go ahead and do that.
> >
> > Seems like a nice simplification.
> 
> On further reflection, I don't see that this helps: it just moves the
> problem around.

Alas, yes.

> With resetState as a separate variable, nextMsgNum is
> never changed by anyone other than the owner, so we can never have a
> stale load.

It's also changed when SICleanupQueue() decides to wrap everyone.  This could
probably be eliminated by using uint32 and letting overflow take care of wrap
implicitly, but ...

> But if we overload nextMsgNum to also indicate whether
> our state has been reset, then there's a race between when we load
> nextMsgNum and when we load maxMsgNum (instead of code I posted
> previously, which has a race between when we load resetState and when
> we load maxMsgNum).  Now, as you say, it seems really, really
> difficult to hit that in practice, but I don't see a way of getting
> rid of the theoretical possibility without either (1) a spinlock or
> (2) a fence.  (Of course, on x86, the fence could be optimized down to
> a compiler barrier.)

No new ideas come to mind, here.  We could migrate back toward your original
proposal of making the counter a non-wrapping uint64; Florian described some
nice techniques for doing atomic 64-bit reads on x86:

http://archives.postgresql.org/message-id/7C94C386-122F-4918-8624-A14352E8DBC5@phlo.org

I'm not personally acquainted with those approaches, but they sound promising.
Since the overlap between 32-bit installations and performance-sensitive
installations is all but gone, we could even just use a spinlock under 32-bit.

> I guess the question is "should we worry about that?".

I'd lean toward "no".  I share Tom's unease about writing off a race condition
as being too improbable, but this is quite exceptionally improbable.  On the
other hand, some of the fixes don't look too invasive.

-- 
Noah Misch                    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: sinval synchronization considered harmful
Следующее
От: Florian Pflug
Дата:
Сообщение: XMLATTRIBUTES vs. values of type XML