Re: WIP: dynahash replacement for buffer table

From: Andres Freund
Subject: Re: WIP: dynahash replacement for buffer table
Msg-id: 20141014153120.GI9267@awork2.anarazel.de
In reply to: Re: WIP: dynahash replacement for buffer table  (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: WIP: dynahash replacement for buffer table
List: pgsql-hackers
On 2014-10-14 11:08:08 -0400, Robert Haas wrote:
> On Tue, Oct 14, 2014 at 10:47 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2014-10-14 09:30:58 -0400, Robert Haas wrote:
> >> I took the basic infrastructure from before and used it to replace the
> >> buffer table.  Patch is attached.
> >
> > Hm. Unless I missed something you pretty much completely eradicated
> > locks from the buffer mapping code? That'd be pretty cool, but it's also
> > scary.
> >
> > How confident are you that your conversion is correct? Not in the sense
> > that there aren't any buglets, but fundamentally.
> 
> It doesn't look particularly dangerous to me.  Famous last words.

> Basically, I think what we're doing right now is holding the buffer
> mapping lock so that the buffer can't be renamed under us while we're
> pinning it.

What I'm afraid of is that there are hidden assumptions about the
consistency provided by the mapping locks.

> If we don't do that, I think the only consequence is
> that, by the time we get the buffer pin, the buffer might no longer be
> the one we looked up.  So we need to recheck.  But assuming we do
> that, I don't see an issue.  In fact, it occurred to me while I was
> cobbling this together that we might want to experiment with that
> change independently of chash.  We already know (from the
> StrategyGetBuffer locking changes) that holding system-wide locks to
> prevent people from twaddling the state of individual buffers can be a
> loser.  This case isn't nearly as bad, because we're only pinning one
> buffer, rather than potentially all of them multiple times, and the
> lock we're holding only affects 1/128th of the buffers, not all of
> them.
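
The lookup, pin, recheck sequence described in the quoted text can be
sketched roughly as follows. This is only a minimal, self-contained
illustration using C11 atomics, not code from the patch or from bufmgr.c.
Every name in it (ToyBuffer, toy_lookup, toy_pin_if_tagged, toy_retag,
toy_read_buffer) is hypothetical, a linear scan stands in for the lock-free
hash lookup, and a per-buffer spinlock stands in for the buffer header lock.
The assumption is that a buffer may only be renamed while it is unpinned, so
once a pin has been taken and the tag rechecked, the tag cannot change.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_NBUFFERS    4
#define TOY_INVALID_TAG 0u      /* assumption: 0 means "buffer holds no page" */

/* Hypothetical stand-in for a buffer header. */
typedef struct ToyBuffer
{
    atomic_flag hdr_lock;       /* per-buffer spinlock */
    atomic_uint tag;            /* which page the buffer currently holds */
    int         pins;           /* pin count, protected by hdr_lock */
} ToyBuffer;

ToyBuffer toy_buffers[TOY_NBUFFERS];

void toy_hdr_lock(ToyBuffer *b)
{
    while (atomic_flag_test_and_set_explicit(&b->hdr_lock, memory_order_acquire))
        ;                       /* spin */
}

void toy_hdr_unlock(ToyBuffer *b)
{
    atomic_flag_clear_explicit(&b->hdr_lock, memory_order_release);
}

/* Put every buffer into a known empty, unpinned, unlocked state. */
void toy_reset(void)
{
    for (int i = 0; i < TOY_NBUFFERS; i++)
    {
        atomic_flag_clear(&toy_buffers[i].hdr_lock);
        atomic_store(&toy_buffers[i].tag, TOY_INVALID_TAG);
        toy_buffers[i].pins = 0;
    }
}

/* Lookup without any mapping lock: the answer may be stale on return. */
int toy_lookup(unsigned tag)
{
    if (tag == TOY_INVALID_TAG)
        return -1;
    for (int i = 0; i < TOY_NBUFFERS; i++)
        if (atomic_load(&toy_buffers[i].tag) == tag)
            return i;
    return -1;
}

void toy_unpin(int id)
{
    toy_hdr_lock(&toy_buffers[id]);
    assert(toy_buffers[id].pins > 0);
    toy_buffers[id].pins--;
    toy_hdr_unlock(&toy_buffers[id]);
}

/* Rename a buffer, but only if nobody has it pinned. */
bool toy_retag(int id, unsigned new_tag)
{
    ToyBuffer *b = &toy_buffers[id];
    bool       ok;

    toy_hdr_lock(b);
    ok = (b->pins == 0);
    if (ok)
        atomic_store(&b->tag, new_tag);
    toy_hdr_unlock(b);
    return ok;
}

/* Pin first, then recheck the tag; drop the pin if the buffer was renamed. */
bool toy_pin_if_tagged(int id, unsigned tag)
{
    ToyBuffer *b = &toy_buffers[id];

    toy_hdr_lock(b);
    b->pins++;
    toy_hdr_unlock(b);

    if (atomic_load(&b->tag) == tag)
        return true;            /* pinned, so it can no longer be renamed */

    toy_unpin(id);              /* stale lookup result */
    return false;
}

/* Lookup, pin, recheck; retry if the buffer was renamed in between. */
int toy_read_buffer(unsigned tag)
{
    for (;;)
    {
        int id = toy_lookup(tag);

        if (id < 0)
            return -1;          /* not in the table; caller would load it */
        if (toy_pin_if_tagged(id, tag))
            return id;
    }
}
```

A caller would use toy_read_buffer(tag) and later toy_unpin(id). The retry
loop only spins again when a rename slipped in between the lookup and the
pin, which is the case the recheck exists to catch.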

Also, IIRC, at least Linux has targeted wakeup/time transfer logic when
using semaphores, and doesn't for spinlocks. So if you don't happen to
sleep while the lwlock's spinlock is held, the result will be better.
The catch is that we'll frequently sleep while holding it for very
frequently taken locks...

> The other application I had in mind for chash was SLRU stuff.  I
> haven't tried to write the code yet, but fundamentally, the same
> considerations apply there.  You do the lookup locklessly to find out
> which buffer is thought to contain the SLRU page you want, but by the
> time you lock the page the answer might be out of date.  Assuming that
> this doesn't happen many times in a row and that lookups are
> relatively cheap, you could get rid of having any centralized lock for
> SLRU, and just have per-page locks.

Hm. I have to admit I haven't really looked enough into the SLRU code to
judge this. My impression so far was that the locking itself wasn't the
problem in most scenarios - that's not what you've seen?

> I'm not quite sure what fundamental dangers you're thinking about
> here

Oh, only in the context of the bufmgr.c conversion. Not more
generally. I agree that a lock-free hashtable is something quite
worthwhile to have.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


