Re: Lockless StrategyGetBuffer() clock sweep

From: Amit Kapila
Subject: Re: Lockless StrategyGetBuffer() clock sweep
Msg-id CAA4eK1LnhEMNd2QWCLYVPQzMeRs+jdZFsuj+e_QHPdC3vYgveg@mail.gmail.com
In response to: Re: Lockless StrategyGetBuffer() clock sweep  (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: Lockless StrategyGetBuffer() clock sweep
List: pgsql-hackers
On Thu, Oct 30, 2014 at 12:39 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2014-10-29 14:18:33 -0400, Robert Haas wrote:
>
> > The interaction between this and the bgreclaimer idea is interesting.
> > We can't make popping the freelist lockless without somehow dealing
> > with the resulting A-B-A problem (namely, that between the time we
> > read &head->next and the time we CAS the list-head to that value, the
> > head might have been popped, another item pushed, and the original
> > list head pushed again).
>
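The A-B-A sequence described above can be replayed deterministically in
a single thread.  The following is only a minimal sketch using C11
atomics, not PostgreSQL code: the index-based freelist and the names
freelist_head, next_free, freelist_push, freelist_pop and aba_demo are
all hypothetical, chosen for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NBUFFERS     4
#define FREELIST_END (-1)

/* Hypothetical freelist: next_free[i] is the buffer after i on the list. */
static int next_free[NBUFFERS];
static _Atomic int freelist_head = FREELIST_END;

static void freelist_push(int buf)
{
    int old = atomic_load(&freelist_head);
    do {
        next_free[buf] = old;
    } while (!atomic_compare_exchange_weak(&freelist_head, &old, buf));
}

/* Naive lockless pop: read head, read head->next, CAS head to that value. */
static int freelist_pop(void)
{
    int head = atomic_load(&freelist_head);
    while (head != FREELIST_END &&
           !atomic_compare_exchange_weak(&freelist_head, &head, next_free[head]))
        ;
    return head;
}

static bool freelist_contains(int buf)
{
    for (int cur = atomic_load(&freelist_head); cur != FREELIST_END; cur = next_free[cur])
        if (cur == buf)
            return true;
    return false;
}

/*
 * Replays the problematic interleaving step by step.  Returns true if the
 * stale CAS succeeded and buffer 3 was silently dropped from the freelist.
 */
static bool aba_demo(void)
{
    atomic_store(&freelist_head, FREELIST_END);
    freelist_push(2);
    freelist_push(1);
    freelist_push(0);                       /* list: 0 -> 1 -> 2 */

    /* Backend A begins a pop: reads head and head->next, then stalls. */
    int a_head = atomic_load(&freelist_head);   /* 0 */
    int a_next = next_free[a_head];             /* 1 */

    /* Meanwhile: head popped, another item pushed, original head pushed. */
    int popped = freelist_pop();            /* list: 1 -> 2 */
    freelist_push(3);                       /* list: 3 -> 1 -> 2 */
    freelist_push(popped);                  /* list: 0 -> 3 -> 1 -> 2 */

    /* A resumes: head is 0 again, so the CAS succeeds with a stale next. */
    bool cas_ok = atomic_compare_exchange_strong(&freelist_head, &a_head, a_next);

    return cas_ok && !freelist_contains(3);
}
```

The compare-and-swap only checks that the head value is unchanged, not
that the list behind it is unchanged, which is why the stale successor
is accepted.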
> I think if we really feel the need to, we can circumvent the ABA problem
> here. But I'm not yet convinced that there's the need to do so.  I'm
> unsure that a single process that touches all the buffers at some
> frequency is actually a good idea on modern NUMA systems...
>
> I wonder if we could make a trylock for spinlocks work - then we could
> look at the freelist if the lock is free and just go to the clock cycle
> otherwise. My guess is that that'd be quite performant.  IIRC it's just
> the spinlock semaphore fallback that doesn't know how to do trylock...
>
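The trylock idea above (look at the freelist only if the lock is free,
otherwise go straight to the clock sweep) could look roughly like the
following.  This is a hedged sketch built on C11 atomic_flag, not the
PostgreSQL spinlock API; freelist_trylock, clock_sweep, get_buffer and
the array-based freelist are hypothetical names, and buffer pinning is
not modeled.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NBUFFERS  8     /* assumed power of two, so hand wraparound is harmless */
#define NO_BUFFER (-1)

static atomic_flag freelist_lock = ATOMIC_FLAG_INIT;
static int freelist[NBUFFERS];          /* protected by freelist_lock */
static int freelist_len = 0;            /* protected by freelist_lock */

static atomic_uint next_victim;         /* clock hand, advanced atomically */
static _Atomic int usage_count[NBUFFERS];

/* Returns true if the lock was acquired; never spins or sleeps. */
static bool freelist_trylock(void)
{
    return !atomic_flag_test_and_set_explicit(&freelist_lock, memory_order_acquire);
}

static void freelist_unlock(void)
{
    atomic_flag_clear_explicit(&freelist_lock, memory_order_release);
}

/* Lock-free clock sweep: each caller claims one hand position per step. */
static int clock_sweep(void)
{
    for (;;)
    {
        unsigned hand = atomic_fetch_add(&next_victim, 1) % NBUFFERS;
        int count = atomic_load(&usage_count[hand]);

        if (count == 0)
            return (int) hand;

        /* Decrement via CAS so the count never drops below zero. */
        atomic_compare_exchange_strong(&usage_count[hand], &count, count - 1);
    }
}

static int get_buffer(void)
{
    if (freelist_trylock())
    {
        int buf = NO_BUFFER;

        if (freelist_len > 0)
            buf = freelist[--freelist_len];
        freelist_unlock();
        if (buf != NO_BUFFER)
            return buf;
    }
    /* Lock was busy or the freelist was empty: fall back to the sweep. */
    return clock_sweep();
}
```

In this sketch a backend never waits on the freelist lock; under
contention it simply pays for a clock sweep instead.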
>
> > So even if bgreclaimer saves some work for
> > individual backends - avoiding the need for them to clock-sweep across
> > many buffers - it may not be worth it if it means taking a spinlock to
> > pop the freelist instead of using an atomics-driven clock sweep.
> > Considering that there may be a million plus buffers to scan through,
> > that's a surprising conclusion, but it seems to be where the data is
> > pointing us.
>
> I'm not really convinced of this yet either. It might just be that the
> bgreclaimer implementation isn't good enough. But to me that doesn't
> really change the fact that there's clear benefit in this patch - 

I have a feeling that this patch might also show some regression at
higher loads (for example scale_factor = 5000, shared_buffers = 8GB,
client_count = 128 or 256) for reasons similar to the bgreclaimer patch:
both reduce contention around the spinlock, but they move the
contention somewhere else.  I still need to collect data before
concluding anything; I am just waiting for your other patch (wait-free
LW_SHARED) to be committed.
I think once wait-free LW_SHARED is in, we can evaluate this patch
and, if required, check whether there is any interaction between
this and bgreclaimer.  However, if you want, I think this can be done
even separately from the wait-free LW_SHARED patch.

> even
> if we can make bgreclaimer beneficial the lockless scan will be good.
>
> My feeling is that making bg*writer* more efficient is more likely to be
> beneficial overall than introducing bgreclaimer. 

One idea could be to make bgwriter something similar to the autovacuum
launcher, which means that bgwriter could launch different workers
based on the kind of need.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
