Re: Clock sweep not caching enough B-Tree leaf pages?

From Andres Freund
Subject Re: Clock sweep not caching enough B-Tree leaf pages?
Msg-id 20140416075307.GC3906@awork2.anarazel.de
In response to Clock sweep not caching enough B-Tree leaf pages?  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Clock sweep not caching enough B-Tree leaf pages?  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
Hi,

It's good to see focus on this - some improvements around s_b are sorely
needed.

On 2014-04-14 10:11:53 -0700, Peter Geoghegan wrote:
> 1) Throttles incrementation of usage_count temporally. It becomes
> impossible to increment usage_count for any given buffer more
> frequently than every 3 seconds, while decrementing usage_count is
> totally unaffected.

I think this is unfortunately completely out of the question. For one, a
gettimeofday() for every buffer pin will become a significant performance
problem. Even the computation of the xact/stm start/stop timestamps
shows up pretty heavily in profiles today - and they are far less
frequent than buffer pins. And that's on x86 linux, where gettimeofday()
is implemented as something more lightweight than a full syscall.

The other significant problem I see with this is that it's not adaptive
to the actual throughput of buffers in s_b. In many cases there are
hundreds of clock sweep cycles through shared buffers in 3 seconds. By
only increasing the usagecount that often you've destroyed the little
semblance of a working LRU there is right now.
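
A toy model can make the adaptivity argument concrete. The sketch below is
not PostgreSQL code; it follows a single hot buffer through a simplified
clock sweep, and every name and parameter in it (simulate_hot_buffer,
pin_every_ms, sweep_every_ms, throttle_ms) is hypothetical, chosen only for
illustration. It assumes the sweep hand passes the buffer at a fixed
interval, decrements usage_count on each pass, evicts the buffer when it
finds the count at zero, and that the page is reloaded with usage_count 1 on
the next pin.

```python
def simulate_hot_buffer(duration_ms, pin_every_ms, sweep_every_ms,
                        throttle_ms, max_usage=5):
    """Count cache misses for one frequently pinned buffer (toy model).

    throttle_ms is the assumed minimum time between usage_count
    increments; 0 means every pin may increment, as without throttling.
    """
    resident = False   # is the page currently in the buffer pool?
    usage = 0          # simplified usage_count
    last_bump = 0      # time of the last usage_count increment
    misses = 0

    for now in range(duration_ms):
        # A backend pins the buffer.
        if now % pin_every_ms == 0:
            if not resident:
                # Miss: reload the page; a new buffer starts at 1.
                misses += 1
                resident = True
                usage = 1
                last_bump = now
            elif usage < max_usage and now - last_bump >= throttle_ms:
                usage += 1
                last_bump = now

        # The clock hand passes over the buffer.
        if resident and now % sweep_every_ms == 0:
            if usage == 0:
                resident = False   # chosen as victim
            else:
                usage -= 1

    return misses
```

With a pin every millisecond and a sweep pass every 30 ms (about a hundred
passes in 3 seconds), simulate_hot_buffer(9000, 1, 30, 0) reports only the
initial cold miss, while a 3000 ms throttle makes the same buffer miss again
after every second sweep pass, because decrements arrive far faster than the
throttled increments.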

It also wouldn't work well for situations with a fast-changing
workload >> s_b. If you have frequent queries that take a second or so
and access some data repeatedly (index nodes or whatnot) only increasing
the usagecount once will mean they'll continually fall back to disk access.

> 2) Has usage_count saturate at 10 (i.e. BM_MAX_USAGE_COUNT = 10), not
> 5 as before. ... . This step on its own would be assumed extremely
> counter-productive by those in the know, but I believe that other
> measures ameliorate the downsides. I could be wrong about how true
> that is in other cases, but then the case helped here isn't what you'd
> call a narrow benchmark.

I don't see which mechanisms you have suggested that counter this?

I think having more granular usagecount is a good idea, but I don't
think it can realistically be implemented with the current method of
choosing victim buffers. The number of cacheline misses around that is
already a major scalability limit; we surely can't make this even
worse. I think it'd be possible to get back to this if we had a better
bgwriter implementation.
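
One way to read the cacheline concern, sketched under simplifying
assumptions: each buffer header the clock hand inspects is a potential cache
miss, and the number of headers inspected before a victim is found grows with
the usage counts that have to be decremented first. The function below is a
hypothetical illustration (clock_sweep_victim is not a PostgreSQL function)
and it ignores pinning, locking and concurrent backends.

```python
def clock_sweep_victim(usage_counts, hand=0):
    """Run a simplified clock sweep over a list of usage counts.

    Returns (victim_index, headers_touched). The list is modified in
    place, as each visited non-zero count is decremented.
    """
    assert usage_counts, "need at least one buffer"
    n = len(usage_counts)
    touched = 0
    while True:
        touched += 1                      # one header inspected
        if usage_counts[hand] == 0:
            return hand, touched          # victim found
        usage_counts[hand] -= 1
        hand = (hand + 1) % n             # advance the clock hand
```

In this model, a pool of 1000 buffers that all sit at usage_count 5 costs
5001 header visits to find a victim; with every count at 10 the same search
costs 10001. A higher saturation value therefore raises the worst-case work
per victim unless the selection method itself changes.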

Greetings,

Andres Freund


