Re: our buffer replacement strategy is kind of lame

From: Robert Haas
Subject: Re: our buffer replacement strategy is kind of lame
Date:
Msg-id: CA+TgmoaNQ4G_7ciLFMC4Ei=1ao-Ps=p-Ej0QFByjhiXmOmi5hA@mail.gmail.com
In reply to: Re: our buffer replacement strategy is kind of lame  (Simon Riggs <simon@2ndQuadrant.com>)
Responses: Re: our buffer replacement strategy is kind of lame  (Simon Riggs <simon@2ndQuadrant.com>)
List: pgsql-hackers
On Fri, Aug 12, 2011 at 4:36 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, Aug 12, 2011 at 5:05 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>
>> On
>> the other hand, the buffer manager has *no problem at all* trashing
>> the buffer arena if we're faulting in pages for an index scan rather
>> than a sequential scan.  If you manage to get all of sample_data into
>> memory (by running many copies of the above query in parallel, so that
>> each one allocates its own ring buffer and eventually pulls in all the
>> pages), and then run some query that probes an index which is
>> too large to fit in shared_buffers, it cheerfully blows the whole
>> sample_data table out without a second thought.  Had you sequentially
>> scanned a big table, of course, there would be some protection, but an
>> index scan can stomp all over everything with complete impunity.
>
> That's a good observation and I think we should do this
>
> * Make an IndexScan use a ring buffer once it has used 32 blocks. The
> vast majority won't do that, so we avoid overhead on the common path.
>
> * Make a BitmapIndexScan use a ring buffer when we know that the
> index is larger than 32 blocks. (Ignore the upper parts of the tree
> for that calculation.)
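
For concreteness, I imagine the index-scan half looking something like
this (xs_blocks_fetched and xs_strategy are invented fields on
IndexScanDesc; GetAccessStrategy() and ReadBufferExtended() are the
existing bufmgr entry points, assuming the usual includes):

    #define INDEXSCAN_RING_THRESHOLD    32

    /* Hypothetical helper: fetch one block on behalf of an index scan. */
    static Buffer
    index_fetch_block(IndexScanDesc scan, Relation rel, BlockNumber blkno)
    {
        /*
         * Once the scan has touched more than the threshold number of
         * blocks, stop letting it churn the main buffer pool and give
         * it a small ring instead.  NULL means the default strategy.
         */
        if (scan->xs_strategy == NULL &&
            ++scan->xs_blocks_fetched > INDEXSCAN_RING_THRESHOLD)
            scan->xs_strategy = GetAccessStrategy(BAS_BULKREAD);

        return ReadBufferExtended(rel, MAIN_FORKNUM, blkno,
                                  RBM_NORMAL, scan->xs_strategy);
    }

Conveniently, a BAS_BULKREAD ring is already 32 buffers (256kB at the
default BLCKSZ), so the threshold and the ring size would line up.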

We'd need to think about what happens to the internal pages of the
btree, the leaf pages, and then the heap pages from the underlying
relation; those probably shouldn't all be treated the same.  Also, I
think the tricky part is figuring out when to apply the optimization
in the first place.  Once we decide we need a ring buffer, a very
small one (like 32 blocks) is probably the way to go.  But it will be
a loser to apply the optimization to data sets that would otherwise
have fit in shared_buffers.  This is a classic case of the LRU/MRU
problem.  You want to evict buffers in LRU fashion until the working
set gets larger than you can cache; and then you want to switch to MRU
to avoid uselessly caching pages that you'll never manage to revisit
before they're evicted.
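
The sizing side could reuse the test initscan() already applies before
choosing BAS_BULKREAD for a seq scan, i.e. only fall back to a ring
once the relation clearly can't fit.  A sketch; whether NBuffers / 4
is the right cutoff for index scans is exactly the part I don't trust:

    BufferAccessStrategy strategy = NULL;

    /* Same heuristic initscan() uses before picking BAS_BULKREAD. */
    if (RelationGetNumberOfBlocks(rel) > NBuffers / 4)
        strategy = GetAccessStrategy(BAS_BULKREAD);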

The point of algorithms like Clock-Pro is to try to have the system
work that out on the fly, based on the actual workload, rather than
using heuristics.  I agree with you there's no convincing evidence
that Clock-Pro would be better for us; I mostly thought it was
interesting because it seems that the NetBSD and Linux guys find it
interesting, and they're managing much larger caches than the ones
we're dealing with.
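
For anyone who hasn't read the paper, the adaptive part boils down to
a feedback loop on the hot/cold split.  A toy sketch of just that loop
(this is not the real algorithm, which runs three clock hands over
both resident and nonresident pages):

    /* Toy sketch of Clock-Pro-style adaptation, not the real thing. */
    typedef struct
    {
        int     nbuffers;       /* buffers under management */
        int     cold_target;    /* adaptive size of the cold region */
    } ClockProCtl;

    /*
     * A cold page was re-referenced while still in its test period:
     * pages with short reuse distances are being classified cold, so
     * grow the cold region to give them time to prove themselves.
     */
    static void
    cold_hit_in_test(ClockProCtl *ctl)
    {
        if (ctl->cold_target < ctl->nbuffers)
            ctl->cold_target++;
    }

    /*
     * A cold page's test period expired with no re-reference: the cold
     * region is wasting space on pages that never come back, so shrink
     * it in favor of the hot region.
     */
    static void
    cold_test_expired(ClockProCtl *ctl)
    {
        if (ctl->cold_target > 0)
            ctl->cold_target--;
    }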

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

