On Fri, Aug 12, 2011 at 1:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Aug 12, 2011 at 4:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> You're missing an important point. The SeqScan is measurably faster
>> when using the ring buffer because of the effects of L2 caching on
>> the buffers.
>
> I hadn't thought of that, but I think that's only true if the relation
> won't fit in shared_buffers (or whatever portion of shared_buffers is
> reasonably available, given the other activity on the machine). In
> this particular case, it's almost 20% faster if the relation is all in
> shared_buffers; I tested it. I think what's going on here is that
> initscan() has a heuristic that tries to use a BufferAccessStrategy if
> the relation is larger than 1/4 of shared_buffers. That's probably a
> pretty good heuristic in general, but in this case I have a relation
> which just so happens to be 27.9% of shared_buffers but will still
> fit. As you say below, that may not be typical in real life, although
> there are probably data warehousing systems where it's normal to have
> only one big query running at a time.

I think there are reasonable arguments for two changes:

* make prefer_cache = off (default) | on a table-level storage
  parameter, where prefer_cache = on disables the use of
  BufferAccessStrategy for that table
* make cache_spoil_threshold a parameter, with default 0.25

Considering the world of very large RAMs in which we now live, some
control over the above makes sense.

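
To make the proposal concrete, here is a rough sketch of how the two
settings might feed into the decision that initscan() makes today. It
is not a patch: the function name use_bulkread_strategy() and its
arguments are hypothetical, prefer_cache and cache_spoil_threshold do
not exist yet, and the real check in initscan() is assumed here to be a
plain comparison of the relation size against a fixed 1/4 of
shared_buffers.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical setting: fraction of shared_buffers above which a
 * sequential scan switches to a ring buffer. The default of 0.25
 * mirrors the 1/4 heuristic described earlier in the thread.
 */
static double cache_spoil_threshold = 0.25;

/*
 * Hypothetical helper, not a PostgreSQL function.
 *
 * rel_blocks   - size of the relation in blocks
 * nbuffers     - size of shared_buffers in blocks
 * prefer_cache - proposed per-table storage parameter
 *
 * Returns true if the scan should use a BufferAccessStrategy.
 */
static bool
use_bulkread_strategy(long rel_blocks, long nbuffers, bool prefer_cache)
{
    assert(nbuffers > 0);

    /* prefer_cache = on: always read through the main buffer pool. */
    if (prefer_cache)
        return false;

    /* Otherwise apply the size heuristic with a tunable threshold. */
    return (double) rel_blocks > (double) nbuffers * cache_spoil_threshold;
}
```

With the defaults, a relation that is 27.9% of shared_buffers gets the
ring buffer. Setting prefer_cache = on for that table, or raising
cache_spoil_threshold above 0.279, would keep it in shared_buffers.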
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services