On Fri, Aug 12, 2011 at 8:28 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Fri, Aug 12, 2011 at 1:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Aug 12, 2011 at 4:33 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>>> You're missing an important point. The SeqScan is measurably faster
>>> when using the ring buffer because of the effects of L2 caching on
>>> the buffers.
>>
>> I hadn't thought of that, but I think that's only true if the relation
>> won't fit in shared_buffers (or whatever portion of shared_buffers is
>> reasonably available, given the other activity on the machine). In
>> this particular case, it's almost 20% faster if the relation is all in
>> shared_buffers; I tested it. I think what's going on here is that
>> initscan() has a heuristic that tries to use a BufferAccessStrategy if
>> the relation is larger than 1/4 of shared_buffers. That's probably a
>> pretty good heuristic in general, but in this case I have a relation
>> which just so happens to be 27.9% of shared_buffers but will still
>> fit. As you say below, that may not be typical in real life, although
>> there are probably data warehousing systems where it's normal to have
>> only one big query running at a time.
>
> I think there are reasonable arguments to make
>
> * prefer_cache = off (default) | on, as a table-level storage parameter;
> =on will disable the use of BufferAccessStrategy
>
> * make cache_spoil_threshold a parameter, with default 0.25

Yeah, something like that might make sense. Of course, a completely
self-tuning system would be better, but I'm not sure we're going to
find one of those.
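
For concreteness, here is a rough sketch of how the size heuristic might
look with the two proposed knobs folded in. It is an illustration under
stated assumptions, not the actual initscan() code: the function name
use_ring_buffer and its arguments are made up for this example, and
prefer_cache and cache_spoil_threshold are the hypothetical parameters
from the proposal above, not existing settings.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch only: decide whether a sequential scan should use a
 * BufferAccessStrategy (ring buffer).
 *
 * rel_blocks is the relation size in blocks; nbuffers is the size of
 * shared_buffers in blocks.  prefer_cache and cache_spoil_threshold are
 * the hypothetical knobs discussed in this thread.
 */
bool
use_ring_buffer(long rel_blocks, long nbuffers,
                bool prefer_cache, double cache_spoil_threshold)
{
    /* The threshold is a fraction of shared_buffers. */
    assert(cache_spoil_threshold > 0.0 && cache_spoil_threshold <= 1.0);

    /* prefer_cache = on: never use the ring buffer for this table. */
    if (prefer_cache)
        return false;

    /* Otherwise use it only when the relation exceeds the threshold. */
    return (double) rel_blocks > (double) nbuffers * cache_spoil_threshold;
}
```

With the default of 0.25, a relation that is 27.9% of shared_buffers
still gets the ring buffer; raising the threshold to 0.30, or setting
prefer_cache = on for that table, would let it be cached normally.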
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company