2011/1/19 Bruce Momjian <bruce@momjian.us>:
> Robert Haas wrote:
>> On Thu, Nov 11, 2010 at 2:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > Robert Haas <robertmhaas@gmail.com> writes:
>> >> Yeah.  For Kevin's case, it seems like we want the caching percentage
>> >> to vary not so much based on which table we're hitting at the moment
>> >> but on how much of it we're actually reading.
>> >
>> > Well, we could certainly take the expected number of pages to read and
>> > compare that to effective_cache_size.  The thing that's missing in that
>> > equation is how much other stuff is competing for cache space.  I've
>> > tried to avoid having the planner need to know the total size of the
>> > database cluster, but it's kind of hard to avoid that if you want to
>> > model this honestly.
>>
>> I'm not sure I agree with that. I mean, you could easily have a
>> database that is much larger than effective_cache_size, but only that
>> much of it is hot. Or, the hot portion could move around over time.
>> And for reasons of both technical complexity and plan stability, I
>> don't think we want to try to model that. It seems perfectly
>> reasonable to say that reading 25% of effective_cache_size will be
>> more expensive *per-page* than reading 5% of effective_cache_size,
>> independently of what the total cluster size is.
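
(To make that idea concrete, here is a toy standalone sketch of the kind of
interpolation being described -- this is not planner code, and the names
cached_page_cost / uncached_page_cost and the sample values below are just
placeholders for whatever costs we would actually blend between:)

    #include <stdio.h>

    /*
     * Toy illustration: the per-page cost rises as the expected number of
     * pages read approaches effective_cache_size, independently of the
     * total cluster size.
     */
    static double
    per_page_cost(double pages_to_read, double effective_cache_size,
                  double cached_page_cost, double uncached_page_cost)
    {
        double  frac = pages_to_read / effective_cache_size;

        if (frac > 1.0)
            frac = 1.0;     /* beyond the cache size, assume full cost */

        return cached_page_cost +
               frac * (uncached_page_cost - cached_page_cost);
    }

    int
    main(void)
    {
        /* reading 5% vs. 25% of a 100000-page effective_cache_size */
        printf("%g\n", per_page_cost(5000.0, 100000.0, 0.01, 4.0));
        printf("%g\n", per_page_cost(25000.0, 100000.0, 0.01, 4.0));
        return 0;
    }
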
>
> Late reply, but one idea is to have the executor store hit counts for
> later use by the optimizer. Only the executor knows how many pages it
> had to request from the kernel for a query. Perhaps getrusage could
> tell us how often we hit the disk.

AFAIK getrusage does not provide access to real I/O counters, only to
the filesystem-level ones. :-(
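
(For reference, the counters in question -- a minimal standalone sketch;
what ru_inblock/ru_oublock actually count is platform-dependent, and as
noted above they reflect filesystem-level accounting rather than true
device I/O:)

    #include <stdio.h>
    #include <sys/resource.h>

    int
    main(void)
    {
        struct rusage ru;

        if (getrusage(RUSAGE_SELF, &ru) == 0)
        {
            /*
             * Block input/output operations as accounted by the kernel for
             * this process.  They don't tell us which relation or buffer
             * the I/O was for, and their exact semantics vary by platform.
             */
            printf("inblock: %ld, oublock: %ld\n",
                   ru.ru_inblock, ru.ru_oublock);
        }
        return 0;
    }
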
--
Cédric Villemain 2ndQuadrant
http://2ndQuadrant.fr/ PostgreSQL : Expertise, Formation et Support