On Mon, Apr 20, 2015 at 6:13 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Bruce Momjian wrote:
>> On Mon, Apr 20, 2015 at 04:19:22PM -0300, Alvaro Herrera wrote:
>> > Bruce Momjian wrote:
>> > This seems simple to implement: keep two counters, where the second one
>> > is pages we skipped cleanup in. Once that counter hits SOME_MAX_VALUE,
>> > reset the first counter so that a further 5 pages will get HOT pruned. 5%
>> > seems a bit high though. (In Simon's design, SOME_MAX_VALUE is
>> > essentially +infinity.)
>>
>> This would tend to dirty non-sequential heap pages --- it seems best to
>> just clean as many as we are supposed to, then skip the rest, so we can
>> write sequential dirty pages to storage.
>
> Keep in mind there's a disconnect between dirtying a page and writing it
> to storage. A page could remain dirty for a long time in the buffer
> cache. This writing of sequential pages would occur at checkpoint time
> only, which seems the wrong thing to optimize. If some other process
> needs to evict pages to make room to read some other page in, surely
> it's going to try one page at a time, not write "many sequential dirty
> pages."
Well, for a big sequential scan, we use a ring buffer, so we will
typically be evicting the pages that we ourselves read in moments
before. So in this case we would do a lot of sequential writes of
dirty pages.
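[Editor's note: a minimal sketch of why a ring buffer produces sequential writes. This is not PostgreSQL's buffer manager; RING_SIZE, Ring, and ring_read_page are hypothetical, and the real bulk-read ring is much larger. The point is only that a round-robin ring evicts pages in the same order the scan read them.]

```c
#include <stdbool.h>

#define RING_SIZE 4             /* illustrative; far smaller than reality */

typedef struct Buffer
{
    int  page;                  /* page number held in this slot */
    bool dirty;                 /* needs writing back before reuse */
} Buffer;

typedef struct Ring
{
    Buffer slots[RING_SIZE];
    int    next;                /* next slot to reuse, round-robin */
} Ring;

/* Read `page` into the ring, reusing the next slot round-robin.
 * If the reused slot holds a dirty page, that page must be written
 * out first; return its number, or -1 if no write was needed. */
static int
ring_read_page(Ring *r, int page, bool will_dirty)
{
    Buffer *b = &r->slots[r->next];
    int evicted = b->dirty ? b->page : -1;

    b->page = page;
    b->dirty = will_dirty;
    r->next = (r->next + 1) % RING_SIZE;
    return evicted;
}
```

Because the scan reads page 0, 1, 2, ... in order and each read reuses the slot filled RING_SIZE reads earlier, the dirty pages come back out in the same ascending order they went in, i.e. as sequential writes.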
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company