Re: Clock sweep not caching enough B-Tree leaf pages?
| From | Bruce Momjian |
|---|---|
| Subject | Re: Clock sweep not caching enough B-Tree leaf pages? |
| Date | |
| Msg-id | 20140417144814.GB7443@momjian.us |
| In reply to | Re: Clock sweep not caching enough B-Tree leaf pages? (Robert Haas <robertmhaas@gmail.com>) |
| Responses | Re: Clock sweep not caching enough B-Tree leaf pages? (Robert Haas <robertmhaas@gmail.com>), Re: Clock sweep not caching enough B-Tree leaf pages? (Andres Freund <andres@2ndquadrant.com>) |
| List | pgsql-hackers |
On Thu, Apr 17, 2014 at 10:40:40AM -0400, Robert Haas wrote:
> On Thu, Apr 17, 2014 at 10:32 AM, Bruce Momjian <bruce@momjian.us> wrote:
> > On Thu, Apr 17, 2014 at 10:18:43AM -0400, Robert Haas wrote:
> >> I also believe this to be the case on first principles and my own
> >> experiments.  Suppose you have a workload that fits inside
> >> shared_buffers.  All of the usage counts will converge to 5.  Then,
> >> somebody accesses a table that is not cached, so something's got to be
> >> evicted.  Because all the usage counts are the same, the eviction at
> >> this point is completely indiscriminate.  We're just as likely to kick
> >> out a btree root page or a visibility map page as we are to kick out a
> >> random heap page, even though the former have probably been accessed
> >> several orders of magnitude more often.  That's clearly bad.  On
> >> systems that are not too heavily loaded it doesn't matter too much
> >> because we just fault the page right back in from the OS pagecache.
> >> But I've done pgbench runs where such decisions lead to long stalls,
> >> because the page has to be brought back in from disk, and there's a
> >> long I/O queue; or maybe just because the kernel thinks PostgreSQL is
> >> issuing too many I/O requests and makes some of them wait to cool
> >> things down.
> >
> > I understand now.  If there is no memory pressure, every buffer gets the
> > max usage count, and when a new buffer comes in, it isn't the max so it
> > is swiftly removed until the clock sweep has time to decrement the old
> > buffers.  Decaying buffers when there is no memory pressure creates
> > additional overhead and gets into timing issues of when to decay.
>
> That can happen, but the real problem I was trying to get at is that
> when all the buffers get up to max usage count, they all appear
> equally important.  But in reality they're not.  So when we do start
> evicting those long-resident buffers, it's essentially random which
> one we kick out.

True.
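The behaviour being discussed can be illustrated with a toy model of the clock sweep (a sketch only, not PostgreSQL's actual bufmgr.c; pinning, buffer identity, and locking are all omitted):

```c
#include <stddef.h>

#define NBUFFERS  8
#define MAX_USAGE 5

/* Each access bumps a buffer's usage count, capped at MAX_USAGE.
 * To evict, the clock hand circles the pool, decrementing counts,
 * until it finds one that has reached zero. */
static int usage_count[NBUFFERS];
static int clock_hand;

static void access_buffer(int i)
{
    if (usage_count[i] < MAX_USAGE)
        usage_count[i]++;
}

/* Sweep until some buffer's count hits zero; return that slot.
 * When every count is saturated at MAX_USAGE, this takes five full
 * revolutions, and the victim is simply whichever slot the hand
 * reaches first -- the eviction is indiscriminate. */
static int evict_one(void)
{
    for (;;)
    {
        int victim = clock_hand;

        clock_hand = (clock_hand + 1) % NBUFFERS;
        if (usage_count[victim] == 0)
            return victim;
        usage_count[victim]--;
    }
}
```

With a workload that fits in shared buffers, repeated access saturates every count at 5, and the next eviction falls on an arbitrary slot whether it holds a btree root page or a random heap page.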
Ideally we would have some way to know that _all_ the buffers had reached
the maximum and kick off a sweep to decrement them all.  I am unclear how
we would do that.  One odd idea would be to have a global counter that is
incremented every time a buffer goes from 4 to 5 (max) --- when the counter
equals 50% of all buffers, do a clock sweep.  Of course, then the counter
becomes a bottleneck.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +
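The counter idea above could be sketched like this (hypothetical, not anything in PostgreSQL; in real shared memory the counter itself would be the contention point the mail worries about):

```c
#include <stddef.h>

#define NBUFFERS  8
#define MAX_USAGE 5

static int usage_count[NBUFFERS];
static int saturated;       /* buffers that have reached MAX_USAGE */

/* One decrementing pass over the whole pool, so long-resident "hot"
 * buffers must re-earn their counts and stop all looking identical. */
static void sweep_decrement_all(void)
{
    for (int i = 0; i < NBUFFERS; i++)
        if (usage_count[i] > 0)
            usage_count[i]--;
    saturated = 0;          /* no count is at the max any more */
}

static void access_buffer(int i)
{
    if (usage_count[i] < MAX_USAGE)
    {
        usage_count[i]++;
        if (usage_count[i] == MAX_USAGE)
        {
            saturated++;
            if (saturated * 2 >= NBUFFERS)  /* half the pool saturated */
                sweep_decrement_all();
        }
    }
}
```

The sweep fires as soon as half the buffers saturate, so counts never all pin at the maximum at once; the trade-off is that every bump from 4 to 5 now touches a shared counter.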