Re: Clock sweep not caching enough B-Tree leaf pages?

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: Clock sweep not caching enough B-Tree leaf pages?
Дата
Msg-id CAM3SWZQSvNzwWvXLvdtmh_GOasXcQKRwVURpPBS7eEr0Jztjrw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Clock sweep not caching enough B-Tree leaf pages?  (Greg Stark <stark@mit.edu>)
Ответы Re: Clock sweep not caching enough B-Tree leaf pages?  (Andres Freund <andres@2ndquadrant.com>)
Re: Clock sweep not caching enough B-Tree leaf pages?  (Peter Geoghegan <pg@heroku.com>)
Список pgsql-hackers
On Thu, Apr 17, 2014 at 8:10 AM, Greg Stark <stark@mit.edu> wrote:
> I don't think "common sense" is compelling. I think you need to pin
> down exactly what it is about btree intermediate pages that the LRU
> isn't capturing and not just argue they're more useful. The LRU is
> already capturing which pages are more heavily used than others so you
> need to identify what it is that makes index pages *even more* useful
> than their frequency and recency of access indicates. Not just that
> they're more useful than an average page.

See example 1.1 within the LRU-K paper.

> So what I think is missing is that indexes are always accessed from
> the root down to the leaf. So the most recent page accessed will
> always be the leaf. And in whatever chain of pages was used to reach
> the last leaf page the least recently accessed will always be the
> root. But we'll need the root page again on the subsequent descent
> even if it's to reach the same leaf page we kept in ram in preference
> to it.

I can't imagine that this is much of a problem in practice. Consider
the break-down of pages within indexes when pgbench scale is 5,000, as
in my original benchmark:

[local] pg@pgbench=# with tots as (
SELECT count(*) c, type, relname from       (select relname, relpages, generate_series(1, relpages - 1) i
from pg_class c join pg_namespace n on c.relnamespace = n.oid where
relkind = 'i' and nspname = 'public') r,       lateral (select * from bt_page_stats(relname, i)) u
group by relname, type)
select tots.relname, relpages -1 as non_meta_pages, c, c/sum(c)
over(partition by tots.relname) as prop_of_index, type from tots join
pg_class c on c.relname = tots.relname order by 2 desc, 1, type;
       relname        | non_meta_pages |    c    |
prop_of_index        | type
-----------------------+----------------+---------+----------------------------+------pgbench_accounts_pkey |
1370950|    4828 |
 
0.00352164557423684307 | ipgbench_accounts_pkey |        1370950 | 1366121 |
0.99647762500455888253 | lpgbench_accounts_pkey |        1370950 |       1 |
0.000000729421204274408257 | rpgbench_tellers_pkey  |            274 |     273 |
0.99635036496350364964 | lpgbench_tellers_pkey  |            274 |       1 |
0.00364963503649635036 | rpgbench_branches_pkey |             28 |      27 |
0.96428571428571428571 | lpgbench_branches_pkey |             28 |       1 |
0.03571428571428571429 | r
(7 rows)

Time: 14562.297 ms

Just over 99.6% of pages (leaving aside the meta page) in the big 10
GB pgbench_accounts_pkey index are leaf pages. The inner pages and
root page are at an enormous advantage. In this example, the other
indexes don't even have what would be separately classified as an
inner page (and not a root page) at all, because it's perfectly
sufficient to only have a root page to get to any one of, say, 273
leaf pages (in the case of pgbench_tellers_pkey here).

-- 
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Steve Singer
Дата:
Сообщение: assertion in 9.4 with wal_level=logical
Следующее
От: Andres Freund
Дата:
Сообщение: Re: assertion in 9.4 with wal_level=logical