Обсуждение: Metadata and record block access stats for indexes

Поиск
Список
Период
Сортировка

Metadata and record block access stats for indexes

От
Mircea Cadariu
Дата:
Hi,

For the purpose of writing a blog post I was checking the index stats 
recorded for a workload, but found them rather confusing. Following 
along the code with the debugger it eventually made sense, and I could 
eventually understand what's counted.  Looking around a bit, I 
discovered an older discussion [1] in the mailing lists and learned that 
the issue is known.  The proposal in that thread is to start counting 
separate metadata and record stats depending on what type of index block 
is retrieved.

I realized those would have helped me better understand the collected 
index stats, so I started working on a patch to add these in the system 
views. Attached is a WIP patch file with partial coverage of the B-Tree 
index code. The implementation follows the existing stats collection 
approach and the naming convention proposed in [1].  Let me know if what 
I'm doing is feasible and if there's any concerns I could address. Next 
steps would be to replace all places where I currently pass in NULL with 
proper counting, as well as update tests and docs.

Looking forward to your feedback! Thanks!

Cheers,
Mircea

[1]: 
https://www.postgresql.org/message-id/flat/CAH2-WzmdZqxCS1widYzjDAM%2BZ-Jz%3DejJoaWXDVw9Qy1UsK0tLA%40mail.gmail.com 


Вложения

Re: Metadata and record block access stats for indexes

От
Mircea Cadariu
Дата:
Hi,

Just attaching the complete patch which now covers all index types, docs 
and tests.

You can run the following to see it in action:

create table test (id serial primary key);
insert into test select * from generate_series(0,30000);
select pg_stat_reset();
select * from test where id=3000;
select * from pg_statio_all_indexes where indexrelname = 'test_pkey';

This will show that there were 2 index blocks read from shared buffers 
(hit): 1 metadata and one record.

Cheers,

Mircea

On 28/02/2025 21:58, Mircea Cadariu wrote:
> Hi,
>
> For the purpose of writing a blog post I was checking the index stats 
> recorded for a workload, but found them rather confusing. Following 
> along the code with the debugger it eventually made sense, and I could 
> eventually understand what's counted.  Looking around a bit, I 
> discovered an older discussion [1] in the mailing lists and learned 
> that the issue is known.  The proposal in that thread is to start 
> counting separate metadata and record stats depending on what type of 
> index block is retrieved.
>
> I realized those would have helped me better understand the collected 
> index stats, so I started working on a patch to add these in the 
> system views. Attached is a WIP patch file with partial coverage of 
> the B-Tree index code. The implementation follows the existing stats 
> collection approach and the naming convention proposed in [1].  Let me 
> know if what I'm doing is feasible and if there's any concerns I could 
> address. Next steps would be to replace all places where I currently 
> pass in NULL with proper counting, as well as update tests and docs.
>
> Looking forward to your feedback! Thanks!
>
> Cheers,
> Mircea
>
> [1]: 
> https://www.postgresql.org/message-id/flat/CAH2-WzmdZqxCS1widYzjDAM%2BZ-Jz%3DejJoaWXDVw9Qy1UsK0tLA%40mail.gmail.com 
>
Вложения

Re: Metadata and record block access stats for indexes

От
Frédéric Yhuel
Дата:

On 7/4/25 18:00, Mircea Cadariu wrote:
> Just attaching v2 of the patch.

Hi Mircea,

Your patch applies cleanly and seems to work well.

IIUC, the index hit ratio should be computed with the following formula:

(idx_blks_hit - idx_metadata_blks) / (idx_blks_hit - idx_metadata_blks + 
idx_blks_read)

because most of the index non-leaf pages should be in the cache. Right?

This should probably be documented somewhere?

Here is my testing:

  # select tree_level, internal_pages, leaf_pages from 
pgstatindex('pgbench_accounts_pkey');
  tree_level | internal_pages | leaf_pages
------------+----------------+------------
           2 |             20 |       5465
(1 row)

  # SELECT DISTINCT pg_buffercache_evict(bufferid)
   FROM pg_buffercache
  WHERE relfilenode = pg_relation_filenode('pgbench_accounts_pkey');
  pg_buffercache_evict
----------------------
  (t,f)
(1 row)

  # SELECT pg_stat_reset();
  pg_stat_reset
---------------

(1 row)

  # SELECT max(abalance) FROM pgbench_accounts WHERE aid = 100;
  max
-----
    0
(1 row)

  # select idx_blks_read, idx_blks_hit, idx_metadata_blks from 
pg_statio_all_indexes where indexrelname = 'pgbench_accounts_pkey';
  idx_blks_read | idx_blks_hit | idx_metadata_blks
---------------+--------------+-------------------
              3 |            0 |                 2
(1 row)

--> 3 pages: the root of the tree, one internal page and one leaf

  #
\q
fyhuel@framework:~$ psql bench
psql (19devel)
Type "help" for help.

  # SELECT max(abalance) FROM pgbench_accounts WHERE aid = 100;
  max
-----
    0
(1 row)

primary sleaf bench [42323] # select idx_blks_read, idx_blks_hit, 
idx_metadata_blks from pg_statio_all_indexes where indexrelname = 
'pgbench_accounts_pkey';
  idx_blks_read | idx_blks_hit | idx_metadata_blks
---------------+--------------+-------------------
              4 |            3 |                 5

--> 4 more pages: same as before, already in cache, plus the index meta 
page, read outside shared buffers because we started a new session?




Re: Metadata and record block access stats for indexes

От
Mircea Cadariu
Дата:
Hi Frédéric,

Thanks a lot for trying out my (first) patch! Much appreciated.


On 20/07/2025 21:54, Frédéric Yhuel wrote:
> Your patch applies cleanly and seems to work well.

Cool!


> because most of the index non-leaf pages should be in the cache. Right?

Yes indeed, it's an assumption in the implementation, that the non-leaf 
pages will roughly always be in the cache.


> This should probably be documented somewhere?

I'm still familiarising myself about what to document where, whether 
things should be in the official docs or separate blog posts. In the 
patch I only documented the new column next to the existing ones for now.


> --> 3 pages: the root of the tree, one internal page and one leaf

Yes, this is correct.


> primary sleaf bench [42323] # select idx_blks_read, idx_blks_hit, 
> idx_metadata_blks from pg_statio_all_indexes where indexrelname = 
> 'pgbench_accounts_pkey';
>  idx_blks_read | idx_blks_hit | idx_metadata_blks
> ---------------+--------------+-------------------
>              4 |            3 |                 5
>
> --> 4 more pages: same as before, already in cache, plus the index 
> meta page, read outside shared buffers because we started a new session?

Yes, that's my understanding too.


Thanks!


Kind regards,

Mircea Cadariu





Re: Metadata and record block access stats for indexes

От
Mircea Cadariu
Дата:
Rebased and dusted off this patch.
Вложения

Re: Metadata and record block access stats for indexes

От
Aleksander Alekseev
Дата:
Hi Mircea,

> Rebased and dusted off this patch.

Thanks for the patch. Here are my two cents.

IMO it would be helpful if you could come up with a few more practical
use cases. This change is going to affect pretty much everyone. If
only a few users will benefit from it once in several years, the value
of the patch is arguably low. As an example, can you think of how the
new counters can be used for debugging, checking index integrity,
writing new access methods or perhaps writing property-based tests?
Just several examples that came to my mind first.

Also I'm a bit concerned about the performance impact. It's probably
next to nothing, but if you could measure it on a relatively large
amount of data that would be great. Note that it's not uncommon to
have dozens of different indexes for a single table.

-- 
Best regards,
Aleksander Alekseev