Re: index prefetching

From Konstantin Knizhnik
Subject Re: index prefetching
Date
Msg-id d9f520be-ca8e-47fc-a56c-55c572f6c9a9@garret.ru
In response to Re: index prefetching  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers


On 17/12/2025 8:49 PM, Peter Geoghegan wrote:
On Wed, Dec 17, 2025 at 12:19 PM Konstantin Knizhnik <knizhnik@garret.ru> wrote:
create table t (pk integer primary key, payload text default repeat('x',
1000)) with (fillfactor=10);
insert into t values (generate_series(1,10000000))

So it creates a table of size 80GB (160 after vacuum), which doesn't fit
in RAM.
160 after VACUUM? What do you mean?

Sorry, it was my mistake. It has no relation to vacuum.
As you can see, with the specified fillfactor and filler field there is exactly one record per page, so the table size should be ~80GB.
But when I ran `select pg_relation_size('t')` I saw 160GB. That was because my first attempt to populate this relation was canceled.
For some reason I thought the file would just be truncated in this case. It is not, and the canceled load actually doubled the size of the relation.
But it should not affect index scan speed.
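
A quick way to cross-check these numbers, as a sketch only: the first query reports the actual on-disk size of the heap, and the second estimates the size expected if each of the 10,000,000 rows occupies its own page. The column aliases `actual_size` and `expected_size` are illustrative, and the estimate ignores any extra pages left behind by a canceled load.

```sql
-- Actual on-disk size of the table's main fork.
select pg_size_pretty(pg_relation_size('t')) as actual_size;

-- Expected size with exactly one row per page.
-- block_size is read from the server (8192 bytes on a default build).
select pg_size_pretty(10000000::bigint * current_setting('block_size')::bigint)
       as expected_size;
```

If `actual_size` is roughly twice `expected_size`, that would be consistent with the pages from the canceled first load still being part of the relation.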


but what confuses me is that they do not depend on
`effective_io_concurrency`.
You did change other settings, right? You didn't just use the default
shared_buffers, for example? (Sorry, I have to ask.)

No, I have not changed the default value of shared_buffers (128MB).
It should be enough to provide free buffers for stream I/O to use prefetching.
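
For reference, a minimal sketch of how the settings discussed here can be inspected and varied per session. The value 32 and the range query are hypothetical placeholders, not the queries actually benchmarked, and `enable_indexscan_prefetch` exists only with the patch set applied.

```sql
-- Inspect the settings in effect for this session.
show shared_buffers;
show effective_io_concurrency;

-- Session-level overrides (illustrative values).
set effective_io_concurrency = 32;
set enable_indexscan_prefetch = off;  -- GUC added by the patch set

-- Hypothetical range query; BUFFERS shows shared buffer hits vs. reads.
explain (analyze, buffers)
select * from t where pk between 1 and 100000;
```

Re-running the same `explain` with different settings makes it easier to see whether the timing really is independent of them.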

Moreover, with `enable_indexscan_prefetch=off` the results are the same.
It's quite unlikely that the current heuristics that trigger
prefetching would have ever allowed any prefetching, for queries such
as these.

The exact rule right now is that we don't even begin prefetching until
we've already read at least one index leaf page, and have to read
another one. So it's impossible to use prefetching with a LIMIT of 1,
with queries such as these. It's highly unlikely that you'd see any
benefits from prefetching even with LIMIT 100 (usually we wouldn't
even begin prefetching).
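
One rough way to judge whether a given LIMIT can cross a leaf page boundary is to look at how many index tuples fit on a page. The sketch below assumes the primary key index has the default name `t_pkey`; the alias `approx_tuples_per_page` is illustrative, and the figure is only approximate because `relpages` also counts the metapage and internal pages.

```sql
-- relpages and reltuples are estimates maintained by VACUUM, ANALYZE and
-- CREATE INDEX; reltuples can be -1 if the index has never been analyzed.
select relname,
       relpages,
       reltuples,
       reltuples / nullif(relpages, 0) as approx_tuples_per_page
from pg_class
where relname = 't_pkey';
```

If the result is in the hundreds, a scan with LIMIT 100 would usually finish within a single leaf page, which under the rule described above means prefetching would not start.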

I have checked in the debugger that prefetching is actually performed:
xs_heapfetch is initialized and its prefetch distance is increased (to 32).

I could definitely believe that the new amgetbatch interface is
noticeably faster with range queries. Maybe 5% - 10% faster (even
without using the heap-buffer-locking optimization we've talked about
on this thread, which you can't have used here because I haven't
posted it to the list just yet). But a near 2x improvement wildly
exceeds my expectations. Honestly, I have no idea why the patch is so
much faster, and suspect an invalid result.

It might make sense for you to try it again with just the first patch
applied (the patch that adds the basic table AM and index AM interface
revisions, and makes nbtree supply its own amgetbatch/replaces
btgettuple with btgetbatch). I suppose it's possible that Andres'
patch 0004 somehow played some role here, since that is independently
useful work (I don't quite recall the details of where else that might
be useful right now). But that's just a wild guess.

I will try to find out the reason, thank you for the suggestion.
