Re: index prefetching

Поиск

Список

Период

Сортировка

От	Andres Freund
Тема	Re: index prefetching
Дата	14 августа 23:44:14
Msg-id	kvyser45imw3xmisfvpeoshisswazlzw35el3fq5zg73zblpql@f56enfj45nf7 обсуждение исходный текст
Ответ на	Re: index prefetching (Peter Geoghegan <pg@bowt.ie>)
Ответы	Re: index prefetching
Список	pgsql-hackers

Дерево обсуждения

Hi,

On 2025-08-14 15:45:26 -0400, Peter Geoghegan wrote:
> On Thu, Aug 14, 2025 at 3:15 PM Peter Geoghegan <pg@bowt.ie> wrote:
> > Then why does the exact same pair of runs show "I/O Timings: shared
> > read=194.629" for the sequential table backwards scan (with total
> > execution time 1132.360 ms), versus "I/O Timings: shared read=352.88"
> > (with total execution time 697.681 ms) for the random table backwards
> > scan?
>
> If you're interested in trying this out for yourself, I've pushed my
> working branch here:
>
> https://github.com/petergeoghegan/postgres/tree/index-prefetch-batch-v1.2
>
> Note that the test case you'll run is added by the most recent commit:
>
> https://github.com/petergeoghegan/postgres/commit/c9ceb765f3b138f53b7f1fdf494ba7c816082aa1
>
> Run microbenchmarks/random_backwards_weird.sql to do an initial load
> of both of the tables. Then run
> microbenchmarks/queries_random_backwards_weird.sql to actually run the
> relevant queries. There are 4 such queries, but only the 2 backwards
> scan queries really seem relevant.

Interesting. In the sequential case I see some waits that are not attributed
in explain, due to the waits happening within WaitIO(), not WaitReadBuffers().
Which indicates that the read stream is trying to re-read a buffer that
previously started being read.

   read_stream_start_pending_read()
-> StartReadBuffers()
-> AsyncReadBuffers()
-> ReadBuffersCanStartIO()
-> StartBufferIO()
-> WaitIO()

There are far fewer cases of this in the random case.

From what I can tell the sequential case so often will re-read a buffer that
it is already in the process of reading - and thus wait for that IO before
continuing - that we don't actually keep enough IO in flight.

In your email with iostat output you can see that the slow case has
aqu-sz=5.18, while the fast case has aqu-sz=10.06, i.e. the fast case has
twice as much IO in flight. While both have IOs take the same amount of time
(r_await=0.20). Which certainly explains the performance difference...

We can optimize that by deferring the StartBufferIO() if we're encountering a
buffer that is undergoing IO, at the cost of some complexity.  I'm not sure
real-world queries will often encounter the pattern of the same block being
read in by a read stream multiple times in close proximity sufficiently often
to make that worth it.

Greetings,

Andres Freund

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: index prefetching