Re: index prefetching
От | Andres Freund |
---|---|
Тема | Re: index prefetching |
Дата | |
Msg-id | kvyser45imw3xmisfvpeoshisswazlzw35el3fq5zg73zblpql@f56enfj45nf7 обсуждение исходный текст |
Ответ на | Re: index prefetching (Peter Geoghegan <pg@bowt.ie>) |
Ответы |
Re: index prefetching
|
Список | pgsql-hackers |
Hi, On 2025-08-14 15:45:26 -0400, Peter Geoghegan wrote: > On Thu, Aug 14, 2025 at 3:15 PM Peter Geoghegan <pg@bowt.ie> wrote: > > Then why does the exact same pair of runs show "I/O Timings: shared > > read=194.629" for the sequential table backwards scan (with total > > execution time 1132.360 ms), versus "I/O Timings: shared read=352.88" > > (with total execution time 697.681 ms) for the random table backwards > > scan? > > If you're interested in trying this out for yourself, I've pushed my > working branch here: > > https://github.com/petergeoghegan/postgres/tree/index-prefetch-batch-v1.2 > > Note that the test case you'll run is added by the most recent commit: > > https://github.com/petergeoghegan/postgres/commit/c9ceb765f3b138f53b7f1fdf494ba7c816082aa1 > > Run microbenchmarks/random_backwards_weird.sql to do an initial load > of both of the tables. Then run > microbenchmarks/queries_random_backwards_weird.sql to actually run the > relevant queries. There are 4 such queries, but only the 2 backwards > scan queries really seem relevant. Interesting. In the sequential case I see some waits that are not attributed in explain, due to the waits happening within WaitIO(), not WaitReadBuffers(). Which indicates that the read stream is trying to re-read a buffer that previously started being read. read_stream_start_pending_read() -> StartReadBuffers() -> AsyncReadBuffers() -> ReadBuffersCanStartIO() -> StartBufferIO() -> WaitIO() There are far fewer cases of this in the random case. From what I can tell the sequential case so often will re-read a buffer that it is already in the process of reading - and thus wait for that IO before continuing - that we don't actually keep enough IO in flight. In your email with iostat output you can see that the slow case has aqu-sz=5.18, while the fast case has aqu-sz=10.06, i.e. the fast case has twice as much IO in flight. While both have IOs take the same amount of time (r_await=0.20). Which certainly explains the performance difference... We can optimize that by deferring the StartBufferIO() if we're encountering a buffer that is undergoing IO, at the cost of some complexity. I'm not sure real-world queries will often encounter the pattern of the same block being read in by a read stream multiple times in close proximity sufficiently often to make that worth it. Greetings, Andres Freund
В списке pgsql-hackers по дате отправления: