Re: index prefetching
From | Peter Geoghegan |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | CAH2-WznFdjY_OB2S7_BY4iAyeffK+XrE2qsX6aghgP63VocRfQ@mail.gmail.com |
In response to | Re: index prefetching (Andres Freund <andres@anarazel.de>) |
Responses | Re: index prefetching |
List | pgsql-hackers |

On Wed, Sep 3, 2025 at 2:47 PM Andres Freund <andres@anarazel.de> wrote:
> I still don't think I fully understand why the impact of this is so large. The
> branch misses appear to be the only thing differentiating the two cases, but
> with resowners neutralized, the remaining difference in branch misses seems
> too large - it's not like the sequence of block numbers is more predictable
> without prefetching...
>
> The main increase in branch misses is in index_scan_stream_read_next...

I've been working on fixing the same regressed query, but using a completely different (though likely complementary) approach: by adding a test to index_scan_stream_read_next that detects when prefetching isn't favorable. If it isn't favorable, then we stop prefetching entirely (we fall back on regular sync I/O).

Although this experimental approach is still very rough, it seems promising. It ~100% fixes the problem at hand, without really creating any new problems (at least as far as our testing has been able to determine, so far).

The key idea is to wait until a few batches have already been read, and then test whether the index-tuple-wise "distance" between readPos (the read position) and streamPos (the stream position used by index_scan_stream_read_next) remained excessively low within index_scan_stream_read_next. If, after processing 20 batches/leaf pages, readPos and streamPos still read from the same batch *and* have a low index-tuple-wise position within that batch (they're within 10 or 20 items of each other), we expect "thrashing", which makes prefetching unfavorable -- and so we just stop using our read stream.

It's worth noting that (given the current structure of the patch) it is inherently impossible to do something like this from within the read stream. We're suppressing duplicate heap block requests iff the blocks are contiguous within the index. So read stream just doesn't see anything like what I'm calling the "index-tuple-wise distance" between readPos and streamPos.
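
To make the shape of the test concrete, here is a minimal standalone sketch of the check described above. It is not code from the patch: the ScanPos struct, the prefetch_looks_unfavorable function, and both constants are hypothetical names chosen for illustration, and the item threshold of 20 is just one end of the "10 or 20 items" range mentioned above.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a scan position: which batch (leaf page) the
 * position is in, and the index-tuple offset within that batch.  The real
 * patch's readPos/streamPos representation may differ.
 */
typedef struct ScanPos
{
    int batch;  /* batch (leaf page) number */
    int item;   /* index-tuple offset within the batch */
} ScanPos;

/* Assumed thresholds, taken loosely from the description in the text */
#define MIN_BATCHES_BEFORE_TEST 20  /* don't judge until this many batches read */
#define MAX_THRASH_DISTANCE     20  /* "within 10 or 20 items of each other" */

/*
 * Returns true when prefetching looks unfavorable: enough batches have been
 * processed, yet the read position and the stream position are still in the
 * same batch and only a few index tuples apart.
 */
static bool
prefetch_looks_unfavorable(int batches_read, ScanPos readPos, ScanPos streamPos)
{
    /* Too early to tell; keep prefetching for now */
    if (batches_read < MIN_BATCHES_BEFORE_TEST)
        return false;

    /* The stream has moved on to a later batch, so it is getting ahead */
    if (readPos.batch != streamPos.batch)
        return false;

    /* Same batch: unfavorable only if the positions are close together */
    return abs(streamPos.item - readPos.item) <= MAX_THRASH_DISTANCE;
}
```

A caller in this sketch would evaluate the function from the stream callback and, once it returns true, stop using the read stream and fall back on regular synchronous reads for the rest of the scan.
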

Note that the baseline behavior for the test case (the behavior with master, or with prefetching disabled) appears to be very I/O bound, due to readahead. I've confirmed this using iostat. So "synchronous" I/O isn't very synchronous here.

(Prefetching actually does make sense when this query is run with direct I/O, but that's far slower with or without the use of explicit prefetching, so that likely doesn't tell us much.)

--
Peter Geoghegan