Re: index prefetching
From: Tomas Vondra
Subject: Re: index prefetching
Date:
Msg-id: 602e0029-1619-4288-acec-8a38e6fb9e6a@vondra.me
In reply to: Re: index prefetching (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: index prefetching
List: pgsql-hackers
On 8/14/25 23:55, Peter Geoghegan wrote:
> On Thu, Aug 14, 2025 at 5:06 PM Peter Geoghegan <pg@bowt.ie> wrote:
>> If this same mechanism remembered (say) the last 2 heap blocks it
>> requested, that might be enough to totally fix this particular
>> problem. This isn't a serious proposal, but it'll be simple enough to
>> implement. Hopefully when I do that (which I plan to soon) it'll fully
>> validate your theory.
>
> I spoke too soon. It isn't going to be so easy, since
> heapam_index_fetch_tuple wants to consume buffers as a simple stream.
> There's no way that index_scan_stream_read_next can just suppress
> duplicate block number requests (in a way that's more sophisticated
> than the current trivial approach that stores the very last block
> number in IndexScanBatchState.lastBlock) without it breaking the whole
> concept of a stream of buffers.

I believe this idea (checking not just the very last block, but keeping a somewhat longer history) was briefly discussed a couple of months ago, after you pointed out the need for the "last block" optimization (which the patch didn't have). At that point we were focused on addressing a regression with correlated indexes, so the single block was enough. But as you point out, it's harder than it seems. If I recall correctly, the challenge is that heapam_index_fetch_tuple() is expected to release the buffer when the block changes, but then how would it know there's no future read of the same buffer in the stream?

>>> We can optimize that by deferring the StartBufferIO() if we're
>>> encountering a buffer that is undergoing IO, at the cost of some
>>> complexity. I'm not sure real-world queries will often encounter the
>>> pattern of the same block being read in by a read stream multiple
>>> times in close proximity sufficiently often to make that worth it.
>>
>> We definitely need to be prepared for duplicate prefetch requests in
>> the context of index scans.
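[Editor's note: the "last 2 heap blocks" idea above can be illustrated with a toy ring of recently requested block numbers. This is a standalone sketch, not PostgreSQL code; RecentBlocks and recent_blocks_seen are made-up names, and, as the message explains, a callback-side filter like this by itself would break the one-buffer-per-request contract of the read stream.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

#define RECENT_NBLOCKS 2        /* "the last 2 heap blocks" from the idea above */

/* Tiny ring buffer of the most recently requested heap block numbers. */
typedef struct RecentBlocks
{
    BlockNumber blocks[RECENT_NBLOCKS];
    int         next;           /* ring slot for the next insertion */
    int         count;          /* how many slots currently hold valid entries */
} RecentBlocks;

/*
 * Return true if blkno matches one of the recently requested blocks;
 * otherwise remember it (evicting the oldest entry) and return false.
 * A next_block-style callback could consult this before emitting a
 * block number, generalizing the single lastBlock check.
 */
static bool
recent_blocks_seen(RecentBlocks *rb, BlockNumber blkno)
{
    for (int i = 0; i < rb->count; i++)
        if (rb->blocks[i] == blkno)
            return true;

    rb->blocks[rb->next] = blkno;
    rb->next = (rb->next + 1) % RECENT_NBLOCKS;
    if (rb->count < RECENT_NBLOCKS)
        rb->count++;
    return false;
}
```

A block number is only reported as a duplicate while it is still within the last two distinct requests; after two newer blocks it is evicted and would be requested again.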
> Can you (or anybody else) think of a quick and dirty way of working
> around the problem on the read stream side? I would like to prioritize
> getting the patch into a state where its overall performance profile
> "feels right". From there we can iterate on fixing the underlying
> issues in more principled ways.
>
> FWIW it wouldn't be that hard to require the callback (in our case
> index_scan_stream_read_next) to explicitly point out that it knows
> that the block number it's requesting has to be a duplicate. It might
> make sense to at least place that much of the burden on the
> callback/client side.

I don't recall all the details, but IIRC my impression was that it'd be best to do this "caching" entirely in read_stream.c (so the next_block callbacks would probably not need to worry about lastBlock at all), enabled when creating the stream. And then there would be something like read_stream_release_buffer() that would do the right thing to release the buffer when it's no longer needed.

regards

--
Tomas Vondra
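[Editor's note: the stream-side caching with an explicit release function, as proposed above, might be sketched as a small cache keyed by block number with per-entry pin counts. This is a hypothetical standalone illustration: StreamBufferCache, stream_get_block, and stream_release_block are invented names, not the read_stream.c API, and the "read" here is just a counter standing in for real I/O.]

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

#define STREAM_CACHE_SIZE 4

/* A few recently read blocks, kept pinned until explicitly released. */
typedef struct StreamBufferCache
{
    BlockNumber blocks[STREAM_CACHE_SIZE];
    int         refcounts[STREAM_CACHE_SIZE];   /* 0 means the slot is free */
    int         reads_issued;                   /* stand-in for physical reads */
} StreamBufferCache;

/*
 * Request a block: if it's still pinned in the cache, re-pin it instead of
 * issuing another read; otherwise claim a free slot and "read" it.
 * Returns the slot index (standing in for a buffer handle).
 */
static int
stream_get_block(StreamBufferCache *c, BlockNumber blkno)
{
    int free_slot = -1;

    for (int i = 0; i < STREAM_CACHE_SIZE; i++)
    {
        if (c->refcounts[i] > 0 && c->blocks[i] == blkno)
        {
            c->refcounts[i]++;          /* duplicate request: just re-pin */
            return i;
        }
        if (c->refcounts[i] == 0)
            free_slot = i;
    }

    assert(free_slot >= 0);             /* sketch: assume a slot is available */
    c->blocks[free_slot] = blkno;
    c->refcounts[free_slot] = 1;
    c->reads_issued++;                  /* would be an actual read here */
    return free_slot;
}

/* Counterpart of the proposed read_stream_release_buffer(). */
static void
stream_release_block(StreamBufferCache *c, int slot)
{
    assert(c->refcounts[slot] > 0);
    c->refcounts[slot]--;               /* slot becomes reusable at zero */
}
```

The point of the explicit release call is visible in the pin counts: duplicate requests for a still-pinned block cost nothing, and the entry only becomes evictable once every consumer has released it, which answers the "how would it know there's no future read" question by making the consumer say so.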