Re: index prefetching
From: Tomas Vondra
Subject: Re: index prefetching
Date:
Msg-id: 602e0029-1619-4288-acec-8a38e6fb9e6a@vondra.me
In reply to: Re: index prefetching (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: index prefetching
List: pgsql-hackers
On 8/14/25 23:55, Peter Geoghegan wrote:
> On Thu, Aug 14, 2025 at 5:06 PM Peter Geoghegan <pg@bowt.ie> wrote:
>> If this same mechanism remembered (say) the last 2 heap blocks it
>> requested, that might be enough to totally fix this particular
>> problem. This isn't a serious proposal, but it'll be simple enough to
>> implement. Hopefully when I do that (which I plan to soon) it'll fully
>> validate your theory.
>
> I spoke too soon. It isn't going to be so easy, since
> heapam_index_fetch_tuple wants to consume buffers as a simple stream.
> There's no way that index_scan_stream_read_next can just suppress
> duplicate block number requests (in a way that's more sophisticated
> than the current trivial approach that stores the very last block
> number in IndexScanBatchState.lastBlock) without it breaking the whole
> concept of a stream of buffers.

I believe this idea (checking not just the very last block, but keeping a somewhat longer history) was briefly discussed a couple of months ago, after you pointed out the need for the "last block" optimization (which the patch didn't have). At that point we were focused on addressing a regression with correlated indexes, so the single block was enough. But as you point out, it's harder than it seems. If I recall correctly, the challenge is that heapam_index_fetch_tuple() is expected to release the buffer when the block changes, but then how would it know there's no future read of the same buffer in the stream?

>>> We can optimize that by deferring the StartBufferIO() if we're
>>> encountering a buffer that is undergoing IO, at the cost of some
>>> complexity. I'm not sure real-world queries will often encounter the
>>> pattern of the same block being read in by a read stream multiple
>>> times in close proximity sufficiently often to make that worth it.
>>
>> We definitely need to be prepared for duplicate prefetch requests in
>> the context of index scans.
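[Editor's note: the "last 2 heap blocks" idea above can be illustrated with a toy ring of recently requested block numbers. This is a standalone sketch, not PostgreSQL code; RecentBlocks and recent_blocks_seen are made-up names, and, as the message explains, a callback-side filter like this by itself would break the one-buffer-per-request contract of the read stream.]

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

#define RECENT_NBLOCKS 2        /* "the last 2 heap blocks" from the idea above */

/* Tiny ring buffer of the most recently requested heap block numbers. */
typedef struct RecentBlocks
{
    BlockNumber blocks[RECENT_NBLOCKS];
    int         next;           /* ring slot for the next insertion */
    int         count;          /* how many slots currently hold valid entries */
} RecentBlocks;

/*
 * Return true if blkno matches one of the recently requested blocks;
 * otherwise remember it (evicting the oldest entry) and return false.
 * A next_block-style callback could consult this before emitting a
 * block number, generalizing the single lastBlock check.
 */
static bool
recent_blocks_seen(RecentBlocks *rb, BlockNumber blkno)
{
    for (int i = 0; i < rb->count; i++)
        if (rb->blocks[i] == blkno)
            return true;

    rb->blocks[rb->next] = blkno;
    rb->next = (rb->next + 1) % RECENT_NBLOCKS;
    if (rb->count < RECENT_NBLOCKS)
        rb->count++;
    return false;
}
```

A block number is only reported as a duplicate while it is still within the last two distinct requests; after two newer blocks it is evicted and would be requested again.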
> Can you (or anybody else) think of a quick and dirty way of working
> around the problem on the read stream side? I would like to prioritize
> getting the patch into a state where its overall performance profile
> "feels right". From there we can iterate on fixing the underlying
> issues in more principled ways.
>
> FWIW it wouldn't be that hard to require the callback (in our case
> index_scan_stream_read_next) to explicitly point out that it knows
> that the block number it's requesting has to be a duplicate. It might
> make sense to at least place that much of the burden on the
> callback/client side.

I don't recall all the details, but IIRC my impression was that it'd be best to do this "caching" entirely in read_stream.c (so the next_block callbacks would probably not need to worry about lastBlock at all), enabled when creating the stream. And then there would be something like read_stream_release_buffer() that would do the right thing to release the buffer when it's no longer needed.

regards

--
Tomas Vondra
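[Editor's note: the stream-side caching with an explicit release function, as proposed above, might be sketched as a small cache keyed by block number with per-entry pin counts. This is a hypothetical standalone illustration: StreamBufferCache, stream_get_block, and stream_release_block are invented names, not the read_stream.c API, and the "read" here is just a counter standing in for real I/O.]

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

#define STREAM_CACHE_SIZE 4

/* A few recently read blocks, kept pinned until explicitly released. */
typedef struct StreamBufferCache
{
    BlockNumber blocks[STREAM_CACHE_SIZE];
    int         refcounts[STREAM_CACHE_SIZE];   /* 0 means the slot is free */
    int         reads_issued;                   /* stand-in for physical reads */
} StreamBufferCache;

/*
 * Request a block: if it's still pinned in the cache, re-pin it instead of
 * issuing another read; otherwise claim a free slot and "read" it.
 * Returns the slot index (standing in for a buffer handle).
 */
static int
stream_get_block(StreamBufferCache *c, BlockNumber blkno)
{
    int free_slot = -1;

    for (int i = 0; i < STREAM_CACHE_SIZE; i++)
    {
        if (c->refcounts[i] > 0 && c->blocks[i] == blkno)
        {
            c->refcounts[i]++;          /* duplicate request: just re-pin */
            return i;
        }
        if (c->refcounts[i] == 0)
            free_slot = i;
    }

    assert(free_slot >= 0);             /* sketch: assume a slot is available */
    c->blocks[free_slot] = blkno;
    c->refcounts[free_slot] = 1;
    c->reads_issued++;                  /* would be an actual read here */
    return free_slot;
}

/* Counterpart of the proposed read_stream_release_buffer(). */
static void
stream_release_block(StreamBufferCache *c, int slot)
{
    assert(c->refcounts[slot] > 0);
    c->refcounts[slot]--;               /* slot becomes reusable at zero */
}
```

The point of the explicit release call is visible in the pin counts: duplicate requests for a still-pinned block cost nothing, and the entry only becomes evictable once every consumer has released it, which answers the "how would it know there's no future read" question by making the consumer say so.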