Re: index prefetching
| From | Peter Geoghegan |
|---|---|
| Subject | Re: index prefetching |
| Date | |
| Msg-id | CAH2-WzkMn=P5HoA6mLq2hdvj5cTa0DYKiyn6HSLcqXDDdieaHA@mail.gmail.com |
| In reply to | Re: index prefetching (Amit Langote <amitlangote09@gmail.com>) |
| List | pgsql-hackers |
Hi Amit,

On Thu, Dec 4, 2025 at 12:54 AM Amit Langote <amitlangote09@gmail.com> wrote:
> I want to acknowledge that figuring out the right layering to make I/O
> prefetching and perhaps other optimizations internal to IndexNext()
> work is obviously the priority right now, regardless of the output
> format used to populate the slots ultimately returned by
> table_index_getnext_slot().

Right; table_index_getnext_slot simply returns a tuple into the caller's
slot. That's almost the same as the existing getnext_slot interface used
by those same call sites on the master branch, except that in the patch
we're directly calling a table AM callback/heapam-specific implementation
(not code in indexam.c). The new heapam implementation,
heapam_index_getnext_slot, applies more high-level context about ordered
index scans, which enables it to reorder work quite freely, even when it
is work that takes place in index AMs.

> However, regarding your question about
> "painting ourselves into a corner":
>
> In my executor batching work (which has focused on Seq Scans), the
> HeapBatch is essentially just a pinned buffer plus an array of
> pre-allocated tuple headers. I hadn't strictly considered creating a
> HeapBatch to return from Index Scans, largely because
> heap_hot_search_buffer() is designed for scalar (or non-batched)
> access that requires repeated buffer locking.
>
> But it seems like the eventual goal of batching calls to
> heap_hot_search_buffer() effectively clears that hurdle.

Actually, that's not the eventual goal anymore; now we're treating it as
our *immediate* goal, at least in terms of things that will have
user-visible impact (as opposed to API changes needed to facilitate
batching-type optimizations in the future, including I/O prefetching).
It's not completely clear if prefetching is off the table for Postgres
19, but it certainly seems optimistic at this point.
But the heap_hot_search_buffer thing definitely is in scope for Postgres
19 (if we're going to make all these API changes then it seems best to
give users an immediate benefit).

> As long as
> the internal logic separates the "grouping/locking" from the
> "materializing into a slot," it seems this design does not prevent us
> from eventually wiring up a table_index_getnext_batch() to populate
> the HeapBatch structure I am proposing for the regular non-index scan
> path (table_scan_getnextbatch() in my patch).

That's good.

Suppose we do a much more advanced version of the kind of work
reordering that the heap_hot_search_buffer thing will do for Postgres
19. I described this to Tomas in my last email to this thread, when I
said:

"""
We could even do something much more sophisticated than what I actually
have planned for 19: we could reorder table fetches, such that we only
had to lock and pin each heap page exactly once *even when the index
scan returns TIDs slightly out of order*. For example, if an index
page/batch returns TIDs "(1,1), (2,1), (1,2), (1,3), (2,2)", we could
get all tuples for heap blocks 1 and 2 by locking and pinning each of
those 2 pages exactly once. The only downside (other than the
complexity) is that we'd sometimes hold multiple heap page pins at a
time, not just one.
"""

(To be clear, this more advanced version is definitely out of scope for
Postgres 19.)

We'd be holding on to multiple buffer pins at a time (across calls to
heapam_index_getnext_slot) were we to do this more advanced
optimization. I *think* that still means that the design/internal logic
will (as you put it) "separate the 'grouping/locking' from the
'materializing into a slot'". That's just the only way that could
possibly work correctly, at least with heapam.

It makes sense for us both to (at a minimum) have at least some general
awareness of each other's goals. I really only want to avoid completely
gratuitous incompatibilities/conflicts.
For example, if you invent a new slot-like mechanism in the executor
that can return multiple tuples in one go, then it seems like we should
probably try to use that in our own work on batching. If we're already
assembling the information in a way that almost works with that new
interface, why wouldn't we make sure that it actually worked with and
used that new interface directly?

It doesn't sound like there'd be many disagreements on how that would
have to work, since the requirements are largely dictated by existing
constraints that we're both already naturally subject to. For example:

* We need to hold on to a buffer pin on a heap page if one of its heap
  tuples is contained in a slot/something slot-like, for as long as
  there's any chance that somebody will examine that heap tuple (until
  the slot releases the tuple).

* Buffer locks must only be acquired by lower-level access method code,
  for very short periods, and never in a way that requires coordination
  across module boundaries.

It sounds like the potential for conflicts between each other's work
will be absolutely minimal. It seems as if we don't even have to agree
on anything new or novel.

> Sorry to hijack the thread, but just wanted to confirm I haven't
> misunderstood the architectural implications for future batching.

I don't think that you've hijacked anything. Your input is more than
welcome.

--
Peter Geoghegan