Re: index prefetching
From | Peter Geoghegan |
---|---|
Subject | Re: index prefetching |
Date | |
Msg-id | CAH2-WzkC9yr-28y_O4jYb1SMGsihcVBf11cs9b8eo8UgqTLFsw@mail.gmail.com |
In response to | Re: index prefetching ("Peter Geoghegan" <pg@bowt.ie>) |
List | pgsql-hackers |
On Tue, Aug 12, 2025 at 5:22 PM Peter Geoghegan <pg@bowt.ie> wrote:
> There does seem to be something fishy going on with the patch here. I can see
> strange inconsistencies in EXPLAIN ANALYZE output when the server is started
> with --debug_io_direct=data with the master, compared to what I see with the
> patch.

Attached is my working version of the patch, in case that helps anyone with reproducing the problem.

Note that the nbtree changes are now included in this one patch/commit. Definitely might make sense to revert to one patch per index AM again later, but for now it's convenient to have one commit that both adds the concept of amgetbatch, and removes nbtree's btgettuple (since it bleeds into things like how indexam.c wants to do mark and restore).

There are only fairly minor changes here. Most notably:

* Generalizes nbtree's _bt_drop_lock_and_maybe_pin, making it an index-AM-generic thing I call index_batch_unlock. Previous versions of this complex patch avoided the issue by always holding on to a leaf page buffer pin, even when it wasn't truly necessary (i.e. with plain index scans that use an MVCC snapshot).

  It shouldn't be too hard to teach GiST to use index_batch_unlock to continue dropping buffer pins on leaf pages, as before (with gistgettuple). The hard part will be ordered GiST scans, and perhaps every kind of GiST index-only scan (since in general index-only scans cannot drop pins eagerly within index_batch_unlock, due to race conditions with VACUUM concurrently setting VM bits all-visible).

* Replaces BufferMatches() with something a bit less invasive, which works based on block numbers (not buffers).

* Various refinements to the way that nbtree deals with setting things up using an existing batch. In particular, the interface of _bt_readnextpage has been revised. It now makes much more sense in a world where nbtree doesn't "own" existing batches -- we no longer directly pass an existing batch to _bt_readnextpage, and it no longer thinks it can clobber what is actually an old batch.

--
Peter Geoghegan
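
For readers following along, the pin-dropping rule described in the first bullet can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the patch: `SketchScan`, `SketchBatch`, `sketch_batch_unlock`, and `sketch_batch_matches_block` are hypothetical stand-ins, and the real index_batch_unlock works with PostgreSQL's buffer manager rather than plain booleans. The sketch assumes the pin may be dropped eagerly only for a plain (not index-only) scan that uses an MVCC snapshot, and it compares batches by block number in the spirit of the BufferMatches() replacement.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; NOT PostgreSQL's real structs or the patch's code. */
typedef struct SketchScan
{
    bool mvcc_snapshot;   /* scan uses an MVCC snapshot */
    bool index_only;      /* scan is an index-only scan */
} SketchScan;

typedef struct SketchBatch
{
    unsigned int blkno;   /* leaf page block number the batch was read from */
    bool locked;          /* leaf page lock currently held */
    bool pinned;          /* leaf page buffer pin currently held */
} SketchBatch;

/*
 * Always release the lock.  Release the pin too, but only when that is
 * assumed safe: a plain index scan with an MVCC snapshot.  Index-only
 * scans keep the pin, since VACUUM could concurrently set VM bits
 * all-visible.
 */
static void
sketch_batch_unlock(const SketchScan *scan, SketchBatch *batch)
{
    assert(batch->locked);

    batch->locked = false;
    if (scan->mvcc_snapshot && !scan->index_only)
        batch->pinned = false;
}

/* Identify a batch by block number rather than by buffer. */
static bool
sketch_batch_matches_block(const SketchBatch *batch, unsigned int blkno)
{
    return batch->blkno == blkno;
}
```

With these assumptions, a plain MVCC scan ends up holding neither lock nor pin after the call, while an index-only scan still holds its pin.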