Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?
| От | Matthias van de Meent |
|---|---|
| Тема | Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM? |
| Дата | |
| Msg-id | CAEze2Wj_9kvgdYUJazbxyTuVLF_2ibZun9H1xULV5bUBXJUNdg@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM? (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
| Ответы |
Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?
|
| Список | pgsql-hackers |
On Thu, 24 Apr 2025 at 22:46, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > On Fri, 21 Mar 2025 at 17:14, Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: > > Attached is v10, which polishes the previous patches, and adds a patch > > for nbtree to use the new visibility checking strategy so that it too > > can release its index pages much earlier, and adds a similar > > visibility check test to nbtree. > > And here's v12. v11 (skipped) would've been a rebase, but after > finishing the rebase I noticed a severe regression in btree's IOS with > the new code, so v12 here applies some optimizations which reduce the > overhead of the new code. Here's v13, which moves the changes around a bit: v12's 0001 is split into 3 patches (1, 3, and 4), whilst v12's 2-5 were correspondingly renumbered 5-8. Patch 0002 is an otherwise unrelated change in pg_visibility that updates it to use the new vectorized API, reducing overhead. So, summary of the patches: 0001: Replaces visibilitymap_get_status with a vectorized variant that touches each VM page at most once per call; reducing buffer churn and enabling later patches 0002: update pg_visibility to use this newly vectorized API (instead of the current model that checks each page at a time) 0003: Adds the table_index_vischeck_tuples API, requiring a table AM to expose VM checks through an API. 0004: Adjusts Index-Only scan infrastructure to make use of 0004, instead of using VM_ALL_VISIBLE(). It also adds the relevant infrastructure for enabling index AMs to provide the pre-checked visibility status (from table_index_vischeck_tuples) to an index-only scan. 0005: Implement VM-checks in GIST's IOS code 0006: Same, but for SP-GIST 0007: Same, but for NBTREE 0008: Add tests which validate that we still get correct results from our queries, even when we use cursors to block results from getting returned, and cleaning up tuples from those index pages. A big benefit with this patch is that indexes now have no direct reason anymore to hold back a VACUUM scan -- the visibility of tuples can be checked at page scan time, and any shared resources can be released before returning tuples to higher nodes. ------------------------ Summary of the problem that I'm solving here: An index that holds dead tuples could return those dead tuples in an Index-Only Scan (IOS) to the scan node, as the index AM itself doesn't have any information about the visibility of the tuples that it contains. The IOS infrastructure prevents these tuples from being exposed by doing visibility checks against the Visibility Map (VM) and -if necessary- the underlying heap. This, however, depends on an invariant: VACUUM MUST NOT remove a TID that's being returned by an index scan, at least not before before that tuple has been checked for visibility in the VM; otherwise VACUUM may get to clean up the dead tuple's page and mark it all-visible before the visibility check occurs, incorrectly returning an 'all-visible' result for that dead tuple. Btree indexes interlock with vacuum using a buffer cleanup lock and a pin on pages it's yet to return results for; and that solution works quite fine [^1]. This same solution sadly doesn't work for GiST and SP-GiST, as in a worst case scenario they may have to sort the whole index in memory before they can return the first index tuple, and holding pins on those pages would be extremely punishing and might even cause system crashes due to a lack of available un-pinned shared buffers. To solve this, we implement a mechanism to allow indexes to do rough visibility checks on TIDs; the results of which can then be passed through the IndexScanDesc to indicate what the VM state was when the tuple was still in the index. This enables them to make the VM check happen just after they've scanned a page, but before they release their pin on the page, adding the interlock with VACUUM without requiring unreasonably large amounts of page pins. This new mechanism is safe in MVCC snapshots, where we know that tuples which are all-visible can't be removed while the scan is ongoing, and where possibly-dead possibly-replaced TIDs are known to be visibility-checked using transaction IDs, and where any new TID would have to be inserted in a different transaction and therefore is definitely invisible to our current transaction. ------------------------ Open items: review that this doesn't have any further issues, and commits once this has been considered good enough. Note: This patch changes TableAMRoutine and renames/changes exposed functions, and as a result can't be backpatched as-is. I have a separate thread over at [0] where I'm keeping track of a patchset that is derived from this one and is focused on backpatching. That patchset will contain patches 0004/0005/0006 and a reduced version of 0001+0003 to make it work in older branches without breaking external ABI compatibility. I intend for the exposed table_index_vischeck_tuples() API to remain consistent across the two patchsets. Kind regards, Matthias van de Meent Databricks (https://www.databricks.com) [0]: https://www.postgresql.org/message-id/flat/CAEze2WgH13m=MDST58KLo-NkZpbwBEt4xNWcgtghWBwRj3J0+A@mail.gmail.com [^1]: Mostly fine, because it still holds VACUUM back when an index-only scan holds a page pin and VACUUM needs to process that page. If the index scan doesn't progress, then VACUUM can't progress either, and that can cause vacuum to get stuck. That issue is solved (for normal index scans) with patch 0007.
Вложения
- v13-0001-Add-vectorized-API-for-visibility-map-lookup.patch
- v13-0008-Test-for-IOS-Vacuum-race-conditions-in-index-AMs.patch
- v13-0005-GIST-Fix-visibility-issues-in-IOS.patch
- v13-0007-nbtree-Reduce-Index-Only-Scan-pin-duration.patch
- v13-0006-SP-GIST-Fix-visibility-issues-in-IOS.patch
- v13-0004-IOS-Support-tableAM-powered-prechecked-visibilit.patch
- v13-0003-TableAM-Support-AM-specific-fast-visibility-test.patch
- v13-0002-pg_visibility-vectorize-collect_visibility_data.patch
В списке pgsql-hackers по дате отправления: