[WiP] B-tree page merge during vacuum to reduce index bloat
From: Andrey Borodin
Subject: [WiP] B-tree page merge during vacuum to reduce index bloat
Msg-id: 11A59C0C-A8C8-4642-8493-292D5DF8311D@yandex-team.ru
List: pgsql-hackers
Hi hackers,

Together with Kirk and Nik we spent several online hacking sessions tackling index bloat problems [0,1,2]. As a result we concluded that a B-tree index page merge should help keep an index from pressuring shared_buffers.

We are proposing a feature to automatically merge nearly-empty B-tree leaf pages during VACUUM operations to reduce index bloat. This addresses a common issue where deleted tuples leave behind sparsely populated pages that traditional page deletion cannot handle because they're not completely empty.

***
Implementation Overview:

The patch adds a new `mergefactor` reloption (default 5%, range 0-50%) that defines when a page becomes a candidate for merging. During vacuum, when a page exceeds this threshold (e.g., 95% empty with default settings), we attempt to move the remaining tuples to the right sibling page and then delete the source page using the existing page deletion mechanisms.

Key changes:
- New `mergefactor` reloption for configurable merge thresholds
- Detection logic in `btvacuumpage()` to identify merge candidates
- Tuple movement implementation in `_bt_unlink_halfdead_page()`
- WAL logging enhancements to handle cross-page dependencies during replay

The last part needs further improvements (it's simply REGBUF_FORCE_IMAGE for now), but I want to start a discussion and ask about known problems with the approach.

***
Correctness:

The implementation reuses the existing locking, critical sections and WAL logging infrastructure. To handle cross-page dependencies during WAL replay, when tuples are merged the right sibling buffer is registered with `REGBUF_FORCE_IMAGE`; this is a temporary workaround.

***
Current Status & Questions:

The patch successfully reduces index bloat and handles basic scenarios, but we've identified some areas that need community input:

1. Backward Scan Correctness: The primary concern is ensuring backward scans work correctly when pages are being merged concurrently. Since we maintain the same locking protocol as existing page deletion, I believe this should be safe, but would appreciate expert review of the approach.

2. Performance Impact: The additional checks during vacuum have minimal overhead, but broader testing would be valuable. The worst case would be an index whose leaf pages follow a fill pattern like (5%, 96%, 5%, 96%, 5%, 96%, ...): we will attempt to merge such pages on every vacuum, spending time on acquiring locks.

3. WAL Consistency: There are still some edge cases with WAL consistency checking that need refinement. I think I can handle it; I just need to spend enough time debugging real redo instead of imaging the right page.

***
Usage:

CREATE INDEX ON table (col) WITH (mergefactor=10); -- 10% threshold

I don't know if it would be a good idea to enable mergefactor for existing indexes.

The feature is particularly beneficial for workloads with high update/delete rates that create sparse index pages without triggering complete page deletion.

I'm attaching the patch for review and would welcome feedback on the approach, especially regarding backward scan safety and any other correctness concerns we may have missed.

Thank you!

Best regards, Andrey, Nik, Kirk.

[0] https://www.youtube.com/watch?v=3MleDtXZUlM
[1] https://www.youtube.com/watch?v=Ib3SXSFt8mE
[2] https://www.youtube.com/watch?v=D1PEdDcvZTw
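
For readers who want a feel for the detection step before opening the attached patch: below is a minimal sketch (not the patch's actual code) of how a mergefactor-style check during btvacuumpage() could classify a leaf page as a merge candidate. The helper name `_bt_leaf_is_merge_candidate` is invented for illustration, and the reloption value is passed in as a plain int; the real patch may structure this differently.

    /*
     * Hypothetical helper (illustration only, not from the patch): return
     * true if a leaf page is sparse enough to be a merge candidate, given
     * the mergefactor reloption value as a percentage (default 5).
     */
    static bool
    _bt_leaf_is_merge_candidate(Page page, int mergefactor)
    {
        BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);
        Size        usable;
        Size        used;

        /* Only live, non-rightmost leaf pages have a right sibling to merge into */
        if (!P_ISLEAF(opaque) || P_ISDELETED(opaque) || P_ISHALFDEAD(opaque) ||
            P_RIGHTMOST(opaque))
            return false;

        /* Space available to tuples on an empty nbtree page */
        usable = BLCKSZ - SizeOfPageHeaderData -
                 MAXALIGN(sizeof(BTPageOpaqueData));
        used = usable - PageGetFreeSpace(page);

        /*
         * With the default mergefactor of 5, the page qualifies once its
         * remaining tuples occupy 5% or less of the usable space, i.e. the
         * page is at least 95% empty.
         */
        return used * 100 <= usable * (Size) mergefactor;
    }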