Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array
| От | Xuneng Zhou |
|---|---|
| Тема | Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array |
| Дата | |
| Msg-id | CABPTF7WyXGvJrzJoZ+g5UCLTj3-HCMTtU4iHmU=FryTTOTf1dw@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array (Xuneng Zhou <xunengzhou@gmail.com>) |
| Ответы |
Re: Optimize SnapBuildPurgeOlderTxn: use in-place compaction instead of temporary array
|
| Список | pgsql-hackers |
On Mon, Nov 10, 2025 at 11:22 AM Xuneng Zhou <xunengzhou@gmail.com> wrote: > > Hi, > > With a sorted commited.xip array, we could replace the iteration with > two binary searches to find the interval to keep. > > Proposed Optimization > --------------------- > > Use binary search to locate the boundaries of XIDs to remove, then > compact with a single memmove. The key insight requires understanding > how XID precedence relates to numeric ordering. > > XID Precedence Definition > ~~~~~~~~~~~~~~~~~~~~~~~~~~ > > PostgreSQL defines XID precedence as: > > /* compare two XIDs already known to be normal; this is a macro for speed */ > #define NormalTransactionIdPrecedes(id1, id2) \ > (AssertMacro(TransactionIdIsNormal(id1) && TransactionIdIsNormal(id2)), \ > (int32) ((id1) - (id2)) < 0) > > This means: id1 precedes id2 if (int32)(id1 - id2) < 0. > > Equivalently, this identifies all XIDs in the modular interval > [id2 - 2^31, id2) on the 32-bit ring as "preceding id2". So XIDs > preceding xmin are exactly those in [xmin - 2^31, xmin). > > From Modular Interval to Array Positions > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > The arrays are sorted in numeric uint32 order (xip[i] <= xip[i+1] in > unsigned sense), which is a total order—not wraparound-aware. Therefore, > the modular interval we want to remove may appear as one or two numeric > blocks in the sorted array. > > Let boundary = xmin - 2^31 (mod 2^32). The modular interval [boundary, xmin) > contains all XIDs to remove (half-open: xmin itself is kept, matching > NormalTransactionIdPrecedes). Where does it appear in the numerically sorted > array? > > Case A: (uint32)boundary <= (uint32)xmin (numeric no wrap) > Example: xmin = 3,000,000,000 > boundary = 3,000,000,000 - 2,147,483,648 = 852,516,352 > > Here, (uint32)boundary < (uint32)xmin, so the interval does not cross > 0 numerically. In the sorted array, XIDs to remove form one contiguous > block: [idx_boundary, idx_xmin). > > Array layout: > [... keep ...][=== remove ===][... keep ...] > 0 ............ idx_boundary ... idx_xmin ...... n > > Action: Keep prefix [0, idx_boundary) and suffix [idx_xmin, n). > > Case B: (uint32)boundary > (uint32)xmin (numeric wrap) > Example: xmin = 100 > boundary = 100 - 2^31 (mod 2^32) = 2,147,483,748 > > Since (uint32)boundary > (uint32)xmin, the interval wraps through 0 > numerically. In the sorted array, XIDs to remove form two blocks: > [0, idx_xmin) and [idx_boundary, n). > > Array layout: > [= remove =][... keep ...][= remove =] > 0 ......... idx_xmin .... idx_boundary ......... n > > Action: Keep only the middle [idx_xmin, idx_boundary). > > Note: Case B often occurs when xmin is "small" (e.g., right after > startup), making xmin - 2^31 wrap numerically. This is purely about > positions in the numeric order; it does not imply the cluster has > "wrapped" XIDs operationally. > > In both cases, we locate idx_boundary and idx_xmin using binary search > in O(log n) time, then use one memmove to compact > > The algorithm: > 1. Compute boundary = xmin - 2^31 > 2. Binary search for idx_boundary (first index with xip[i] >= boundary) > 3. Binary search for idx_xmin (first index with xip[i] >= xmin) > 4. Use memmove to compact based on case A or B above > > Benefits > -------- > > 1. Performance: O(log n) binary search vs O(n) linear scan > 2. Memory: No workspace allocation needed > 3. Simplicity: One memmove instead of allocate + two copies + free > > The same logic is applied to both committed.xip and catchange.xip arrays. > > Faster binary search > -------- > > While faster binary search variants exist, the current code already > introduces more complexity than the original. It’s uncertain that > further optimization would deliver a meaningful performance gain. > Adapt the patch with two-phase optimization: - Pre-CONSISTENT: Use in-place compaction O(n) since committed.xip is unsorted during this phase. - Post-CONSISTENT: Use binary search O(log n) since committed.xip is maintained in sorted order after reaching consistency. -- Best, Xuneng
Вложения
В списке pgsql-hackers по дате отправления: