Re: Batch TIDs lookup in ambulkdelete
From | Junwang Zhao
---|---
Subject | Re: Batch TIDs lookup in ambulkdelete
Date |
Msg-id | CAEG8a3Ld5bTU+oCkra5--2-WmFQbM4jmtmNe9FBwvDv2-+ZvxQ@mail.gmail.com
In reply to | Re: Batch TIDs lookup in ambulkdelete (Junwang Zhao <zhjwpku@gmail.com>)
List | pgsql-hackers
On Sat, Jun 7, 2025 at 8:46 PM Junwang Zhao <zhjwpku@gmail.com> wrote:
>
> Hi Masahiko,
>
> On Sat, Jun 7, 2025 at 5:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, May 13, 2025 at 2:26 PM Matheus Alcantara
> > <matheusssilv97@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > On 30/04/25 19:36, Masahiko Sawada wrote:
> > > > Here is a summary of each attached patch:
> > > >
> > > > 0001: Introduce TidStoreIsMemberMulti(), which performs the IsMember
> > > > check for multiple TIDs in one function call. If the given TIDs are
> > > > sorted (at least by block number), we can save radix tree lookups for
> > > > the same page entry.
> > > >
> > > > 0002: Convert IndexBulkDeleteCallback() to a batched operation.
> > > >
> > > > 0003: Use batch TID lookups in btree index bulk-deletion.
> > > >
> > > > In patch 0003, we implement batch TID lookups both for each
> > > > deduplicated index tuple and for all remaining regular index tuples,
> > > > which seems to be the most straightforward approach. While further
> > > > optimizations are possible, such as performing batch TID lookups for
> > > > all index tuples on a single page, these could introduce additional
> > > > overhead from sorting and re-sorting TIDs. Moreover, considering that
> > > > radix tree lookups are relatively inexpensive, the benefits of sorting
> > > > TIDs and using TidStoreIsMemberMulti() might be minimal. Nevertheless,
> > > > these potential optimizations warrant further evaluation to determine
> > > > their actual impact on performance.
> > > >
> > > > Also, the patch includes batch TID lookup support only for btree
> > > > indexes, but we can potentially improve other index AMs too.
> > > >
> > >
> > > The code looks good, and +1 for the idea as well. I just have some
> > > small points:
> > > - Maybe it would be good to mention somewhere that the
> > >   IndexBulkDeleteCallback() callback returns the number of TIDs found
> > >   to be members of the TidStore?
> > > - The vac_tid_reaped() docs may need to be updated?
> >
> > Thank you for looking at the patches. I agree with the above comments.
> >
> > >
> > > I also executed the meson tests for each patch individually, and the
> > > 0002 patch is not passing on my machine (macOS).
> > >
> > > Ok: 39
> > > Expected Fail: 0
> > > Fail: 271
> > > Unexpected Pass: 0
> > > Skipped: 22
> > > Timeout: 0
> > >
> > > One behaviour I found while executing the 0002 tests is that they may
> > > be leaking shared memory segments. I noticed this because after
> > > executing the tests I tried to re-run them based on master, and all
> > > tests failed with the "Failed system call was shmget(key=97530599,
> > > size=56, 03600)" error. I also checked the shared memory segments
> > > using "ipcs -m", and it returned some segments that are not returned
> > > when I execute the tests on master (after cleaning up the leaked
> > > memory segments); this also doesn't occur when executing based on
> > > 0001 or 0003.
> > >
> > > ~/d/p/batch-tids-lookup-ambulkdelete ❯❯❯ ipcs -m
> > > IPC status from <running system> as of Tue May 13 18:19:14 -03 2025
> > > T        ID         KEY         MODE        OWNER    GROUP
> > > Shared Memory:
> > > m  18087936  0x05f873bf  --rw-------  matheus  staff
> > > m  15925250  0x05f966fe  --rw-------  matheus  staff
> > > m  24248325  0x05f9677e  --rw-------  matheus  staff
> > > ....
> > >
> > > Note that the 0003 patch doesn't have this issue, so in the end we
> > > won't have a problem with this, I think. But it may be good to mention
> > > that although the patches are separate, there is a dependency between
> > > them, which may cause issues on the buildfarm?
> >
> > Thank you for the report. With the 0001 and 0002 patches, I got a
> > SEGV. I've fixed this issue in the attached updated version of the
> > patches. I've confirmed that the patches pass the CI tests, but I'm
> > not sure whether that fixes the shared memory segment leak problem you
> > reported. The attached patches incorporate the comments[1] from John
> > as well.
> >
> > BTW, I found that the constant 'maxblkno' in test_tidstore.sql
> > actually equals InvalidBlockNumber, not MaxBlockNumber. I think it
> > doesn't make sense for TidStore to use InvalidBlockNumber as a key.
> > The attached 0001 patch fixes it. I think we can fix it separately on
> > HEAD as well as in the back branches.
> >
> > Regards,
> >
> > [1] https://www.postgresql.org/message-id/CANWCAZbiJcwSBCczLfbfiPe1mET+V9PjTZv5VvUBwarLvx1Hfg@mail.gmail.com
> >
> > --
> > Masahiko Sawada
> > Amazon Web Services: https://aws.amazon.com
>
> +        /*
> +         * We will sort the deletable array if there are existing
> +         * offsets as it should be sorted in ascending order (see
> +         * _bt_delitems_vacuum()).
> +         */
> +        need_sort = (ndeletable > 0);
> +
> +        ndels = callback(workbuf_htids, workbuf_nitem, workbuf_deletable,
> +                         callback_state);
> +        if (ndels > 0)
> +        {
> +            for (int i = 0; i < workbuf_nitem; i++)
> +            {
> +                if (workbuf_deletable[i])
> +                    deletable[ndeletable++] = workbuf_offs[i];
> +            }
> +
> +            if (need_sort)
> +                qsort(deletable, ndeletable, sizeof(OffsetNumber), cmpOffsetNumbers);
> +
> +            nhtidsdead += ndels;
> +        }
>
> I think need_sort should be calculated after the callback? Maybe just:
>
> if (ndeletable > 1)
>     qsort(deletable, ndeletable, sizeof(OffsetNumber), cmpOffsetNumbers);
>
> I think there is no need to sort when ndeletable is 1, so here I compare
> to 1.

After further study of the patch, I realize I misunderstood the logic here:
qsort is only needed when there are other items already in deletable before
the callback, so the code LGTM ;)

>
> --
> Regards
> Junwang Zhao

--
Regards
Junwang Zhao
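
To make the batching idea from 0001 concrete, here is a minimal standalone
C sketch, assuming a toy page table in place of the real radix tree: when
the incoming TIDs are sorted by block number, consecutive TIDs on the same
block reuse a single page lookup instead of probing the store once per TID.
The names (TidPage, lookup_page, tids_is_member_multi) and the per-page
bitmap representation are illustrative stand-ins, not the actual TidStore
API.

/*
 * Toy model of a batched IsMember check: one page lookup is shared by
 * all consecutive TIDs that fall on the same block.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t BlockNumber;
typedef uint16_t OffsetNumber;

typedef struct ItemPointer
{
    BlockNumber  block;
    OffsetNumber offset;
} ItemPointer;

/* Hypothetical per-page entry: a bitmap of dead offsets on that block. */
typedef struct TidPage
{
    BlockNumber block;
    uint64_t    dead_offsets;   /* bit i set => offset i is dead */
} TidPage;

/* Stand-in for the radix tree lookup: a linear scan of a small table. */
static const TidPage *
lookup_page(const TidPage *pages, int npages, BlockNumber blk)
{
    for (int i = 0; i < npages; i++)
    {
        if (pages[i].block == blk)
            return &pages[i];
    }
    return NULL;
}

/*
 * Batched membership check: fill is_member[] for every TID and return
 * the number of members found.  The page lookup is redone only when the
 * block number changes, which is the saving for block-sorted input.
 */
static int
tids_is_member_multi(const TidPage *pages, int npages,
                     const ItemPointer *tids, int ntids, bool *is_member)
{
    const TidPage *page = NULL;
    BlockNumber    cached_blk = 0;
    bool           have_cache = false;
    int            nfound = 0;

    for (int i = 0; i < ntids; i++)
    {
        if (!have_cache || tids[i].block != cached_blk)
        {
            cached_blk = tids[i].block;
            page = lookup_page(pages, npages, cached_blk);
            have_cache = true;
        }
        is_member[i] = page != NULL &&
            (page->dead_offsets & (UINT64_C(1) << tids[i].offset)) != 0;
        if (is_member[i])
            nfound++;
    }
    return nfound;
}

int
main(void)
{
    /* Dead TIDs in the store: (10,1), (10,2) and (42,4). */
    TidPage     pages[] = {{10, 0x06}, {42, 0x10}};
    ItemPointer tids[] = {{10, 1}, {10, 2}, {10, 3}, {42, 4}};
    bool        member[4];
    int         n = tids_is_member_multi(pages, 2, tids, 4, member);

    printf("%d members\n", n);          /* prints "3 members" */
    for (int i = 0; i < 4; i++)
        printf("(%u,%u) -> %d\n", (unsigned) tids[i].block,
               (unsigned) tids[i].offset, (int) member[i]);
    return 0;
}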
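
For the need_sort question resolved above, the following self-contained
sketch replays the append-then-conditional-sort pattern from the quoted
0003 hunk: offsets collected before the callback are already sorted, the
callback reports its dead items in ascending offset order, so a re-sort is
needed only when the array was non-empty beforehand. The hard-coded
work-buffer results and this cmpOffsetNumbers are simplified stand-ins for
the nbtree code, not the patch itself.

/* Toy version of the append-then-sort logic around the batched callback. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef uint16_t OffsetNumber;

static int
cmpOffsetNumbers(const void *a, const void *b)
{
    OffsetNumber oa = *(const OffsetNumber *) a;
    OffsetNumber ob = *(const OffsetNumber *) b;

    return (oa > ob) - (oa < ob);
}

int
main(void)
{
    /* Offsets already collected earlier on this page, in sorted order. */
    OffsetNumber deletable[8] = {2, 7};
    int          ndeletable = 2;

    /*
     * need_sort is computed before the callback: the callback appends its
     * matches in ascending offset order, so the combined array only needs
     * a re-sort if something was already in it.
     */
    bool         need_sort = (ndeletable > 0);

    /* Pretend the batch callback flagged these work-buffer items dead. */
    OffsetNumber workbuf_offs[] = {1, 4, 9};
    bool         workbuf_deletable[] = {true, false, true};
    int          workbuf_nitem = 3;

    for (int i = 0; i < workbuf_nitem; i++)
    {
        if (workbuf_deletable[i])
            deletable[ndeletable++] = workbuf_offs[i];
    }

    if (need_sort)
        qsort(deletable, ndeletable, sizeof(OffsetNumber), cmpOffsetNumbers);

    for (int i = 0; i < ndeletable; i++)
        printf("%u ", (unsigned) deletable[i]);     /* prints "1 2 7 9" */
    printf("\n");
    return 0;
}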