Re: Moving _bt_readpage and _bt_checkkeys into a new .c file
| From | Peter Geoghegan |
|---|---|
| Subject | Re: Moving _bt_readpage and _bt_checkkeys into a new .c file |
| Date | |
| Msg-id | CAH2-WzkdHtEq59vJWzdi7DaZkihRay=dJZ3BgBZ9WM9pxxwBVg@mail.gmail.com |
| In response to | Re: Moving _bt_readpage and _bt_checkkeys into a new .c file (Peter Geoghegan <pg@bowt.ie>) |
| Responses | Re: Moving _bt_readpage and _bt_checkkeys into a new .c file |
| List | pgsql-hackers |
On Sat, Dec 6, 2025 at 3:04 PM Peter Geoghegan <pg@bowt.ie> wrote:
> My best guess is that the benefits I see come from eliminating a
> dependent load. Without the second patch applied, I see this
> disassembly for _bt_checkkeys:
>
>   mov rax,QWORD PTR [rdi+0x38] ; Load scan->opaque
>   mov r15d,DWORD PTR [rax+0x70] ; Load so->dir
>
> A version with the second patch applied still loads a pointer passed
> by the _bt_checkkeys caller (_bt_readpage), but doesn't have to chase
> another pointer to get to it. Maybe this significantly ameliorates
> execution port pressure in the cases where I see a speedup?

I found a way to further speed up the queries that the second patch already helped with, following profiling with perf: if _bt_readpage takes a local copy of scan->ignore_killed_tuples when first called, and then uses that local copy within its per-tuple loop (instead of using scan->ignore_killed_tuples directly), it gives me an additional 1% speedup over what I reported earlier today.

In other words, the range/BETWEEN pgbench variant I summarized earlier today goes from being about 4.5% faster than master to being about 5.5% faster than master. Testing has also shown that the ignore_killed_tuples enhancement doesn't significantly change the picture with other types of queries (such as the default pgbench SELECT).

In short, this ignore_killed_tuples change makes the second patch from v1 more effective, seemingly by further ameliorating the same bottleneck. Apparently accessing scan->ignore_killed_tuples created another load-use hazard in the same tight inner loop (the per-tuple _bt_readpage loop). This matters with these queries, where we don't need to do very much work per-tuple (_bt_readpage's pstate.startikey optimization is as effective as possible here) and have quite a few tuples (2,000 tuples) that need to be returned by each test query run.
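To make the shape of the ignore_killed_tuples change concrete, here is a minimal sketch in C. It is not the actual patch: `DemoScan`, `DemoItem`, and both functions are hypothetical stand-ins, and only the field name ignore_killed_tuples comes from the discussion above. The first function reads the flag through the scan pointer on every iteration; the second takes a local copy once, before the per-tuple loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scan descriptor (not the real PostgreSQL struct) */
typedef struct DemoScan
{
    bool ignore_killed_tuples;
} DemoScan;

/* Hypothetical stand-in for an item on an index page */
typedef struct DemoItem
{
    bool dead;   /* item is marked dead/killed */
    int  value;
} DemoItem;

/* Before: the flag is read through the scan pointer inside the loop */
size_t
count_visible_reload(const DemoScan *scan, const DemoItem *items, size_t nitems)
{
    size_t nvisible = 0;

    assert(scan != NULL && (items != NULL || nitems == 0));

    for (size_t i = 0; i < nitems; i++)
    {
        if (scan->ignore_killed_tuples && items[i].dead)
            continue;           /* skip dead items when asked to */
        nvisible++;
    }
    return nvisible;
}

/* After: the flag is copied into a local once, then the loop uses the local */
size_t
count_visible_local(const DemoScan *scan, const DemoItem *items, size_t nitems)
{
    const bool ignore_killed = scan->ignore_killed_tuples;
    size_t nvisible = 0;

    assert(scan != NULL && (items != NULL || nitems == 0));

    for (size_t i = 0; i < nitems; i++)
    {
        if (ignore_killed && items[i].dead)
            continue;           /* same check, no load through scan */
        nvisible++;
    }
    return nvisible;
}
```

In a loop this small a compiler may well hoist the load by itself, so the two functions could compile to the same code. The assumption behind the sketch is that a real per-tuple loop calls other functions that might, as far as the compiler can prove, modify the struct, which forces a reload on each iteration unless a local copy is taken.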
Since this ignore_killed_tuples change is also very simple, and also seems like an easy win, I think that it can be committed as part of the second patch, without needing to wait for too much more performance validation.

Attached are 2 text files showing pgbench output/summary info, generated by my test script (both are from runs that took place within the last 2 hours). One of these result sets just confirms what I reported earlier on, with an unmodified v1 patchset. The other file shows detailed results for the v1 patchset with the ignore_killed_tuples change also applied, for the same pgbench config/workload. This second file gives full details to back up my "~5.5% faster than master" claim.

The pgbench script used for this is as follows:

    \set aid random_exponential(1, 100000 * :scale, 3.0)
    \set endrange :aid + 2000
    SELECT abalance FROM pgbench_accounts WHERE aid between :aid AND :endrange;

I'm deliberately not attaching a new v2 for this ignore_killed_tuples change right now. The first patch is a few hundred KBs, and I don't want this email to get held up in moderation. I have, however, attached the ignore_killed_tuples change as its own patch (with a .txt extension, just to avoid confusing CFTester).

--
Peter Geoghegan