Re: Moving _bt_readpage and _bt_checkkeys into a new .c file
| От | Peter Geoghegan |
|---|---|
| Тема | Re: Moving _bt_readpage and _bt_checkkeys into a new .c file |
| Дата | |
| Msg-id | CAH2-Wz=nntKkr2wAK92zyiaSMzfQa81TqA0HtezwxXoRKYB6+A@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: Moving _bt_readpage and _bt_checkkeys into a new .c file (Victor Yegorov <vyegorov@gmail.com>) |
| Ответы |
Re: Moving _bt_readpage and _bt_checkkeys into a new .c file
|
| Список | pgsql-hackers |
On Sat, Dec 6, 2025 at 3:07 AM Victor Yegorov <vyegorov@gmail.com> wrote: > I like this change and I agree that it's both handy and gives an easy performance boost. There's a number of things that I find counterintuitive about the performance impact of this patch: * It doesn't make very much difference (under a 1% improvement) on similar index-only scans, with queries such as "SELECT count(*) FROM pgbench_accounts WHERE aid between :aid AND :endrange". One might expect this case to be improved at least as much as the plain index scan case, since the relative cost of _bt_readpage is higher (since it's much cheaper to access the visibility map than to access the heap). * The patch makes an even larger difference when I make pgbench use a larger range of values in the "between". For example, if I increase the number of values/tuples returned from 2,000 (which is what I used initially/reported on in the first email) to 15,000, I find that the patch increases TPS by as much as 5.5%. * These queries make maximum use of the _bt_set_startikey optimization -- most individual leaf pages don't need to evaluate any scan key (after an initial page-level check within _bt_set_startikey). So the patch really helps in exactly those cases where we don't truly need to access the scan direction at all -- the loop inside _bt_check_compare always has 0 iterations with these queries, which means that scan direction doesn't actually ever need to be considered at that point. My best guess is that the benefits I see come from eliminating a dependent load. Without the second patch applied, I see this disassembly for _bt_checkkeys: mov rax,QWORD PTR [rdi+0x38] ; Load scan->opaque mov r15d,DWORD PTR [rax+0x70] ; Load so->dir A version with the second patch applied still loads a pointer passed by the _bt_checkkeys caller (_bt_readpage), but doesn't have to chase another pointer to get to it. Maybe this significantly ameliorates execution port pressure in the cases where I see a speedup? > Patch applies and compiles cleanly. I can barely see a performance boost on my end (VM on a busy host), round 1%, but Istill consider this change beneficial. It seems to have no downsides, and some upside. I wouldn't be surprised if the results I'm seeing are dependent on microarchitectural details. I myself use a Zen 3 chip (a Ryzen 9 5950X). -- Peter Geoghegan
В списке pgsql-hackers по дате отправления: