Re: More speedups for tuple deformation
| From | David Rowley |
|---|---|
| Subject | Re: More speedups for tuple deformation |
| Date | |
| Msg-id | CAApHDvoh3Q413szd-zsUTCpQPWNdpUYvx-fvsB8DP8zOja+ckg@mail.gmail.com |
| In response to | Re: More speedups for tuple deformation (David Rowley <dgrowleyml@gmail.com>) |
| Responses | Re: More speedups for tuple deformation |
| List | pgsql-hackers |
On Fri, 2 Jan 2026 at 18:58, David Rowley <dgrowleyml@gmail.com> wrote:
> Please find attached an updated set of patches. A rebase was needed,
> plus 0003 had a problem with an Assert not handling the bitmap being a
> NULL pointer.

Another rebase and updates to some newly created missing calls to TupleDescFinalize().

I've also attached another round of benchmarks after dipping into some Azure machines to cover my lack of any Intel benchmark results. I think these are somewhat noisy as I opted for low core-count instances which will have L3 shared with workloads running for other people. This is most evident in Xeon_E5-2673 with gcc where the patched run was nearly twice as fast as unpatched for test 2 on 20 extra columns. If you look at the raw results from that, you can see the times are quite unstable between the 3 runs of each test, which makes me believe that the machine was busy with other work when that test ran on master. The AMD3990x and M2 machines are all sitting next to me and were otherwise idle, so they should be much more stable.

Quite a few machines have a small regression for the 0 extra column tests. There is a small amount of extra work being done in the deforming function to check if the attnum < the first attribute without an attcacheoff. This mostly only affects the tests that don't do any deforming with a cached attcacheoff, e.g. due to NULLs or varlena types. The only way I've thought about to possibly reduce that is to invent a new TupleTableSlotOps and pick the one that applies when creating the TupleTableSlot. This doesn't appeal to me very much as it requires modifying many callsites. But I do wonder if we should try to come up with something here as technically we could use this to eliminate alignment padding out of some MinimalTuples in some cases where these were not directly derived from pre-formed HeapTuples. That could allow a more compact tuple representation for sorting and hashing, allowing us to do more with less memory in some cases.

The benchmark results also indicated that there wasn't much advantage to the 0002+0003 patches, so I've removed those from the set. That reduces some complexity around the benchmarks. I did still keep the OPTIMIZE_BYVAL loop as separate results. It's not quite clear what's best there as machines seem to vary on which they prefer.

Benchmark results attached in the bz2 file both in spreadsheet form and the raw results pg_dumped.

David
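
For readers following along, the extra "attnum < first attribute without an attcacheoff" comparison mentioned above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the patch's code: `SketchAttr`, `SketchDesc`, `sketch_finalize()` and `sketch_deform()` are hypothetical names, NULLs are ignored, and a variable-length value is modelled as a single length byte followed by its payload rather than PostgreSQL's real varlena format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified attribute descriptor; not PostgreSQL's struct. */
typedef struct SketchAttr
{
    int16_t attlen;      /* fixed width in bytes, or -1 for variable length */
    int16_t attalign;    /* required alignment in bytes (power of two) */
    int32_t attcacheoff; /* cached byte offset, or -1 when not known */
} SketchAttr;

typedef struct SketchDesc
{
    int        natts;
    int        first_uncached; /* first attribute without an attcacheoff */
    SketchAttr attrs[8];
} SketchDesc;

/* Round off up to the next multiple of align (align is a power of two). */
static size_t
sketch_align(size_t off, size_t align)
{
    return (off + align - 1) & ~(align - 1);
}

/* Size of one stored value; assumed format: 1 length byte + payload. */
static size_t
sketch_attr_size(const SketchAttr *att, const uint8_t *p)
{
    if (att->attlen >= 0)
        return (size_t) att->attlen;
    return (size_t) 1 + p[0];
}

/* Precompute offsets once per descriptor; stop after the first varlena. */
static void
sketch_finalize(SketchDesc *desc)
{
    size_t off = 0;
    int    i;

    desc->first_uncached = desc->natts;
    for (i = 0; i < desc->natts; i++)
    {
        SketchAttr *att = &desc->attrs[i];

        off = sketch_align(off, (size_t) att->attalign);
        att->attcacheoff = (int32_t) off;
        if (att->attlen < 0)
        {
            /* Width varies per tuple, so later offsets cannot be cached. */
            desc->first_uncached = i + 1;
            break;
        }
        off += (size_t) att->attlen;
    }
    for (i = desc->first_uncached; i < desc->natts; i++)
        desc->attrs[i].attcacheoff = -1;
}

/* Fill offs[0..natts-1] with the byte offset of each attribute in data. */
static void
sketch_deform(const SketchDesc *desc, const uint8_t *data, int natts,
              size_t *offs)
{
    int    attnum = 0;
    size_t off = 0;

    assert(natts <= desc->natts);

    /* Fast path: this loop condition carries the extra comparison. */
    for (; attnum < natts && attnum < desc->first_uncached; attnum++)
    {
        const SketchAttr *att = &desc->attrs[attnum];

        offs[attnum] = (size_t) att->attcacheoff;
        off = offs[attnum] + sketch_attr_size(att, data + offs[attnum]);
    }

    /* Slow path: align and walk the remaining attributes one by one. */
    for (; attnum < natts; attnum++)
    {
        const SketchAttr *att = &desc->attrs[attnum];

        off = sketch_align(off, (size_t) att->attalign);
        offs[attnum] = off;
        off += sketch_attr_size(att, data + off);
    }
}
```

In this simplified model, a descriptor whose first column is variable-length has only that one cached offset, so nearly every attribute goes through the slow loop and the comparison in the fast loop buys very little, which is the kind of case where a small regression would be expected to show.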