On Tue, Jul 5, 2022 at 12:14 PM John Naylor <
john.naylor@enterprisedb.com> wrote:
> I suspect the branches in the sorttuple comparators for null handling
> are negating the benefits from better d-cache and TLB behavior. In
> that case, a useful thing to try would be to pre-partition the
> sorttuple array between null and non-null values as it's being
> populated, then sort each of those in turn. (The null portion would
> only need sorting if there was more than one sort key.) That would
> automatically take care of nulls-first vs nulls-last upfront, save a
> bunch of binary space used for branches (worth doing on its own),
To see if this idea had any legs, I simply ripped out the null handling from the specialized sorttuple comparators. This works for my tests since all the values are not null. Doing this much seems to be more effective than dual pivot in every case but purely-descending input. The combination of the two seems a bit faster still on bigint, but not on text. From this test, it seems the conclusions for tuple sorts with dual-pivot haven't changed: impressive speedup on descending input, not so much elsewhere. (spreadsheet attached, note the insertion sort thresholds are reused from previous testing, namely 10 and 18 for single / dual pivot)
> and
> the "isnull" member would not be needed anymore. That wouldn't save
> data space because of alignment, but the compiler wouldn't need to
> generate a load/store instruction for it.
This piece might not be workable, or would make external merges more difficult. That said, doing the null-handling up front when forming sort tuples seems worth doing, and I intend to come back to it at some point.
Regarding binary size, removing the null handling from the comparators shrinks the binary by about 4.3kB. Adding dual pivot on top of that expands it to about 3.5kB larger than HEAD, but I'm pretty sure I could get that down to net zero by replacing the presorted check with a partial insertion sort (as discussed before), and reusing that for the pivot selection as well. Being able to add partial insertion sort with net ~zero binary increase could be another small point in favor of dual-pivot.
I've attached the patches for this experiment for completeness, but withdrawing the CF entry for now.
--
John Naylor
EDB: 
http://www.enterprisedb.com