Hi,
On 2024-02-07 16:21:24 -0600, Nathan Bossart wrote:
> On Wed, Feb 07, 2024 at 01:48:57PM -0800, Andres Freund wrote:
> > Now, in most cases this won't matter, the sorting isn't performance
> > critical. But I don't think it's a good idea to standardize on a generally
> > slower pattern.
> >
> > Not that that's a good test, but I did quickly benchmark [1] this with
> > intarray. There's about a 10% difference in performance between using the
> > existing compASC() and one using
> > return (int64) *(const int32 *) a - (int64) *(const int32 *) b;
> >
> >
> > Perhaps we could have a central helper for this somewhere?
>
> Maybe said helper could use __builtin_sub_overflow() and fall back to the
> slow "if" version only if absolutely necessary.
I suspect that'll be worse code in the common case, given the cmov generated
by gcc & clang for the typical branch-y formulation. But it's worth testing.
> The assembly for that looks encouraging, but I still need to actually test
> it...
Possible. For 16bit upcasting to 32bit is clearly the best way. For 32 bit
that doesn't work, given the 32bit return, so we need something more.
Greetings,
Andres Freund