* Robert Haas (robertmhaas@gmail.com) wrote:
> To throw out one more point that I think is problematic, Peter's
> original email on this thread gives a bunch of examples of strxfrm()
> normalization that all different in the first few bytes - but so do
> the underlying strings. I *think* (but don't have time to check right
> now) that on my MacOS X box, strxfrm() spits out 3 bytes of header
> junk and then 8 bytes per character in the input string - so comparing
> the first 8 bytes of the strxfrm()'d representation would amount to
> comparing part of the first byte. If for any reason the first byte is
> the same (or similar enough) on many of the input strings, then this
> will probably work out to be slower rather than faster. Even if other
> platforms are more space-efficient (and I think at least some of them
> are), I think it's unlikely that this optimization will ever pay off
> for strings that don't differ in the first 8 bytes. And there are
> many cases where that could be true a large percentage of the time
> throughout the input, e.g. YYYY-MM-DD HH:MM:SS timestamps stored as
> text. It seems like that the patch pessimizes those cases, though of
> course there's no way to know without testing.
Portability and performance concerns were exactly what worried me as
well. It was my hope/understanding that this was a clear win which was
vetted by other large projects across multiple platforms. If that's
actually in doubt and it isn't a clear win then I agree that we can't be
trying to squeeze it in at this late date.
> Now it *may well be* that after doing some research and performance
> testing we will conclude that either no commonly-used platforms show
> any regressions or that the regressions that do occur are discountable
> in view of the benefits to more common cases to the benefits. I just
> don't think mid-April is the right time to start those discussions
> with the goal of a 9.4 commit; and I also don't think committing
> without having those discussions is very prudent.
I agree with this in concept- but I'd be willing to spend a bit of time
researching it, given that it's from a well known and respected author
who I trust has done much of this research already.
Thanks,
Stephen