On Thu, 2024-06-20 at 17:07 +0700, John Naylor wrote:
> On Sat, Jun 15, 2024 at 6:46 AM Jeff Davis <pgsql@j-davis.com> wrote:
> > Attached is a patch to use simplehash.h instead, which speeds
> > things up enough to make them fairly close (from around 15%
> > slower to around 8%).
>
> +#define SH_HASH_KEY(tb, key) hash_uint32((uint32) key)
>
> For a static inline hash for speed reasons, we can use murmurhash32
> here, which is also inline.

Thank you, that brings it down a few more percentage points.

New patches attached, still based on the setlocale-removal patch
series.

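For anyone following along without the tree at hand, here is a minimal
standalone sketch of the kind of mixing step murmurhash32 performs. As far
as I recall it is the MurmurHash3 32-bit finalizer; the name
murmur32_sketch and the use of stdint types are my own assumptions for
illustration, and this is not copied from the patch.

```c
#include <stdint.h>

/*
 * Hypothetical standalone stand-in for an inline 32-bit hash such as
 * murmurhash32. The constants are those of the MurmurHash3 32-bit
 * finalizer. It is cheap (shifts, xors, two multiplies) and has no
 * function-call overhead when inlined into the hash table code.
 */
static inline uint32_t
murmur32_sketch(uint32_t data)
{
    uint32_t h = data;

    h ^= h >> 16;
    h *= 0x85ebca6bU;
    h ^= h >> 13;
    h *= 0xc2b2ae35U;
    h ^= h >> 16;
    return h;
}
```

Each step is invertible, so distinct 32-bit keys (such as collation OIDs)
always map to distinct hash values; collisions only arise when the table
masks the value down to a bucket index.
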
Setup:

  create collation libc_c (provider=libc, locale='C');
  create table collation_cache_test(t text);
  insert into collation_cache_test
    select g::text||' '||g::text
      from generate_series(1,200000000) g;

Queries:

  select * from collation_cache_test where t < '0' collate "C";
  select * from collation_cache_test where t < '0' collate libc_c;

The two collations are identical except that the former benefits from
the optimization for C_COLLATION_OID, and the latter does not, so these
queries measure the overhead of the collation cache lookup.

Results (in ms):

              "C"     "libc_c"   overhead
  master:     6350    7855       24%
  v4-0001:    6091    6324        4%

(Note: I don't have an explanation for the difference in performance of
the "C" locale -- probably just some noise in the test.)
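
The overhead column is the slowdown of the libc_c query relative to the
"C" query measured on the same build. A small sketch of that arithmetic
follows; the helper name overhead_pct is hypothetical and only restates
the calculation.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage by which test_ms exceeds baseline_ms.
 * Here baseline_ms is the "C" timing and test_ms is the libc_c timing
 * from the same build.
 */
static double
overhead_pct(double baseline_ms, double test_ms)
{
    assert(baseline_ms > 0.0);
    return (test_ms - baseline_ms) / baseline_ms * 100.0;
}
```

With the timings above, overhead_pct(6350, 7855) is about 23.7 and
overhead_pct(6091, 6324) is about 3.8, which round to the 24% and 4%
shown.
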

Considering that simplehash brings the worst-case overhead under 5%, I
don't see a strong reason to also add the single-element cache.

Regards,
Jeff Davis