Re: Speed up ICU case conversion by using ucasemap_utf8To*()

From Andreas Karlsson
Subject Re: Speed up ICU case conversion by using ucasemap_utf8To*()
Date
Msg-id c380cf11-dd23-4a75-af3e-3da8ec88bf6c@proxel.se
In reply to Re: Speed up ICU case conversion by using ucasemap_utf8To*()  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Speed up ICU case conversion by using ucasemap_utf8To*()
List pgsql-hackers
On 12/20/24 8:24 PM, Jeff Davis wrote:
> On Fri, 2024-12-20 at 06:20 +0100, Andreas Karlsson wrote:
>> SELECT count(upper) FROM (SELECT upper(('Kålhuvud ' || i) COLLATE
>> "sv-SE-x-icu") FROM generate_series(1, 1000000) i);
>>
>> master:  ~540 ms
>> Patched: ~460 ms
>> glibc:   ~410 ms
> 
> It looks like you are opening and closing the UCaseMap object each
> time. Why not save it in pg_locale_t? That should speed it up even more
> and hopefully beat libc.

Fixed. New benchmarks are:

SELECT count(upper) FROM (SELECT upper(('Kålhuvud ' || i) COLLATE 
"sv-SE-x-icu") FROM generate_series(1, 1000000) i);

master:  ~570 ms
Patched: ~340 ms
glibc:   ~400 ms

So it does indeed seem like we got a further speedup and now are faster 
than glibc.

> Also, to support older ICU versions consistently, we need to fix up the
> locale name to support "und"; cf. pg_ucol_open(). Perhaps factor out
> that logic?

Fixed.
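
For readers following along, here is a minimal sketch of the kind of locale-name fixup discussed above. It assumes the fix is to rewrite a leading "und" language subtag to "root" for ICU versions that do not accept "und" as a spelling of the root locale. The name fix_und_locale() is hypothetical and is not taken from the patch. The real code would presumably use uloc_getLanguage() and palloc() rather than plain string comparison and malloc(), and unlike ICU this sketch matches "und" case-sensitively.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): return a newly allocated copy
 * of loc_str in which a leading "und" language subtag is replaced by
 * "root".  Any other locale name is copied unchanged.  The caller frees
 * the result.  Returns NULL if allocation fails.
 */
static char *
fix_und_locale(const char *loc_str)
{
    static const char und[] = "und";
    static const char root[] = "root";
    size_t      undlen = strlen(und);
    char       *result;

    assert(loc_str != NULL);

    /*
     * "und" must be the entire first subtag, so it has to be followed by
     * the end of the string or by a separator.  This keeps names such as
     * "undefined" from being rewritten.
     */
    if (strncmp(loc_str, und, undlen) == 0 &&
        (loc_str[undlen] == '\0' || loc_str[undlen] == '-' ||
         loc_str[undlen] == '_' || loc_str[undlen] == '@'))
    {
        const char *remainder = loc_str + undlen;

        result = malloc(strlen(root) + strlen(remainder) + 1);
        if (result == NULL)
            return NULL;
        strcpy(result, root);
        strcat(result, remainder);
        return result;
    }

    /* Not an "und" locale: return a plain copy. */
    result = malloc(strlen(loc_str) + 1);
    if (result == NULL)
        return NULL;
    strcpy(result, loc_str);
    return result;
}
```

Factoring the logic into one helper like this would let both the collator-opening path and the case-mapping path apply the same rewrite before handing the name to ICU.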

Andreas
