Re: Improve the performance of Unicode Normalization Forms.
From | Jeff Davis |
---|---|
Subject | Re: Improve the performance of Unicode Normalization Forms. |
Date | |
Msg-id | 4211ffd7fe154c4af693b98d78f4a3689ce8cc30.camel@j-davis.com |
In response to | Improve the performance of Unicode Normalization Forms. (Alexander Borisov <lex.borisov@gmail.com>) |
Responses | Re: Improve the performance of Unicode Normalization Forms. |
List | pgsql-hackers |
On Tue, 2025-06-03 at 00:51 +0300, Alexander Borisov wrote:
> As promised, I continue to improve/speed up Unicode in Postgres.
> Last time, we improved the lower(), upper(), and casefold()
> functions. [1]
> Now it's time for Unicode Normalization Forms, specifically
> the normalize() function.

Did you compare against other implementations, such as ICU's
normalization functions? There's also a Rust crate here:

https://github.com/unicode-rs/unicode-normalization

that might have been optimized.

In addition to the lookups themselves, there are other opportunities
for optimization as well, such as:

* reducing the need for palloc and extra buffers, perhaps by using
  buffers on the stack for small strings

* operating more directly on UTF-8 data rather than decoding and
  re-encoding the entire string

Regards,
	Jeff Davis
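
The two suggestions in the message above (stack buffers for small strings, and working on the UTF-8 bytes directly) could look roughly like the sketch below. This is only an illustration, not code from the patch or from PostgreSQL: the names `utf8_is_ascii`, `utf8_decode_one`, `decode_small_buffer` and the threshold `STACK_CODEPOINTS` are hypothetical, `malloc` stands in for `palloc` so the snippet compiles outside the server, and the input is assumed to be valid UTF-8 already. The ASCII check relies on the fact that ASCII-only text is unchanged by NFC, NFD, NFKC and NFKD, so such strings need no decoding at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical threshold: inputs up to this many bytes use the stack buffer. */
#define STACK_CODEPOINTS 64

/*
 * Byte-level fast path: returns true if every byte is ASCII.
 * ASCII-only text is already normalized in all four forms, so a caller
 * could return the input unchanged without decoding it.
 */
static bool
utf8_is_ascii(const unsigned char *s, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (s[i] & 0x80)
			return false;
	return true;
}

/*
 * Decode one code point and return the number of bytes consumed.
 * Assumption: the input is valid, complete UTF-8 (no validation here).
 */
static size_t
utf8_decode_one(const unsigned char *s, uint32_t *cp)
{
	if (s[0] < 0x80)
	{
		*cp = s[0];
		return 1;
	}
	if ((s[0] & 0xE0) == 0xC0)
	{
		*cp = ((uint32_t) (s[0] & 0x1F) << 6) | (s[1] & 0x3F);
		return 2;
	}
	if ((s[0] & 0xF0) == 0xE0)
	{
		*cp = ((uint32_t) (s[0] & 0x0F) << 12) |
			((uint32_t) (s[1] & 0x3F) << 6) | (s[2] & 0x3F);
		return 3;
	}
	*cp = ((uint32_t) (s[0] & 0x07) << 18) |
		((uint32_t) (s[1] & 0x3F) << 12) |
		((uint32_t) (s[2] & 0x3F) << 6) | (s[3] & 0x3F);
	return 4;
}

/*
 * Decode a string into code points, using a stack buffer for small inputs
 * and falling back to the heap for large ones. Returns the number of code
 * points; *used_heap reports which path was taken.
 */
static size_t
decode_small_buffer(const unsigned char *s, size_t len, bool *used_heap)
{
	uint32_t	stack_buf[STACK_CODEPOINTS];
	uint32_t   *buf = stack_buf;
	size_t		n = 0;

	assert(s != NULL && used_heap != NULL);

	/* A UTF-8 string of len bytes holds at most len code points. */
	*used_heap = len > STACK_CODEPOINTS;
	if (*used_heap)
	{
		buf = malloc(len * sizeof(uint32_t));
		if (buf == NULL)
			abort();
	}

	for (size_t i = 0; i < len;)
		i += utf8_decode_one(s + i, &buf[n++]);

	/* A real normalizer would decompose, reorder and compose buf here. */

	if (*used_heap)
		free(buf);
	return n;
}
```

A caller would typically try `utf8_is_ascii` first and only fall through to `decode_small_buffer` for strings containing non-ASCII bytes. Whether the stack-buffer threshold pays off depends on the typical string length, which would need to be measured.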