Re: Optimization for lower(), upper(), casefold() functions.
From | Jeff Davis |
---|---|
Subject | Re: Optimization for lower(), upper(), casefold() functions. |
Date | |
Msg-id | 2c10910c21b16cb9a0e5f67d80589e3acae2b6ef.camel@j-davis.com |
In reply to | Re: Optimization for lower(), upper(), casefold() functions. (Alexander Borisov <lex.borisov@gmail.com>) |
Responses | Re: Optimization for lower(), upper(), casefold() functions. |
List | pgsql-hackers |
On Wed, 2025-03-12 at 19:55 +0300, Alexander Borisov wrote:
> 1. Added static for casemap() function. Otherwise the compiler could
> not optimize the code and the performance dropped significantly.

Oops, it was static, but I made it external just to see what code it generated. I didn't intend to publish it as an external function -- thank you for catching that!

> 2. Added a fast path for codepoint < 0x80.
>
> v3j-0002:
> In the fast path for codepoints < 0x80, I added a premature return.
> This avoided additional insertions, which increased performance.

What do you mean "additional insertions"?

Also, should we just compute the results in the fast path? We don't even need a table. Rough patch attached to go on top of v4-0001.

Should we properly return CASEMAP_SELF when *simple == u1, or is it ok to return CASEMAP_SIMPLE? It probably doesn't matter performance-wise, but it feels more correct to return CASEMAP_SELF.

> Perhaps for general beauty it should be made static inline, I don't
> have a rigid position here.

We ordinarily use "static inline" if it's in a header file, and "static" if it's in a .c file, so I'll do it that way.

> I was purely based on existing approaches in Postgres, the
> Normalization Forms have them separated into different headers. Just
> trying to be consistent with existing approaches.

I think that was done for normalization primarily because it's not used #ifndef FRONTEND (see unicode_norm.c), and perhaps also because it's just a more complex function worthy of its own file.

I looked into the history, and commit 783f0cc64d explains why perfect hashing is not used in the frontend:

"The decomposition table remains the same, getting used for the binary search in the frontend code, where we care more about the size of the libraries like libpq over performance..."

Regards,
Jeff Davis
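For readers following the thread, the idea of computing ASCII results directly instead of consulting a table might look roughly like the sketch below. It is not the attached patch: the function name `ascii_casemap`, the `case_kind` enum, and the two-value result enum are hypothetical stand-ins, and only the names CASEMAP_SELF and CASEMAP_SIMPLE come from the message. The sketch assumes the caller has already checked that the codepoint is below 0x80, and it returns CASEMAP_SELF when the mapping leaves the codepoint unchanged.

```c
#include <stdint.h>

/* Hypothetical stand-ins; not the actual PostgreSQL definitions. */
typedef enum
{
	CASEMAP_SELF,				/* codepoint maps to itself */
	CASEMAP_SIMPLE				/* codepoint maps to one other codepoint */
} casemap_result;

typedef enum
{
	CASE_LOWER,
	CASE_UPPER
} case_kind;

/*
 * Fast path for codepoints < 0x80 (assumed to be checked by the caller).
 * The result is computed arithmetically, so no lookup table is needed.
 */
static casemap_result
ascii_casemap(uint32_t u1, case_kind kind, uint32_t *simple)
{
	*simple = u1;

	if (kind == CASE_LOWER && u1 >= 'A' && u1 <= 'Z')
		*simple = u1 + 0x20;	/* 'A'..'Z' -> 'a'..'z' */
	else if (kind == CASE_UPPER && u1 >= 'a' && u1 <= 'z')
		*simple = u1 - 0x20;	/* 'a'..'z' -> 'A'..'Z' */

	/* Report CASEMAP_SELF when nothing changed, as discussed above. */
	return (*simple == u1) ? CASEMAP_SELF : CASEMAP_SIMPLE;
}
```

Declaring the function plain `static` follows the convention mentioned in the message for code living in a .c file; in a header it would be `static inline`.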