Re: Optimization for lower(), upper(), casefold() functions.
From | Alexander Borisov |
---|---|
Subject | Re: Optimization for lower(), upper(), casefold() functions. |
Date | |
Msg-id | 44005c3d-88f4-4a26-981f-fd82dfa8e313@gmail.com |
In response to | Re: Optimization for lower(), upper(), casefold() functions. (Jeff Davis <pgsql@j-davis.com>) |
Responses | Re: Optimization for lower(), upper(), casefold() functions. |
List | pgsql-hackers |
12.03.2025 22:39, Jeff Davis wrote:

[...]

>> 2. Added a fast path for codepoint < 0x80.
>>
>> v3j-0002:
>> In the fast path for codepoints < 0x80, I added a premature return.
>> This avoided additional insertions, which increased performance.
>
> What do you mean "additional insertions"?

Sorry for my English. I mean, we immediately do a return in the if () condition. To avoid further branching/checking.

> Also, should we just compute the results in the fast path? We don't
> even need a table. Rough patch attached to go on top of v4-0001.
>
> Should we properly return CASEMAP_SELF when *simple == u1, or is it ok
> to return CASEMAP_SIMPLE? It probably doesn't matter performance-wise,
> but it feels more correct to return CASEMAP_SELF.

It seems to disrupt the overall "beauty" of the approach. Thus, we will copy code (bloat code), make optimizations that do not improve performance but bloat code. I would refrain from such practices. Especially since we'll be changing all that in the next patch (v4-0002).

>>
>> Perhaps for general
>> beauty it should be made static inline, I don't have a rigid position
>> here.
>
> We ordinarily use "static inline" if it's in a header file, and
> "static" if it's in a .c file, so I'll do it that way.

Great, I've changed this place. Performance has not changed in any way.

>> I was purely based on existing approaches in Postgres, the
>> Normalization Forms have them separated into different headers. Just
>> trying to be consistent with existing approaches.
>
> I think that was done for normalization primarily because it's not used
> #ifndef FRONTEND (see unicode_norm.c), and perhaps also because it's
> just a more complex function worthy of its own file.
>
> I looked into the history, and commit 783f0cc64d explains why perfect
> hashing is not used in the frontend:
>
> "The decomposition table remains the same, getting used for the binary
> search in the frontend code, where we care more about the size of the
> libraries like libpq over performance..."

I removed the extra file (unicode_case_func.h). You are right, we should not create unnecessary clutter.

v5 attached.

Regards,
Alexander Borisov
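
For readers following the fast-path discussion above, here is a minimal sketch of what "return immediately inside the if ()" and "compute the result without a table" could look like for code points below 0x80. It is not the code from any of the patches in this thread. The names CASEMAP_SELF and CASEMAP_SIMPLE are taken from the message, but the enum, the `sketch_` prefix, the `SKETCH_CASEMAP_SLOW` value, and the function signature are all assumptions made for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical result codes. Only the SELF/SIMPLE distinction comes from
 * the thread; this enum and its values are assumptions for the sketch.
 */
typedef enum
{
    SKETCH_CASEMAP_SELF,    /* code point maps to itself */
    SKETCH_CASEMAP_SIMPLE,  /* code point maps to one other code point */
    SKETCH_CASEMAP_SLOW     /* not ASCII: caller falls back to table lookup */
} SketchCasemapResult;

typedef enum
{
    SKETCH_LOWER,
    SKETCH_UPPER
} SketchCaseKind;

/*
 * Map one code point. ASCII input (< 0x80) is computed arithmetically and
 * returned right away, so no later branch or table lookup is reached.
 */
static SketchCasemapResult
sketch_casemap(uint32_t cp, SketchCaseKind kind, uint32_t *simple)
{
    if (cp < 0x80)
    {
        uint32_t mapped = cp;

        if (kind == SKETCH_LOWER && cp >= 'A' && cp <= 'Z')
            mapped = cp + ('a' - 'A');
        else if (kind == SKETCH_UPPER && cp >= 'a' && cp <= 'z')
            mapped = cp - ('a' - 'A');

        *simple = mapped;

        /* Early return: report SELF when nothing changed, SIMPLE otherwise. */
        return (mapped == cp) ? SKETCH_CASEMAP_SELF : SKETCH_CASEMAP_SIMPLE;
    }

    /* Non-ASCII: a real implementation would consult its mapping tables. */
    *simple = cp;
    return SKETCH_CASEMAP_SLOW;
}
```

In this sketch, telling SELF apart from SIMPLE costs a single comparison on the already computed value, which is in line with the remark in the thread that the choice probably does not matter performance-wise.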