Re: speed up unicode normalization quick check

From	Michael Paquier
Subject	Re: speed up unicode normalization quick check
Date	8 October 2020, 06:48:23
Msg-id	20201008064823.GC3457@paquier.xyz
In response to	Re: speed up unicode normalization quick check (Michael Paquier <michael@paquier.xyz>)
Responses	Re: speed up unicode normalization quick check
List	pgsql-hackers

On Wed, Oct 07, 2020 at 03:18:44PM +0900, Michael Paquier wrote:
> About 0001, the new set of multipliers looks fine to me.  Even if this
> adds an extra item from 901 to 902 because this can be divided by 17
> in kwlist_d.h, I also don't think that this is really much of a
> bother.  As mentioned, this impacts none of the other tables that are much
> smaller in size, on top of coming back to normal once a new keyword
> will be added.  Being able to generate perfect hash functions for much
> larger sets is a nice property to have.  While on it, I also looked at
> the assembly code with gcc -O2 for keywords.c & co and I have not
> spotted any huge difference.  So I'd like to apply this first if there
> are no objections.

I looked at this one again today, and applied it.  I looked at what
the MSVC compiler is able to do in terms of optimizations with
shift-and-add for multipliers, and it is nowhere near as good as gcc
or clang: it emits imul for basically all the primes we could use for
the perfect hash generation.
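
To make the shift-and-add point concrete, here is a minimal sketch in C.
It is not the code generated for PostgreSQL; the function names are
hypothetical, and the multiplier 31 is only an assumption for
illustration (the thread itself mentions 17).  It shows how a multiply
by a prime of the form 2^k + 1 or 2^k - 1 can be rewritten without
imul, and why a table size of 901 gets bumped to 902 when 17 is one of
the multipliers.

```c
#include <stdint.h>

/*
 * x * 17 written as one shift and one add, since 17 = 2^4 + 1.
 * Unsigned arithmetic wraps modulo 2^32, exactly like x * 17 does.
 */
static uint32_t
mul17_shift_add(uint32_t x)
{
	return (x << 4) + x;
}

/*
 * x * 31 written as one shift and one subtract, since 31 = 2^5 - 1.
 */
static uint32_t
mul31_shift_sub(uint32_t x)
{
	return (x << 5) - x;
}

/*
 * Hypothetical helper: grow the hash table size until neither
 * multiplier divides it.  Assumes both multipliers are greater than 1.
 */
static uint32_t
adjust_table_size(uint32_t size, uint32_t mult1, uint32_t mult2)
{
	while (size % mult1 == 0 || size % mult2 == 0)
		size++;
	return size;
}
```

Since 901 = 17 * 53, adjust_table_size(901, 17, 31) returns 902, which
matches the one-item growth discussed above for kwlist_d.h.  Whether a
given compiler emits the shift form or imul for a plain `x * 17` is
something to verify in the generated assembly.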

> I have tested 0002 and 0003, that had better be merged together at the
> end, and I can see performance improvements with MSVC and gcc similar
> to what is being reported upthread, with 20~30% gains for a simple
> data sample using IS NFC/NFKC.  That's cool.

For these two, I have merged both together and made some adjustments as
per the attached.  Not many tweaks, mainly some more comments for the
unicode header files as the number of structures generated gets
higher.  FWIW, with the addition of the two hash tables,
libpgcommon_srv.a grows from 1032600B to 1089240B (56640 bytes, about
5.5%), which looks like a small price to pay for the ~30% performance
gains with the quick checks.
--
Michael
