Re: speed up unicode normalization quick check

From: Michael Paquier
Subject: Re: speed up unicode normalization quick check
Msg-id: 20201007061844.GB30037@paquier.xyz
In response to: Re: speed up unicode normalization quick check  (Mark Dilger <mark.dilger@enterprisedb.com>)
Responses: Re: speed up unicode normalization quick check  (Michael Paquier <michael@paquier.xyz>)
List: pgsql-hackers
On Sat, Sep 19, 2020 at 04:09:27PM -0700, Mark Dilger wrote:
> I am marking this ready for committer.  I didn't object to the
> whitespace weirdness in your patch (about which `git apply`
> grumbles) since you seem to have done that intentionally.  I have no
> further comments on the performance issue, since I don't have any
> other platforms at hand to test it on.  Whichever committer picks
> this up can decide if the issue matters to them enough to punt it
> back for further performance testing.

About 0001, the new set of multipliers looks fine to me.  It does grow
the table in kwlist_d.h by one item, from 901 to 902, because 901 is
divisible by 17, but I don't think that this is much of a bother.  As
mentioned, this impacts none of the other tables, which are much
smaller in size, and the size will come back to normal once a new
keyword is added.  Being able to generate perfect hash functions for
much larger sets is a nice property to have.  While on it, I also
looked at the assembly code with gcc -O2 for keywords.c & co and I have
not spotted any huge difference.  So I'd like to apply this first if
there are no objections.
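
To illustrate the 901 -> 902 bump, here is a minimal sketch in Python
(not the actual generator code).  It assumes that the table size starts
at 2 * nkeys + 1 and is incremented until none of the hash multipliers
divides it.  The names table_size and distinct_residues, the key count
of 450 (picked only so that the starting size is 901) and the
multiplier pair (257, 17) are hypothetical and used for illustration.

```python
def table_size(num_keys, multipliers):
    """Smallest size >= 2 * num_keys + 1 that no multiplier divides.

    Assumption: this sizing rule is a sketch, not the generator's code.
    """
    size = 2 * num_keys + 1
    while any(size % m == 0 for m in multipliers):
        size += 1
    return size


def distinct_residues(multiplier, size):
    """Count the distinct values of (x * multiplier) % size for x in [0, size)."""
    return len({(x * multiplier) % size for x in range(size)})


if __name__ == "__main__":
    # Hypothetical inputs: 450 keys give a starting size of 901 = 17 * 53.
    print(table_size(450, (257, 17)))
    # A multiplier sharing a factor with the size collapses the residues.
    print(distinct_residues(17, 901))
    print(distinct_residues(17, 902))
```

With a size of 901, multiplying by 17 modulo the size can only reach 53
distinct slots, while with 902 all 902 slots stay reachable, which is
why one extra item is a cheap price to pay.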

I have tested 0002 and 0003, which had better be merged together in the
end, and I can see performance improvements with MSVC and gcc similar
to what is being reported upthread, with 20~30% gains for a simple
data sample using IS NFC/NFKC.  That's cool.
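
For readers not familiar with what an IS NFC/NFKC check answers, here
is a rough analogue using Python's standard unicodedata module.  This
is only an illustration of the question being asked (is the string
already in the given normal form?), not of the patch or of its lookup
tables; the helper name check_forms is hypothetical.

```python
import unicodedata


def check_forms(text):
    """Report whether text is already in NFC and in NFKC."""
    return {
        "NFC": unicodedata.is_normalized("NFC", text),
        "NFKC": unicodedata.is_normalized("NFKC", text),
    }


if __name__ == "__main__":
    # Precomposed e-acute, decomposed e + combining acute, and the "fi" ligature.
    for sample in ("\u00e9", "e\u0301", "\ufb01"):
        print(repr(sample), check_forms(sample))
```

The decomposed sequence fails both checks, while the ligature passes
NFC but not NFKC, since only the compatibility form rewrites it.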

Including unicode_normprops_table.h in what gets ignored by pgindent
is also fine in the end, even with the changes that bring the generated
structures more in line with what pgindent produces.  One tiny comment
I have is that I would have added an extra comment in the generated
unicode header to document the set of structures generated for the
perfect hash, but that's easy enough to add.
--
Michael
