Re: [HACKERS] Extra Vietnamese unaccent rules

From Michael Paquier
Subject Re: [HACKERS] Extra Vietnamese unaccent rules
Msg-id CAB7nPqQg4jioETBudh0VhpS6s3NWmC4OWqcTiCx_ZHBa8p19_A@mail.gmail.com
In reply to Re: [HACKERS] Extra Vietnamese unaccent rules  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Mon, May 29, 2017 at 10:47 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
>> [Quoting Michael]
>>> Actually, with the recent work that has been done with
>>> unicode_norm_table.h which has been to transpose UnicodeData.txt into
>>> user-friendly tables, shouldn't the python script of unaccent/ be
>>> replaced by something that works on this table? This does a canonical
>>> decomposition but just keeps the first characters with a class
>>> ordering of 0. So we have basic APIs able to look at UnicodeData.txt
>>> and let caller do decision making with the result returned.
>>
>> Thanks, i will learning about it.
>
> It seems like that could be useful for runtime use (I'm sure there is
> a whole world of Unicode support we could add), but here we're only
> trying to generate a mapping file to add to the source tree, so I'm
> not sure how it's relevant.

Yes, that's what I am getting at, but that would be really
dictionary-specific, and it would roughly amount to providing a
fast-path equivalent to the tsearch_readline* routines, which work on
files. The addition of new infrastructure may not be worth the
performance gains. For this fix there is definitely no need to do
anything more complicated than tweaking the script to generate the
rules.
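
To illustrate the kind of rule generation discussed above, here is a
minimal sketch, not the actual contrib/unaccent script. It assumes
Python's standard unicodedata module as a stand-in for parsing
UnicodeData.txt directly, and the function names strip_marks and
generate_rules are hypothetical. It does a canonical decomposition and
keeps only the characters whose combining class is 0:

```python
import unicodedata


def strip_marks(ch):
    """Canonically decompose ch and keep only the code points whose
    combining class is 0, i.e. drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if unicodedata.combining(c) == 0)


def generate_rules(first, last):
    """Yield (accented, plain) pairs for code points in [first, last].

    Characters that are unchanged by strip_marks produce no rule.
    """
    for cp in range(first, last + 1):
        ch = chr(cp)
        plain = strip_marks(ch)
        if plain != ch:
            yield ch, plain


if __name__ == "__main__":
    # U+1EA0..U+1EF9 holds the precomposed Vietnamese letters of the
    # Latin Extended Additional block.
    for src, dst in generate_rules(0x1EA0, 0x1EF9):
        # One rule per line: accented character, whitespace, replacement.
        print(f"{src}\t{dst}")
```

Letters without a canonical decomposition (U+0110, D with stroke, is
one) pass through unchanged here, so a real script would still need
separate handling for them.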
-- 
Michael


