Re: [PATCH] Completed unaccent dictionary with many missing characters

From	Przemysław Sztoch
Subject	Re: [PATCH] Completed unaccent dictionary with many missing characters
Date	5 May 2022, 22:40:09
Msg-id	425e10c2-95ae-8ff4-4185-ab9ebbfff16f@sztoch.pl
In reply to	Re: [PATCH] Completed unaccent dictionary with many missing characters (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List	pgsql-hackers

Peter Eisentraut wrote on 5/4/2022 5:17 PM:

> On 28.04.22 18:50, Przemysław Sztoch wrote:
>> The current unaccent dictionary does not include many popular numeric symbols,
>> for example: "m²" -> "m2"
>
> Seems reasonable.
>
> Can you explain what your patch does to achieve this?

I reused the existing Python implementation of the generator.
It is driven by a ready-made Unicode data file: src/common/unicode/UnicodeData.txt.
The current generator filtered UnicodeData.txt too aggressively.
I relaxed these conditions, because the previous implementation handled only selected character types.
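
To illustrate the kind of mapping involved, here is a minimal sketch. It is not the generator from the patch: it assumes Python's standard unicodedata module (which is built from the same Unicode Character Database as UnicodeData.txt) in place of parsing the file, and the helper name plain_equivalent is hypothetical.

```python
import unicodedata


def plain_equivalent(text):
    """Return an ASCII replacement for `text`, or None if there is none.

    Illustration only: uses NFKD compatibility decomposition from the
    standard library instead of reading UnicodeData.txt directly.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop combining marks (accents) left behind by the decomposition.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Only report a mapping when something changed and the result is ASCII.
    if stripped and stripped != text and stripped.isascii():
        return stripped
    return None


if __name__ == "__main__":
    for sample in ["²", "é", "a"]:
        print(repr(sample), "->", repr(plain_equivalent(sample)))
```

A filter that accepts only letters with diacritics would skip "²", because it is a numeric symbol with a compatibility decomposition rather than an accented letter. That is the sort of condition the patch relaxes.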

Browsing the resulting unaccent.rules file is the easiest way to see which missing characters have been added, and how many.

For FTS, these additional characters are badly needed.

--
Przemysław Sztoch | Mobile +48 509 99 00 66
