Re: BUG #15548: Unaccent does not remove combining diacritical characters

From Tom Lane
Subject Re: BUG #15548: Unaccent does not remove combining diacritical characters
Date
Msg-id 16726.1544827803@sss.pgh.pa.us
In response to Re: BUG #15548: Unaccent does not remove combining diacritical characters  (Hugh Ranalli <hugh@whtc.ca>)
Responses Re: BUG #15548: Unaccent does not remove combining diacritical characters  (Hugh Ranalli <hugh@whtc.ca>)
List pgsql-bugs
Hugh Ranalli <hugh@whtc.ca> writes:
> I've attached a patch that removes combining diacriticals. As with Latin and
> Greek letters, it uses ranges to restrict its activity.

Cool.  Please add it to the current CF so we don't forget about it:
https://commitfest.postgresql.org/21/

> I have not submitted a patch for unaccent.rules, as it seems that a rules
> file generated from generate_unaccent_rules.py will actually remove a large
> number of rules (even before my changes), such as replacing the copyright
> symbol © with (C), as well as other accented characters. It's probably
> worth asking if the shipped unaccent.rules should correspond to what the
> shipped generation utility produces, or not. I was surprised to see that it
> didn't.

Me too -- seems like that bears looking into.  Perhaps the script's
results are platform dependent -- what were you testing on?

            regards, tom lane
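
For readers who want a concrete picture of the range-restricted removal of combining diacriticals described in the quoted message, here is a minimal Python sketch. It is not the attached patch and is not taken from generate_unaccent_rules.py. The names COMBINING_RANGES, is_combining_diacritical and strip_combining are hypothetical, and the single range U+0300..U+036F (the Unicode "Combining Diacritical Marks" block) is an assumption; the actual patch may cover different or additional ranges.

```python
import unicodedata

# Assumption: only the "Combining Diacritical Marks" block is covered.
# The real patch may use other or additional ranges.
COMBINING_RANGES = [(0x0300, 0x036F)]


def is_combining_diacritical(ch):
    """Return True if ch falls inside one of the assumed code point ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in COMBINING_RANGES)


def strip_combining(text):
    """Drop every character that lies in the assumed combining ranges."""
    return "".join(ch for ch in text if not is_combining_diacritical(ch))


if __name__ == "__main__":
    # NFD splits a precomposed letter into a base letter plus combining marks,
    # which is the form of input the bug report is about.
    decomposed = unicodedata.normalize("NFD", "café")
    print(strip_combining(decomposed))
```

Restricting the check to explicit ranges, rather than dropping every character in Unicode category Mn, keeps the behaviour limited to the blocks that were deliberately chosen.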

