Re: BUG #15548: Unaccent does not remove combining diacritical characters

Поиск
Список
Период
Сортировка
От Hugh Ranalli
Тема Re: BUG #15548: Unaccent does not remove combining diacritical characters
Дата
Msg-id CAAhbUMOX4QLj6c0O3GnjZYtR2dpAowss832Bq1n7oJyByeR7kQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #15548: Unaccent does not remove combining diacritical characters  (Thomas Munro <thomas.munro@enterprisedb.com>)
Ответы Re: BUG #15548: Unaccent does not remove combining diacritical characters  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs

On Sat, 15 Dec 2018 at 21:26, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
+1 for updating to the latest file from time to time.  After
http://unicode.org/cldr/trac/ticket/11383 makes it into a new release,
our special_cases() function will have just the two Cyrillic
characters, which should almost certainly be handled by adding
Cyrillic to the ranges we handle via the usual code path, and DEGREE
CELSIUS and DEGREE FAHRENHEIT.  Those degree signs could possibly be
extracted from Unicode.txt (or we could just forget about them), and
then we could drop special_cases().
Well, when I modified the code to handle the new version of the transliteration file, I discovered that was sufficient to handle the old version as well. That's not the way things usually go, but I'll take it. ;-)

I've attached two patches, one to update generate_unaccent_rules.py, and another that updates unaccent.rules from the v34 transliteration file. I'll be happy to add these to the CF. Does anyone need to review them and give me approval before I do so?

Best wishes,
Hugh 

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: BUG #15556: Duplicate key violations even when using ON CONFLICTDO UPDATE
Следующее
От: Tom Lane
Дата:
Сообщение: Re: BUG #15555: Syntax errors when using the COMMENT command in plpgsql and a "comment" variable