Re: BUG #15548: Unaccent does not remove combining diacritical characters

Поиск
Список
Период
Сортировка
От Ramanarayana
Тема Re: BUG #15548: Unaccent does not remove combining diacritical characters
Дата
Msg-id CAKm4Xs4zKcNYW=-E9C8h_o74xhOrw4miZRK0krya1puEqKAECA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #15548: Unaccent does not remove combining diacriticalcharacters  (Michael Paquier <michael@paquier.xyz>)
Ответы Re: BUG #15548: Unaccent does not remove combining diacritical characters  (Hugh Ranalli <hugh@whtc.ca>)
Список pgsql-hackers
Hi Michael,
The issue was that the python script was working in python 2 but not in python 3 in Windows. This is because the python script writes the final output to stdout and stdout encoding is set to utf-8 only for python 2 but not python 3.If no encoding is set for stdout it takes the encoding from the Operating system.Default encoding in linux and windows might be different.Hence this issue.
Regards,
Ram.

On Tue, 12 Feb 2019 at 09:48, Michael Paquier <michael@paquier.xyz> wrote:
On Tue, Feb 12, 2019 at 02:27:31AM +0530, Ramanarayana wrote:
> I tested the script in python 2.7 and it works perfect. The problem is in
> python 3.7(and may be only in windows as you were not getting the issue)
> and I was getting the following error
>
> UnicodeEncodeError: 'charmap' codec can't encode character '\u0100' in
> position 0: character maps to <undefined>
>
>  I went through the python script and found that the stdout encoding is set
> to utf-8 only  if python version is <=2.
>
> I have made the same change for python version 3 as well. Please find the
> patch for the same.Let me know if it makes sense

Isn't that because Windows encoding becomes cp1252, utf16 or such?
FWIW, on Debian SID with Python 3.7, I get the correct output, and no
diffs on HEAD.  Perhaps it would make sense to use open() on the
different files with encoding='utf-8' to avoid any kind of problems?
--
Michael


--
Cheers
Ram 4.0

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: Too rigorous assert in reorderbuffer.c
Следующее
От: Andreas Karlsson
Дата:
Сообщение: Re: libpq compression