Re: Mac OS: invalid byte sequence for encoding "UTF8"

From	Tom Lane
Subject	Re: Mac OS: invalid byte sequence for encoding "UTF8"
Date	February 10, 2016, 23:01:04
Msg-id	17166.1455145239@sss.pgh.pa.us
In response to	Re: Mac OS: invalid byte sequence for encoding "UTF8" (Larry Rosenman <ler@lerctr.org>)
Responses	Re: Mac OS: invalid byte sequence for encoding "UTF8"
List	pgsql-hackers

Larry Rosenman <ler@lerctr.org> writes:
> On 2016-02-10 16:19, Tom Lane wrote:
>> I looked into the OS X sources, and found that indeed you are right:
>> *scanf processes the input a byte at a time, and applies isspace() to
>> each byte separately, even when the locale is such that that's a
>> clearly insane thing to do.  Since this code was derived from FreeBSD,
>> FreeBSD has or once had the same issue.  (A look at the freebsd project
>> on github says it still does, assuming that's the authoritative repo.)
>> Not sure about other BSDen.

> Definitive FreeBSD Sources:
> https://svnweb.freebsd.org/base/

Ah, thanks for the link.  I'm not totally sure which branch is most
current, but at least on this one, it's still clearly wrong:
https://svnweb.freebsd.org/base/stable/10/lib/libc/stdio/vfscanf.c?revision=291336&view=markup
convert_string(), which handles %s, applies isspace() to individual bytes
regardless of locale.  convert_wstring(), which handles %ls, does it more
intelligently ... but as I said upthread, relying on %ls would just give
us a different set of portability problems.
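
To illustrate the kind of replacement this points toward, here is a minimal
sketch of a whitespace-delimited token reader that does not depend on the
locale's isspace() at all. It is not Artur's patch and not PostgreSQL code;
the names is_ascii_space() and next_token() are hypothetical. The assumption
is that only the six ASCII whitespace bytes should separate tokens, so a
trailing byte of a multibyte UTF-8 character (0x85 or 0xA0, for example)
can never be mistaken for a delimiter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: true only for the six ASCII whitespace bytes.
 * Unlike isspace(), the result never depends on the current locale.
 */
int
is_ascii_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\v' || c == '\f' || c == '\r';
}

/*
 * Hypothetical helper: copy the next whitespace-delimited token from *src
 * into dst, storing at most dstlen - 1 bytes plus a terminating NUL, and
 * advance *src to the first byte after the token.  Returns the number of
 * bytes stored in dst; 0 means no token was left.
 */
size_t
next_token(const char **src, char *dst, size_t dstlen)
{
    const char *p = *src;
    const char *start;
    size_t      len;

    assert(dstlen > 0);

    /* Skip leading ASCII whitespace. */
    while (*p != '\0' && is_ascii_space((unsigned char) *p))
        p++;

    /* Consume bytes up to the next ASCII whitespace or end of string. */
    start = p;
    while (*p != '\0' && !is_ascii_space((unsigned char) *p))
        p++;

    len = (size_t) (p - start);
    if (len > dstlen - 1)
        len = dstlen - 1;       /* truncate rather than overflow dst */
    memcpy(dst, start, len);
    dst[len] = '\0';

    *src = p;
    return len;
}
```

Calling next_token() repeatedly on the same pointer walks through a line
one token at a time, much as a sequence of %s conversions would, but a
UTF-8 character whose encoding contains 0x85 or 0xA0 stays inside its token.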

It looks like Artur's patch is indeed what we need to do, along with
looking around for other *scanf() uses that are vulnerable.
        regards, tom lane
