Re: Mac OS: invalid byte sequence for encoding "UTF8"

From: Teodor Sigaev
Subject: Re: Mac OS: invalid byte sequence for encoding "UTF8"
Msg-id: 56BB5C84.8060106@sigaev.ru
In response to: Re: Mac OS: invalid byte sequence for encoding "UTF8"  (Artur Zakirov <a.zakirov@postgrespro.ru>)
Responses: Re: Mac OS: invalid byte sequence for encoding "UTF8"  (Artur Zakirov <a.zakirov@postgrespro.ru>)
List: pgsql-hackers
> It seems that *scanf() with the %s format occurs only here:
> - check.c - get_bin_version()
> - server.c - get_major_server_version()
> - filemap.c - isRelDataFile()
> - pg_backup_directory.c - _LoadBlobs()
> - xlog.c - do_pg_stop_backup()
> - mac.c - macaddr_in()
> I think sscanf() does not deal with UTF-8 characters in these places. So this
> is probably only a spell.c issue.

Hmm. Here, in
src/backend/access/transam/xlog.c read_tablespace_map(),
the use of %s in scanf looks suspicious. I don't fully understand it, but it looks like
it tries to read an OID as a string. So it should be safe in the usual case.

Next, _LoadBlobs() reads a file name (fname) with the help of sscanf(). Could the file name
be in UTF-8 encoding here?

>
> I agree that the previous patch is wrong. Instead of using the new parse_ooaffentry()
> function, maybe it is better to use sscanf() with the %ls format. The %ls format is used to
> read a wide character string.
Does the %ls modifier exist everywhere?
The Apple docs say
(https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/sscanf.3.html):
s ...  If an l qualifier is present, the next pointer must be a pointer to wchar_t, into which the input will be
placed after conversion by mbrtowc

Actually, it means that a wchar2char() call would have to be used, but that uses wcstombs[_l], which may not be present on some
platforms. Does that mean the l modifier for strings is present there too, or not? What do we need to do if %ls exists but
wcstombs[_l] does not?
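
To make the dependency concrete, here is a rough sketch of what the %ls route involves: the input is converted to wchar_t by sscanf() and then has to be converted back to a multibyte string. The helper name first_word_via_wide() and the buffer size are made up for illustration, and plain wcstombs() stands in for PostgreSQL's wchar2char(), whose internals are not shown here. Both conversions depend on the current LC_CTYPE locale, so non-ASCII input only works after a suitable setlocale() call.

```c
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>

/*
 * Hypothetical helper, for illustration only: read the first
 * whitespace-delimited word of "line" with %ls, then convert it back
 * to a multibyte string with wcstombs().
 * Returns 0 on success, -1 on any conversion failure.
 */
static int
first_word_via_wide(const char *line, char *out, size_t outlen)
{
    wchar_t     wbuf[64];
    size_t      n;

    /* %63ls stores at most 63 wide characters plus a terminating L'\0' */
    if (sscanf(line, "%63ls", wbuf) != 1)
        return -1;

    /* the reverse conversion; this is the call that may be missing */
    n = wcstombs(out, wbuf, outlen);
    if (n == (size_t) -1 || n >= outlen)
        return -1;              /* unconvertible character, or no room for NUL */

    return 0;
}
```

So a platform needs both halves (the l modifier in scanf and a working wide-to-multibyte conversion) before this approach is usable at all.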

I've been driven a bit crazy by locale problems, and it seems to me that Artur's patch is
a good idea. Actually, I don't remember exactly, but it seems that commit
7ac8a4be8946c11d5a6bf91bb971b9750c1c60e5 introduced parse_affentry() in place of the
corresponding sscanf() call to avoid problems with encoding and scanf.
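
For illustration, the general idea of a hand-written parser can be sketched as below. This is not the code of parse_affentry() or parse_ooaffentry(); next_field() and is_ascii_space() are hypothetical names. The sketch assumes that fields are separated by ASCII whitespace only, so every byte >= 0x80 (that is, every byte of a multibyte UTF-8 sequence) is copied as plain data, and the C library's locale is never consulted.

```c
#include <stddef.h>
#include <string.h>

/* ASCII-only whitespace test; deliberately does not call isspace() */
static int
is_ascii_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/*
 * Hypothetical helper: copy the next whitespace-delimited field from
 * *src into dst (at most dstlen - 1 bytes plus NUL; dstlen must be > 0)
 * and advance *src past it.  Returns 1 if a field was found, 0 at end
 * of input.  An overlong field is truncated bytewise, which could cut
 * a multibyte character; real code would have to handle that.
 */
static int
next_field(const char **src, char *dst, size_t dstlen)
{
    const char *p = *src;
    size_t      n = 0;

    /* skip leading separators */
    while (*p && is_ascii_space((unsigned char) *p))
        p++;
    if (*p == '\0')
    {
        *src = p;
        return 0;
    }

    /* copy bytes up to the next separator or end of string */
    while (*p && !is_ascii_space((unsigned char) *p))
    {
        if (n + 1 < dstlen)
            dst[n++] = *p;
        p++;
    }
    dst[n] = '\0';
    *src = p;
    return 1;
}
```

Calling next_field() repeatedly on one line of an affix file would yield its fields one by one, whatever the locale is.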




-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
  WWW: http://www.sigaev.ru/
 


