Re: Support UTF-8 files with BOM in COPY FROM

From: Brar Piening
Subject: Re: Support UTF-8 files with BOM in COPY FROM
Msg-id: 4E80CB15.10706@gmx.de
In reply to: Re: Support UTF-8 files with BOM in COPY FROM  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
Tom Lane wrote:
> Putting a BOM into UTF8 data is flat out invalid per spec --- the fact
> that Microsloth does it does not make it standards-conformant.

Could you share a pointer to the spec?
All I've ever heard is that a BOM is optional for UTF-8 but not forbidden.

The Unicode FAQ (http://unicode.org/faq/utf_bom.html#BOM) states that 
"some recipients of UTF-8 encoded data do not expect a BOM".
Postgres obviously belongs to those recipients.
That's why all my psql scripts transferring data from MSSQL to Postgres 
need a '\! perl -CD -pi.orig -e "tr/\x{feff}//d" "C:/datafile.txt"' 
before feeding data into COPY FROM.

Reading it tolerantly and writing it on user request is probably the way 
that would help most users.
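
To make "reading tolerantly, writing on request" concrete, here is a minimal 
Python sketch of that behaviour. It is only an illustration, not PostgreSQL 
code: the function names read_tolerantly and write_with_optional_bom are 
made up for this example. Unlike the Perl one-liner above, which deletes 
every U+FEFF in the file, it only drops a single BOM at the very start.

```python
import codecs


def read_tolerantly(data: bytes) -> bytes:
    """Drop one leading UTF-8 BOM (bytes EF BB BF) if present.

    Hypothetical helper: input without a BOM is returned unchanged,
    and a U+FEFF later in the data is left alone.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def write_with_optional_bom(data: bytes, with_bom: bool = False) -> bytes:
    """Prepend a UTF-8 BOM only when the caller explicitly asks for it.

    Hypothetical helper: the default output carries no BOM.
    """
    return codecs.BOM_UTF8 + data if with_bom else data
```

Under this sketch a file exported with a BOM and one exported without it 
yield the same bytes after reading, and output stays BOM-free unless the 
user opts in.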

Regards,

Brar


