Re: psql blows up on BOM character sequence

From Tom Lane
Subject Re: psql blows up on BOM character sequence
Date
Msg-id 24831.1395702319@sss.pgh.pa.us
In response to Re: psql blows up on BOM character sequence  (Jim Nasby <jim@nasby.net>)
Responses Re: psql blows up on BOM character sequence  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
Jim Nasby <jim@nasby.net> writes:
> Wait... I thought that was one of the objections... that we wanted to
> leave a BOM in something like a COPY untouched?

I think most of us are okay with stripping a BOM that appears at the
*beginning* of a text file (assuming there's reason to believe the file
is in UTF8 encoding).  BOM sequences embedded later in the file are a lot
more debatable, and I for one don't want to assume those can be dropped.
I don't know of any legitimate usage of such cases, and think it's
probably better to report an encoding error.
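
The policy described above (strip a BOM only at the very start of a UTF-8 file, and treat one anywhere later as an error) can be sketched roughly as follows. This is an illustration only, not psql's actual code; the function name strip_leading_bom and the error wording are hypothetical.

```python
import codecs

# The UTF-8 encoding of U+FEFF: the three bytes EF BB BF.
BOM = codecs.BOM_UTF8


def strip_leading_bom(data: bytes) -> bytes:
    """Drop a UTF-8 BOM at the start of the input; reject one found later.

    Hypothetical helper: assumes the caller already has reason to believe
    the input is UTF-8 encoded.
    """
    if data.startswith(BOM):
        data = data[len(BOM):]
    # In valid UTF-8 the byte EF can only begin a character, so a plain
    # byte search cannot match in the middle of some other character.
    if BOM in data:
        raise ValueError("BOM sequence found after the start of the input")
    return data
```

For example, strip_leading_bom(b"\xef\xbb\xbfSELECT 1;") returns b"SELECT 1;", while input with the same three bytes after the first statement raises ValueError instead of silently dropping them.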

> Uh... could we just treat BOM as another whitespace character?

A BOM is *most certainly not* whitespace.  The only even semi-legitimate
usage it has in UTF8 is as a file encoding marker.  You can bet that the
user whose text editor made the file did not think he had whitespace at
the front.  Anyway, your proposition that leading whitespace is ignorable
fails completely for data files.
        regards, tom lane
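
Both points in the last paragraph can be checked quickly. The sketch below uses Python's Unicode tables to show that U+FEFF is classified as a format character rather than whitespace, and that trimming leading whitespace changes the value of a field in a tab-separated data line. The sample row is made up for illustration.

```python
import unicodedata

BOM = "\ufeff"


def describe(ch: str) -> tuple:
    """Return (Unicode general category, whether str.isspace() accepts it)."""
    return unicodedata.category(ch), ch.isspace()


bom_info = describe(BOM)    # category Cf (format), not whitespace
space_info = describe(" ")  # category Zs (space separator), whitespace

# A made-up tab-separated data line whose first field starts with two spaces.
row = "  padded\t42"
fields = row.split("\t")

# Trimming whitespace would alter the stored value of the first field.
trimmed_first = fields[0].strip()

# Whitespace trimming also leaves a leading BOM in place.
still_has_bom = (BOM + "SELECT 1;").strip().startswith(BOM)
```

Here fields[0] is "  padded" while trimmed_first is "padded", so a loader that ignored leading whitespace would be changing the data.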


