Re: Using psql -f to load a UTF8 file

From Chris Angelico
Subject Re: Using psql -f to load a UTF8 file
Date
Msg-id CAPTjJmrx3Njx30=F9indfZZ5_8v5xfWsWZqD2aLiLLXmu78O_w@mail.gmail.com
In reply to Re: Using psql -f to load a UTF8 file  (Craig Ringer <ringerc@ringerc.id.au>)
List pgsql-general
On Fri, Sep 21, 2012 at 11:21 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
> I strongly disagree. The BOM provides a useful and standard way to
> differentiate UTF-8 encoded text files from the random pile of encodings
> that any given file could be.

The only reliable way to ascertain the encoding of a hunk of data is
with something out-of-band. Relying on the first three bytes being
\xEF\xBB\xBF is not much more reliable than detecting based on octet
frequency, which is what leads to the "Bush hid the facts" hack in
Notepad. This is why many Internet protocols carry metadata along
with the file (e.g. Content-Type in HTTP), rather than relying on
internal evidence.
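
As a rough illustration of that point (a sketch added for clarity, not something from the original thread): the same three leading bytes decode without error under several encodings, so their presence is evidence of UTF-8 rather than proof. The helper name decode_candidates and the sample statement are made up for the example.

```python
BOM = b"\xef\xbb\xbf"  # the three bytes discussed above


def decode_candidates(data: bytes) -> dict:
    """Try a few encodings and report what each one makes of the bytes.

    Hypothetical helper for illustration only; a value of None means
    the bytes are not valid in that encoding.
    """
    results = {}
    for enc in ("utf-8", "latin-1", "cp1252"):
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


sample = BOM + b"SELECT 1;"
candidates = decode_candidates(sample)

# Every candidate succeeds: UTF-8 sees U+FEFF followed by the SQL,
# while the single-byte encodings see three ordinary printable
# characters followed by the same SQL.
for enc, text in candidates.items():
    print(enc, repr(text))
```

All three decodings succeed, which is why out-of-band metadata is the only thing that settles the question.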

> psql should accept UTF-8 with BOM.

However, this I would agree with. It's cheap enough to detect, and
aside from arbitrarily trying to kill Notepad (which won't happen
anyway), there's not a lot of reason to choke on the BOM. But it's not
a big deal.
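
To show how cheap the check is, here is a minimal sketch of skipping a leading UTF-8 BOM before handing the bytes to a parser. It is not how psql handles its input; strip_utf8_bom is a hypothetical helper, and it assumes the caller already knows by other means that the file is UTF-8.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present.

    Assumes the input is already known to be UTF-8; this does not
    attempt to detect the encoding.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


script = strip_utf8_bom(b"\xef\xbb\xbfSELECT 1;")
print(script)
```

In Python specifically, decoding with the "utf-8-sig" codec has the same effect: it drops a leading BOM if there is one and otherwise behaves like plain UTF-8.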

ChrisA

