Re: UTF8 with BOM support in psql

From: Tom Lane
Subject: Re: UTF8 with BOM support in psql
Msg-id: 29075.1258473002@sss.pgh.pa.us
In response to: Re: UTF8 with BOM support in psql  (Peter Eisentraut <peter_e@gmx.net>)
List: pgsql-hackers

Peter Eisentraut <peter_e@gmx.net> writes:
> I think I could support using the presence of the BOM as a fall-back
> indicator of encoding in absence of any other declaration.  It seems to
> me, however, that the description above ignores the existence of
> encodings other than SQL_ASCII and UTF8.

Yeah.  This entire proposal rests on the assumption that UTF8 is the
only encoding that really matters, and introducing a possibility of
breaking things for users of other encodings is acceptable damage.
I do not think that supporting a deprecated-by-standards behavior
is worth that.

Even assuming that we had consensus on a behavior that involved
silently changing client_encoding, I do not believe that it's practical
to implement it in an acceptable fashion.  Just issuing a SET behind the
user's back will not work in a number of scenarios:

* We are inside a transaction when \i is called, and the file contains
a ROLLBACK.

* We are inside a failed transaction when \i is called --- the SET won't
even work at all.

* Same two cases inside a savepoint.

* The file contains a \c command.
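
The first two hazards can be illustrated with a toy model. This is only a sketch: ToySession and its methods are hypothetical names, not psql or libpq code. It assumes two documented backend behaviors, namely that a SET issued inside a transaction is undone if that transaction is rolled back, and that a failed transaction rejects further commands until it ends.

```python
class ToySession:
    """Toy stand-in for a backend session (hypothetical, for illustration)."""

    def __init__(self, client_encoding):
        self.client_encoding = client_encoding
        self.in_transaction = False
        self.failed = False
        self._saved_encoding = None  # value of the setting at BEGIN

    def begin(self):
        self.in_transaction = True
        self.failed = False
        self._saved_encoding = self.client_encoding

    def fail(self):
        # Model an error inside a transaction block.
        if self.in_transaction:
            self.failed = True

    def set_client_encoding(self, value):
        # A failed transaction rejects commands until it is ended.
        if self.failed:
            return False
        self.client_encoding = value
        return True

    def rollback(self):
        # Rolling back also undoes any SET issued inside the transaction.
        if self.in_transaction:
            self.client_encoding = self._saved_encoding
        self.in_transaction = False
        self.failed = False


def hazard_rollback():
    """\\i is called inside a transaction and the file contains a ROLLBACK."""
    s = ToySession("LATIN1")
    s.begin()
    s.set_client_encoding("UTF8")  # the SET issued behind the user's back
    s.rollback()                   # ROLLBACK found in the included file
    return s.client_encoding       # encoding used for the rest of the file


def hazard_failed_transaction():
    """\\i is called inside an already-failed transaction."""
    s = ToySession("LATIN1")
    s.begin()
    s.fail()
    accepted = s.set_client_encoding("UTF8")
    return accepted, s.client_encoding
```

In the first scenario the rest of the file is read under the old encoding even though the BOM said UTF8; in the second the implicit SET never takes effect.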

If you expect that the previous client_encoding should be restored at
the end of the \i inclusion (as I certainly would) then you have the
first three hazards at file end as well, except that now the odds of
being inside a failed transaction are significantly higher.  Also,
what if the file contained a SET CLIENT_ENCODING command itself?
How should that interact with this?

Lastly, a silent change of client_encoding would also affect the
encoding of notice and error messages that come out while the \i
file is running.  I fail to find that non-astonishing, either.

I think that the only way this sort of behavior could be implemented
without a bunch of broken corner cases would be if we put the
responsibility of encoding conversion inside psql, so that switching its
idea of the encoding was just a local change rather than something it
had to ask the backend to do, and it could be careful to apply the
encoding only to the data coming from the \i file.  Which is possible,
perhaps, but it hardly seems that slightly-more-convenient BOM handling
is worth it.
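
As a rough illustration of what "conversion inside psql" could look like, here is a minimal sketch in Python rather than psql's C. The function name read_script_locally is hypothetical, and nothing like it is claimed to exist in psql. It assumes the file is UTF-8 whenever it starts with a UTF-8 BOM, drops the BOM, and transcodes only the file's bytes into the unchanged client encoding, so the backend is never asked to switch anything.

```python
import codecs


def read_script_locally(raw: bytes, client_encoding: str) -> bytes:
    """Return script bytes expressed in client_encoding (hypothetical helper).

    If raw starts with a UTF-8 BOM, it is treated as UTF-8: the BOM is
    dropped and the text is re-encoded locally. Otherwise the bytes are
    passed through untouched, since no other declaration is available.
    """
    if raw.startswith(codecs.BOM_UTF8):
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
        # Raises UnicodeEncodeError if a character has no representation
        # in the client encoding; a real client would have to report that.
        return text.encode(client_encoding)
    return raw


# Usage: a BOM-prefixed UTF-8 script sent over a LATIN1 session.
script = codecs.BOM_UTF8 + "SELECT 'é';".encode("utf-8")
converted = read_script_locally(script, "latin-1")
```

Because the change is local to the file's data, a ROLLBACK, a failed transaction, or a \c in the file cannot undo it, and notices and errors still arrive in the session's original encoding.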
        regards, tom lane

