Re: BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table

From: Tom Lane
Subject: Re: BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table
Msg-id: 1163.1458594276@sss.pgh.pa.us
In reply to: Re: BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table  (Reece Pegues <RPegues@tripwire.com>)
Responses: Re: BUG #14038: substring cuts unicode char in half, allowing to save broken utf8 into table  (Reece Pegues <RPegues@tripwire.com>)
List: pgsql-bugs

Reece Pegues <RPegues@tripwire.com> writes:
> Looks like the database is created with ENCODING = 'SQL_ASCII'

Basically, what that does is defeat all encoding checks inside the
backend; it'll store whatever bytes you give it.  So yeah, substring()
is expected to deal in bytes, not characters, in this encoding.
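
To make the byte-versus-character distinction concrete, here is a small Python sketch (not PostgreSQL code). The helper names byte_substring, char_substring and is_valid_utf8 are made up for illustration; they only mimic a 1-based substring that counts bytes versus one that counts characters.

```python
def byte_substring(data: bytes, start: int, length: int) -> bytes:
    """1-based substring that counts bytes (hypothetical helper)."""
    return data[start - 1:start - 1 + length]


def char_substring(text: str, start: int, length: int) -> str:
    """1-based substring that counts characters (hypothetical helper)."""
    return text[start - 1:start - 1 + length]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


text = "na\u00efve"            # the third character occupies two bytes in UTF-8
raw = text.encode("utf-8")     # b'na\xc3\xafve'

byte_cut = byte_substring(raw, 1, 3)    # stops in the middle of the two-byte character
char_cut = char_substring(text, 1, 3)   # keeps the whole character

print(byte_cut, is_valid_utf8(byte_cut))
print(char_cut.encode("utf-8"), is_valid_utf8(char_cut.encode("utf-8")))
```

The byte-counting version returns a prefix that ends with a lone lead byte, which is the kind of broken UTF-8 the report describes.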

> So I assume it was thus saving the data that way, and then if the client
> encoding is utf8 it tried to encode to that and failed?

If the client declares its encoding, the backend will verify correct
encoding before transmitting data; but if the database encoding is
SQL_ASCII then no actual conversion happens, only a validity check at
transmit/receive.
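
A rough Python sketch of "validity check, no conversion" might look like the following. This is an assumption-laden illustration, not the backend's logic: send_to_client is a hypothetical function, and the error text is invented rather than copied from PostgreSQL.

```python
def send_to_client(stored: bytes, client_encoding: str) -> bytes:
    """Hypothetical model: check the stored bytes against the client's
    declared encoding, then pass them through unchanged."""
    try:
        stored.decode(client_encoding)   # validity check only; result is discarded
    except UnicodeDecodeError as exc:
        # Invented wording, for illustration only.
        raise ValueError(
            f"stored bytes are not valid {client_encoding}"
        ) from exc
    return stored                        # no conversion: same bytes go out


good = b"na\xc3\xaf"   # complete two-byte character at the end
bad = b"na\xc3"        # lead byte with its continuation byte cut off

print(send_to_client(good, "utf-8"))

try:
    send_to_client(bad, "utf-8")
except ValueError as err:
    print("rejected:", err)
```

Under this model the truncated value can be stored without complaint and only fails once a client that declared UTF-8 asks for it, which matches the behaviour described above.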

            regards, tom lane
