Re: [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding

From	Tom Lane
Subject	Re: [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding
Date	19 February 2014, 00:49:10
Msg-id	12131.1392760137@sss.pgh.pa.us
In reply to	Re: [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding (Tom Lane <tgl@sss.pgh.pa.us>)
Replies	Re: [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding (Tom Lane <tgl@sss.pgh.pa.us>) Re: [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

I wrote:
> digoal@126.com writes:
>> select t, t::bytea from convert_from('\xeec1', 'sql_ascii') as g(t);
>> [ fails to check that string is valid in database encoding ]

> Hm, yeah.  Normal input to the database goes through pg_any_to_server(),
> which will apply a validation step if the source encoding is SQL_ASCII
> and the destination encoding is something else.  However, pg_convert and
> some other places call pg_do_encoding_conversion() directly, and that
> function will just quietly do nothing if either encoding is SQL_ASCII.

> The minimum-refactoring solution to this would be to tweak
> pg_do_encoding_conversion() so that if the src_encoding is SQL_ASCII but
> the dest_encoding isn't, it does pg_verify_mbstr() rather than nothing.

> I'm not sure if this would break anything we need to have work,
> though.  Thoughts?  Do we want to back-patch such a change?

I looked through all the callers of pg_do_encoding_conversion(), and
AFAICS this change is a good idea.  There are a whole bunch of places
that use pg_do_encoding_conversion() to convert from the database encoding
to encoding X (most usually UTF8), and right now if you do that in a
SQL_ASCII database you have no assurance whatever that what is produced
is actually valid in encoding X.  I think we need to close that loophole.
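
To make the proposed control flow concrete, here is a minimal standalone sketch. It is not the PostgreSQL source: toy_encoding, toy_result, toy_verify() and toy_do_encoding_conversion() are hypothetical stand-ins for the real encoding IDs, pg_verify_mbstr() and pg_do_encoding_conversion(), and the UTF-8 check is deliberately simplified (it checks byte structure only, not overlong forms or surrogates).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's definitions. */
typedef enum { ENC_SQL_ASCII, ENC_UTF8, ENC_LATIN1 } toy_encoding;
typedef enum { CONV_NOOP, CONV_VERIFIED, CONV_INVALID, CONV_CONVERTED } toy_result;

/* Simplified validity check: structural UTF-8 test, anything goes otherwise. */
static bool
toy_verify(toy_encoding enc, const unsigned char *s, size_t len)
{
    size_t i = 0;

    if (enc != ENC_UTF8)
        return true;            /* assumption: single-byte encodings accept any byte */

    while (i < len)
    {
        unsigned char c = s[i];
        size_t need;            /* number of continuation bytes expected */

        if (c < 0x80)
            need = 0;
        else if ((c & 0xE0) == 0xC0)
            need = 1;
        else if ((c & 0xF0) == 0xE0)
            need = 2;
        else if ((c & 0xF8) == 0xF0)
            need = 3;
        else
            return false;       /* stray continuation byte or invalid lead byte */

        if (need > len - i - 1)
            return false;       /* sequence truncated at end of string */
        for (size_t k = 1; k <= need; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* not a 10xxxxxx continuation byte */
        i += need + 1;
    }
    return true;
}

/* Sketch of the tweak: verify instead of doing nothing when src is SQL_ASCII. */
static toy_result
toy_do_encoding_conversion(const unsigned char *src, size_t len,
                           toy_encoding src_enc, toy_encoding dest_enc)
{
    assert(src != NULL);

    if (src_enc == dest_enc)
        return CONV_NOOP;       /* same encoding: still a no-op */
    if (src_enc == ENC_SQL_ASCII)
        return toy_verify(dest_enc, src, len) ? CONV_VERIFIED : CONV_INVALID;
    if (dest_enc == ENC_SQL_ASCII)
        return CONV_NOOP;       /* unchanged: nothing to check against */
    return CONV_CONVERTED;      /* real conversion is out of scope for this sketch */
}
```

With this sketch, the two bytes from the bug report (0xEE 0xC1) are flagged as invalid when the source is SQL_ASCII and the destination is UTF8, whereas before the change they would have passed through untouched.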

I found one place --- utf_u2e() in plperl_helpers.h --- that is aware of
the lack of checking and forces a pg_verify_mbstr call for itself; but
it apparently is concerned about whether the source data is actually utf8
in the first place, which I think is not really
pg_do_encoding_conversion's bailiwick.  I'm okay with
pg_do_encoding_conversion being a no-op if src_encoding == dest_encoding.

Barring objections, I will fix and back-patch this.
        regards, tom lane
