BUG #5532: Valid UTF8 sequence errors as invalid

Поиск
Список
Период
Сортировка
От Michael Lewis
Тема BUG #5532: Valid UTF8 sequence errors as invalid
Дата
Msg-id 201006300842.o5U8gPHY060899@wwwmaster.postgresql.org
обсуждение исходный текст
Ответы Re: BUG #5532: Valid UTF8 sequence errors as invalid
Список pgsql-bugs
The following bug has been logged online:

Bug reference:      5532
Logged by:          Michael Lewis
Email address:      mikelikespie@gmail.com
PostgreSQL version: 9.0 trunk
Operating system:   OS X
Description:        Valid UTF8 sequence errors as invalid
Details:

I'm using Python to sanitize my logs from invalid UTF8 characters before
COPYing them into postgres.  I came across this one sequence that seems to
be valid UTF8 (in the extended range I believe).

It goes through both pythons encoding as well as iconv without an error and
is valid as far as my understanding of UTF8 goes so I am assuming it is a
bug.

Test case:

create table t (v varchar);
insert into t values (E'\xed\xbc\xad');


In bash you can do:

echo -e "\xed\xbc\xad" | iconv -f UTF-8 ; echo $?

to validate it


Thanks,
Mike

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Marcel Asio
Дата:
Сообщение: Re: Function works in 8.4 but not in 9.0 beta2 "ERROR: structure of query does not match function result type"
Следующее
От: "Ola Sergatchov"
Дата:
Сообщение: BUG #5531: REGEXP_ REPLACE causes connection drop