Re: invalid byte sequence for encoding "UTF8": 0xf481 - how could this happen?

From: Albe Laurenz
Subject: Re: invalid byte sequence for encoding "UTF8": 0xf481 - how could this happen?
Msg-id D960CB61B694CF459DCFB4B0128514C207BB6CE6@exadv11.host.magwien.gv.at
In reply to: invalid byte sequence for encoding "UTF8": 0xf481 - how could this happen?  (Rural Hunter <ruralhunter@gmail.com>)
Responses: Re: invalid byte sequence for encoding "UTF8": 0xf481 - how could this happen?  (Rural Hunter <ruralhunter@gmail.com>)
List: pgsql-admin

Rural Hunter wrote:
> My db is in utf-8, I have a row in my table say tmp_article and I wanted
> to generate ts_vector from the article content:
> select to_tsvector(content) from tmp_article;
> But I got this error:
> ERROR:  invalid byte sequence for encoding "UTF8": 0xf481
>
> I am wondering how this could happen. I think if there was invalid UTF8
> bytes in the content, it shouldn't have been able to inserted into the
> tmp_article table as I sometimes see similar errors when inserting
> records to tmp_article. Am I right?

You are right in theory.  A lot depends on your PostgreSQL version,
because the efforts to prevent invalid strings from entering the
database have led to changes over the versions.  Older versions are
more permissive.

To test the theory that the contents of the table are bad, you can
check whether the same error occurs when you run

SELECT convert_to(content, 'UTF8') FROM tmp_article;
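
As background on the reported bytes: in UTF-8, a lead byte of 0xF4
opens a four-byte sequence, so the pair 0xF4 0x81 on its own is
incomplete. The Python sketch below is only an illustration outside the
database and is not part of the original advice. The helper name
first_invalid_utf8 is hypothetical, and it assumes you have the raw
bytes of the column at hand (for example from a dump file).

```python
def first_invalid_utf8(data: bytes):
    """Return (offset, offending_bytes) for the first invalid UTF-8
    sequence in data, or None if data decodes cleanly."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start and exc.end delimit the bytes the decoder rejected.
        return exc.start, data[exc.start:exc.end]
    return None


# The two bytes from the error message: 0xF4 announces a four-byte
# sequence, but only one continuation byte follows.
print(first_invalid_utf8(b"\xf4\x81"))

# The same lead byte followed by three continuation bytes is well formed.
print(first_invalid_utf8(b"\xf4\x81\x80\x80"))
```

A check like this only shows where the byte stream stops being valid
UTF-8. It does not explain how the bytes got into the table.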

Yours,
Laurenz Albe
