Re: pg_dump, pg_restore and UTF8: invalid byte sequence

From
Subject Re: pg_dump, pg_restore and UTF8: invalid byte sequence
Date
Msg-id 054401c6f19b$8e5c8210$6501a8c0@iwing
In reply to pg_dump, pg_restore and UTF8: invalid byte sequence  (<me@alternize.com>)
List pgsql-novice
> shouldn't pg_dump encode the utf8 bytesequences?

at least i found out why the invalid unicode sequences appear in the first
place: tsearch2 in 8.1 doesn't handle utf8 characters properly: a
character's 2-byte representation is converted to lowercase byte for byte.
for example: "ä", which is encoded in utf8 as the bytes 0xC3 0xA4 ("Ã¤" when
read as latin1), is written to the db by tsearch2 as 0xE3 0xA4 ("ã¤"), which
is an invalid utf8 byte sequence.
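a small python sketch of the effect described above, for anyone who wants to see it outside the database. this is not tsearch2's code: it only assumes that the lowercasing treats each byte as a separate latin1 character, and the helper names `bytewise_lower` and `is_valid_utf8` are made up for the illustration.

```python
def bytewise_lower(data: bytes, encoding: str = "latin-1") -> bytes:
    """Lowercase every byte on its own, as a single-byte locale would.

    Assumption: the lowercasing routine is unaware of multi-byte
    characters and interprets each byte as one Latin-1 character.
    """
    return data.decode(encoding).lower().encode(encoding)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


original = "ä".encode("utf-8")      # two bytes: 0xC3 0xA4
mangled = bytewise_lower(original)  # 0xC3 ("Ã") becomes 0xE3 ("ã")

print(original.hex(), is_valid_utf8(original))
print(mangled.hex(), is_valid_utf8(mangled))
```

the second byte (0xA4) has no lowercase form in latin1 and stays unchanged, but the first byte turns into 0xE3, which in utf8 announces a 3-byte sequence. only one continuation byte follows, so the result no longer decodes.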

stripping the ts2 index column before dumping fixes the encoding problems. i
guess the 8.2 -> 8.1.5 backport should fix it as well, i'll try asap.

> also, regarding pg_restore, its quite troubling it has the same
> parameter-set as pg_dump

never mind this, it is too late in the evening 8-)

- thomas


