Latin1 to UTF-8 ?

From Aarni Ruuhimäki
Subject Latin1 to UTF-8 ?
Date
Msg-id 200708031537.20276.aarni@kymi.com
Replies Re: Latin1 to UTF-8 ?
List pgsql-general
Hi,

I've set up a new CentOS server with PostgreSQL 8.2.4 and initdb'ed it with
UTF-8.

OK, and it runs fine.

I have a problem with encodings, however, mainly with the Russian Cyrillic
characters.

When I test-dumped some dbs from the old FC / Pg 8.0.2, all Latin1, I noticed
that some of the dumps show in the Konqueror file browser as 'Plain Text
Documents' and some as 'C++ Source Files'. Both have Latin1 as client
encoding at the top of the files. Changing that gives errors, as expected.

Looking into the plain text dumps I see all Cyrillic characters as Р...
and these go in and display fine from the new server's UTF-8 environment.

Some of the 'C++' files have the Cyrillic characters as 'îñåòèòåëåé'. Some
have both 'îñåòèòåëåé' and Р... and of course the 'îñåò' characters come out
wrong and unreadable in the browser. (Not sure if you can see the single-quoted
ones, but they look something like Hebrew or similar.)
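
As an illustration of the symptom above: the single-quoted sample is consistent with Windows-1251 (cp1251) bytes being displayed as Latin1. That source encoding is an assumption, since the message does not say how the data was entered, and the function name below is hypothetical. A minimal Python sketch of the round trip:

```python
def undo_latin1_view(text: str, real_encoding: str = "cp1251") -> str:
    """Re-interpret text that was shown as Latin1 but was really in real_encoding.

    real_encoding="cp1251" is an assumption, not something the message confirms.
    """
    raw = text.encode("latin-1")      # recover the original bytes one-to-one
    return raw.decode(real_encoding)  # decode them with the assumed encoding


sample = "îñåòèòåëåé"                 # the string quoted in the message
print(undo_latin1_view(sample))       # prints readable Cyrillic if the guess is right
```

If the guess is wrong for some rows, the result is still decodable but meaningless, so the output needs to be checked by eye against known words.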

I have no idea what browsers / encodings or even keyboard layouts were
used when the data was inserted by users through their web
interfaces ...

I tried the -F p switch as the earlier version has no -E for dumps. Same
output. Also with pg_dumpall.

I tried various encodings with iconv too.

So, what would be the proper way to convert the dumps to UTF-8 ? Or any other
solution ? Any other tool to work with the problem files ?
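
One possible approach, sketched here under stated assumptions rather than as a confirmed fix: treat the bytes of a problem dump as cp1251, re-encode them as UTF-8, and rewrite the client encoding line at the top. The function name, the cp1251 guess, and the exact `SET client_encoding = '...';` line format are all assumptions. A dump that mixes real Latin1 text (for example Finnish letters) with cp1251 bytes would not convert cleanly this way.

```python
import re


def convert_dump(data: bytes, source_encoding: str = "cp1251") -> bytes:
    """Return the dump re-encoded as UTF-8 (hypothetical helper).

    source_encoding is a guess about what the bytes really are.
    """
    text = data.decode(source_encoding)
    # Point the first client_encoding setting at UTF8 so that the
    # declaration matches the re-encoded bytes (assumed line format).
    text = re.sub(
        r"(?im)^(SET client_encoding = ')[^']*(';)",
        r"\1UTF8\2",
        text,
        count=1,
    )
    return text.encode("utf-8")
```

For a file, the bytes would be read with `open(path, "rb")` and the result written with `open(new_path, "wb")`; both paths are placeholders.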

BR,

Aarni
--
Aarni Ruuhimäki

