Re: change database encoding without corrupting data (latin9 to utf8)

From: Michael Fuhr
Subject: Re: change database encoding without corrupting data (latin9 to utf8)
Msg-id: 20070519212338.GA65421@winnie.fuhr.org
In reply to: change database encoding without corrupting data (latin9 to utf8)  (filippo <filippo2991@virgilio.it>)
List: pgsql-general

On Fri, May 18, 2007 at 02:46:26AM -0700, filippo wrote:
> I have a problem entering data into a postgres database (latin9) from my
> perl/tk application running on windows (utf8). Whenever I try to enter
> letters with accents, they look corrupted once stored in the database.
>
> A workaround is to set client encoding to UTF8 after creating the
> database connection:
>
> $dbh->do(qq/SET client_encoding to 'UTF8'/);

"Workaround" has a negative connotation that's perhaps misused in
this case because setting client_encoding is the proper way of
telling the database what the client's encoding is.  If the connecting
role will always use UTF8 then you could use ALTER ROLE (or ALTER
USER in 8.0 and earlier) to automatically set client_encoding for
every connection:

ALTER ROLE rolename SET client_encoding TO 'UTF8';
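
To see why the accents look corrupted when client_encoding is left at
LATIN9, here is a minimal sketch in Python (not part of the original
setup, which uses Perl). It only simulates the mismatch: the client
sends UTF-8 bytes and the server interprets each byte as one LATIN9
character. The function name is made up for illustration.

```python
# Python's codec name for LATIN9 (ISO 8859-15).
LATIN9 = "iso8859_15"


def as_seen_by_latin9_server(text: str) -> str:
    """Simulate UTF-8 bytes being read as if they were LATIN9.

    This mimics a client that sends UTF-8 while the server still
    believes the client encoding is LATIN9.
    """
    return text.encode("utf-8").decode(LATIN9)


print(as_seen_by_latin9_server("é"))  # prints: Ã©
```

Once client_encoding is set to UTF8, the server converts the incoming
text to LATIN9 itself, so this misreading no longer happens.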

> To avoid this kind of workaround I'd like to convert the whole
> database from LATIN9 to UTF8. How can I do it without corrupting the
> data?

If all of the data is uncorrupted LATIN9 then you could use pg_dump
to dump the LATIN9 database and then restore it into a UTF8 database.
But if you have a mix of uncorrupted and corrupted characters (UTF8
byte sequences stored as LATIN9) then you have a bit of a problem
because some data needs to be converted from LATIN9 to UTF8 but
other data is already UTF8 and shouldn't be converted.
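
For the mixed case, one possible heuristic (an assumption on my part,
not a guaranteed fix) is to test each value: turn it back into LATIN9
bytes and check whether those bytes form valid non-ASCII UTF-8. If
they do, the value was probably UTF-8 stored as LATIN9. The sketch
below is in Python and the name repair_value is hypothetical. It
assumes the values have already been read from the LATIN9 database as
text.

```python
# Python's codec name for LATIN9 (ISO 8859-15).
LATIN9 = "iso8859_15"


def repair_value(text: str) -> str:
    """Undo "UTF-8 stored as LATIN9" corruption when it is detected.

    Heuristic only: a genuine LATIN9 string whose bytes happen to be
    valid UTF-8 would be changed as well, so review doubtful rows.
    """
    try:
        raw = text.encode(LATIN9)
    except UnicodeEncodeError:
        # Contains characters outside LATIN9; leave it untouched.
        return text
    if raw.isascii():
        # Plain ASCII is identical in both encodings.
        return text
    try:
        # Valid UTF-8 here suggests the value was stored incorrectly.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8, so treat it as uncorrupted LATIN9.
        return text


print(repair_value("cafÃ©"))  # corrupted value, prints: café
print(repair_value("café"))   # uncorrupted value, prints: café
```

A function like this could be applied to the affected columns before
dumping, so that the dump contains only uncorrupted LATIN9 and can be
restored into the UTF8 database in the usual way.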

--
Michael Fuhr
