Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows

From Meetesh Karia
Subject Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows
Msg-id 47D7DC8C.7000204@gmail.com
In response to Encoding problems with migration from 8.0.14 to 8.3.0 on Windows  (Meetesh Karia <meetesh.karia@gmail.com>)
Responses Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows  (Robert Treat <xzilla@users.sourceforge.net>)
List pgsql-admin

One quick addition to this:

The column I'm creating this unique index on is a varchar(255) and the command I was running was:

create unique index foo_name on foo (name);

If I use the following, it now works:

create unique index foo_name on foo (cast(name as bytea));

Thoughts?

Meetesh

Meetesh Karia wrote:
Hi all,

I'm trying to migrate from 8.0.14 on Windows (Vista Home Premium) to 8.3.0, and I've been trying to solve what appears to be an encoding problem. My old db used the UNICODE encoding. I know that this isn't supported on Windows in 8.0.x, but the db was restored from a Linux environment and postgres didn't appear to have any problems with it.

My 8.3 server and client encodings are UTF8, and I used pg_dumpall (I tried both the 8.0 and 8.3 versions) to dump the db. However, when I tried to restore it, index creation failed: it wouldn't let me create a unique index on a column whose values are all unique (the column had this index in 8.0, and a GROUP BY ... HAVING query run with no indexes on the table confirms uniqueness). What this column does have, however, is pairs of values like:

'Bruehl'
'Brühl'

I created a blank table with the unique index on it and inserted rows one at a time until I confirmed that the above values were the ones causing the problem. Running the following query shows the difference in the hex-encoded values (I changed my client encoding to WIN1250 to get the output below to display correctly):

select name, encode(decode(name, 'escape'), 'hex') from ...

     name      |           encode
---------------+----------------------------
 Daniel Brühl  | 44616e69656c204272c3bc686c
 Daniel Bruehl | 44616e69656c2042727565686c
(2 rows)
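
For readers who want to check the hex output independently of the database, here is a minimal Python sketch. It assumes the stored bytes are UTF-8 (which matches the server encoding described above); the helper name `decode_utf8_hex` is made up for this illustration and is not part of PostgreSQL.

```python
# Hex strings copied verbatim from the query output above.
HEX_VALUES = [
    "44616e69656c204272c3bc686c",
    "44616e69656c2042727565686c",
]


def decode_utf8_hex(hex_string: str) -> str:
    """Turn a hex dump back into text, assuming the bytes are UTF-8."""
    return bytes.fromhex(hex_string).decode("utf-8")


if __name__ == "__main__":
    for hex_string in HEX_VALUES:
        raw = bytes.fromhex(hex_string)
        # Show the byte count next to the decoded text: the two names
        # differ in their bytes even though they are the same length.
        print(f"{hex_string} -> {decode_utf8_hex(hex_string)!r} ({len(raw)} bytes)")
```

Both values decode cleanly as UTF-8, and the only difference is the byte pair `c3 bc` (the UTF-8 form of `ü`) versus `75 65` (`ue`), so the two rows are distinct at the byte level.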

I've also tried exporting with the encoding set to WIN1250, but I get errors like this:

pg_dump: Error message from server: ERROR:  character 0xc383 of encoding "UNICODE" has no equivalent in "WIN1250"
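
The byte sequence named in that error can be examined outside the database. The sketch below assumes Python's `cp1250` codec is a fair stand-in for PostgreSQL's WIN1250; the `describe` helper is hypothetical and only illustrates the idea.

```python
def describe(raw: bytes, target: str = "cp1250") -> str:
    """Decode UTF-8 bytes and report whether the text fits in `target`."""
    text = raw.decode("utf-8")
    try:
        text.encode(target)
    except UnicodeEncodeError:
        return f"{text!r} has no equivalent in {target}"
    return f"{text!r} is representable in {target}"


if __name__ == "__main__":
    # 0xc383 is the sequence from the pg_dump error message.
    print(describe(b"\xc3\x83"))
    # 0xc3bc is the sequence for the u-umlaut seen in the hex output earlier.
    print(describe(b"\xc3\xbc"))
```

Under that assumption, `0xc383` decodes to `Ã`, which code page 1250 does not contain, while `0xc3bc` (`ü`) converts without trouble. So the failing character is not the `ü` from the example rows. One possible reading, not confirmed anywhere in this thread, is that some rows hold text that was UTF-8 encoded twice.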

Anyone have any thoughts or suggestions?  Why would the index creation fail?  Is there a workaround?

Thanks,
Meetesh
