Thread: Problems importing Unicode

Problems importing Unicode

From

matthias@cmklein.de

Date:

17 November 2004, 03:58:47

I have batch files with entries such as

INSERT INTO country VALUES (248,'ALA','AX','Åland Islands');
INSERT INTO country VALUES (384,'CIV','CI','Côte d\'Ivoire');

I tried to execute them using "pgsql \i filename.sql"

Unfortunately, I keep getting an error message:
"ERROR:  invalid byte sequence for encoding "UNICODE": 0xc56c"

How can that be possible?
My database is set to encoding "UNICODE" and so are the batchfiles.

Why does that not work?

Thanks

Matt

Re: Problems importing Unicode

From

Tatsuo Ishii

Date:

17 November 2004, 04:25:37

> I have batch files with entries such as
>
> INSERT INTO country VALUES (248,'ALA','AX','Åland Islands');
> INSERT INTO country VALUES (384,'CIV','CI','Côte d\'Ivoire');
>
> I tried to execute them using "pgsql \i filename.sql"
>
> Unfortunately, I keep getting an error message:
> "ERROR:  invalid byte sequence for encoding "UNICODE": 0xc56c"
>
> How can that be possible?
> My database is set to encoding "UNICODE" and so are the batchfiles.
>
> Why does that not work?

I bet your batch file is not encoded in UNICODE (UTF-8).
--
Tatsuo Ishii
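
The byte pair in the error message can be checked directly. The following is a minimal sketch, not part of the original reply: it shows that 0xC5 followed by 0x6C cannot be valid UTF-8, because 0xC5 opens a two-byte sequence and the next byte would have to lie in the range 0x80-0xBF, whereas 0x6C is a plain ASCII "l". Read as a single-byte encoding such as Latin-1, the same two bytes spell "Ål", which matches the start of "Åland". The helper name is_valid_utf8 is made up for this illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The two bytes reported in the error message: 0xc56c.
reported = bytes([0xC5, 0x6C])

print(is_valid_utf8(reported))                  # False
print(reported.decode("latin-1"))               # Ål
print(is_valid_utf8("Åland".encode("utf-8")))   # True
print("Å".encode("utf-8").hex())                # c385
```

In real UTF-8 the letter "Å" is stored as the two bytes 0xC3 0x85, so a file containing a bare 0xC5 before "land" was most likely written in a single-byte encoding.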

Re: Problems importing Unicode

From

matthias@cmklein.de

Date:

17 November 2004, 07:44:06

Well, they were generated by MySQL and I can open them with e.g. the
Windows Editor Notepad. But I don't know if they are actually encoded in
UNICODE.
Since I can open the file with Notepad and read the statements, I assume,
it is not UNICODE. They look just like in the email below.

The problem is apparently those characters Å or ô and I really would like
to know how to import those files into PostgreSQL 8.0.0

Is there a switch I can use to do a codepage / encoding translation?

Why are MS Access or even MySQL able to read those files without trouble
but PostgreSQL reports an error?

Thanks

Matt



--- Original Message ---
Date: 17.11.2004 02:25
From: Tatsuo Ishii <t-ishii@sra.co.jp>
To: matthias@cmklein.de
Subject: Re: [GENERAL] Problems importing Unicode

> > I have batch files with entries such as
> >
> > INSERT INTO country VALUES (248,'ALA','AX','Åland Islands');
> > INSERT INTO country VALUES (384,'CIV','CI','Côte d\'Ivoire');
> >
> > I tried to execute them using "pgsql \i filename.sql"
> >
> > Unfortunately, I keep getting an error message:
> > "ERROR:  invalid byte sequence for encoding "UNICODE": 0xc56c"
> >
> > How can that be possible?
> > My database is set to encoding "UNICODE" and so are the batchfiles.
> >
> > Why does that not work?
>
> I bet your batch file is not encoded in UNICODE (UTF-8).
> --
> Tatsuo Ishii
>

Re: Problems importing Unicode

From

Richard Huxton

Date:

17 November 2004, 12:20:31

matthias@cmklein.de wrote:
> Well, they were generated by MySQL and I can open them with e.g. the
> Windows Editor Notepad. But I don't know if they are actually encoded in
> UNICODE.
> Since I can open the file with Notepad and read the statements, I assume,
> it is not UNICODE. They look just like in the email below.

Probably some WINxxx encoding. I've seen something similar with data
from MS-Access.

> The problem is apparently those characters Å or ô and I really would like
> to know how to import those files into PostgreSQL 8.0.0
>
> Is there a switch I can use to do a codepage / encoding translation?
>
> Why are MS Access or even MySQL able to read those files without trouble
> but PostgreSQL reports an error?

Because they're using the same WIN locale details. What you might want
to try is to set your client encoding at the top of the batch file and
see if PostgreSQL can't convert it for you.

SET CLIENT_ENCODING = WIN1250;

There's a list of encodings PG can convert for you in the manual (see
the chapter "Automatic Character Set Conversion Between Server and
Client" in the Localization section).

--
   Richard Huxton
   Archonet Ltd
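
Which client encoding to declare depends on how the file was actually written, so it can help to look at the same bytes under a few candidates before importing. The sketch below is an illustration only. It assumes the file stores "Å" as the single byte 0xC5 (as the error message suggests) and "ô" as 0xF4, and it uses Python codec names (cp1250, cp1252, latin-1), which are not the same strings as PostgreSQL's encoding names. Whether a given server version accepts the corresponding name should be checked against the manual's list. The function decode_candidates is hypothetical.

```python
def decode_candidates(raw: bytes, codecs: list[str]) -> dict[str, str]:
    """Decode the same bytes under several single-byte codecs."""
    return {name: raw.decode(name) for name in codecs}


# Assumed raw bytes for the two country names from the batch file.
raw = b"\xc5land Islands, C\xf4te d'Ivoire"

for name, text in decode_candidates(raw, ["cp1250", "cp1252", "latin-1"]).items():
    print(f"{name:8} {text}")
```

Under these assumptions the Western European code pages (cp1252, latin-1) give "Åland", while the Central European cp1250 maps 0xC5 to "Ĺ" instead. A successful import therefore still needs a look at the stored names to confirm that the declared encoding was the right one.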

Re: Problems importing Unicode

From

"Magnus Hagander"

Date:

17 November 2004, 14:53:09

> Well, they were generated by MySQL and I can open them with
> e.g. the Windows Editor Notepad. But I don't know if they are
> actually encoded in UNICODE.
> Since I can open the file with Notepad and read the
> statements, I assume, it is not UNICODE. They look just like
> in the email below.

Windows Notepad handles Unicode just fine, both UTF-16 (labeled Unicode
in notepad) and UTF-8 (labeled UTF-8).
To test, open the file in Notepad, then do "File->Save As". The
"Encoding" dropdown box will default to whatever Notepad detected when
it opened the file. If it's UTF-16 and you need UTF-8, just change the
encoding and save under a different name.

//Magnus
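
The same re-encoding can be scripted instead of using Notepad's "Save As" dialog. This is a rough sketch under stated assumptions: the source encoding has to be known or guessed by the caller, and the function name transcode_to_utf8 is invented for this example. The file is read and written as raw bytes so that line endings are left untouched.

```python
from pathlib import Path


def transcode_to_utf8(src: str, dst: str, src_encoding: str) -> None:
    """Rewrite the file at src as UTF-8 in dst, leaving src unchanged.

    Raises UnicodeDecodeError if src is not valid in src_encoding.
    """
    text = Path(src).read_bytes().decode(src_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

A call might look like transcode_to_utf8("filename.sql", "filename.utf8.sql", "cp1252"), where the output name and the cp1252 guess are assumptions. For a file Notepad labels "Unicode", "utf-16" would be the codec to try, and Python's utf-16 codec consumes a leading byte order mark when one is present.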
