Re: invalid byte sequence for encoding "UTF8"

From Albe Laurenz
Subject Re: invalid byte sequence for encoding "UTF8"
Msg-id D960CB61B694CF459DCFB4B0128514C2A0B0EC@exadv11.host.magwien.gv.at
In reply to invalid byte sequence for encoding "UTF8"  (Glyn Astill <glynastill@yahoo.co.uk>)
List pgsql-general
Glyn Astill wrote:
> I've set up a postgres 8.2 server and have a database set up with UTF8
> encoding. I intend to read some of our legacy data into the table;
> this legacy data is in ASCII format, and as far as I know is 8 bit
> ASCII.
>
> We have a migration tool from mertechdata.com to convert these files
> that are in a DataFlex format into our postgres tables.

In which format are the data? Text files? SQL statements?
Something binary?

> Some files convert over okay, and some come up with the error message
> 'invalid byte sequence for encoding "UTF8"'. The files that come up
> with the error are created correctly and so are their indexes, but as
> soon as we come to insert the data we get this error.

Well, so you claim, but can you prove it?
Do you use a PostgreSQL utility to import the data?
If yes, which tool? What is the exact command line?

> Does anyone know why we're getting this error message? And is there
> a way to suppress it, or can we get around it using another format?

By "format" I believe that you mean "encoding".
It does not matter what encoding you use as long as the data can
be represented in it, you tell PostgreSQL what the encoding is, and
the data are correct.

There is no advantage of one encoding over the other in this respect.
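
To illustrate the point about declaring the right encoding, here is a
minimal Python sketch. It is not from this thread: the sample bytes are
hypothetical, and LATIN1 as the source encoding is purely an assumption,
since the real encoding of the DataFlex files is not known here. It shows
that an 8-bit byte can be invalid as UTF-8 and becomes valid only after it
is converted from its true encoding.

```python
# Hypothetical sample: the word "Café" as ISO-8859-1 (LATIN1) bytes.
legacy = b"Caf\xe9"


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Re-encode bytes from a known source encoding into UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# The source encoding is an assumption; it must match the real data.
converted = to_utf8(legacy, "latin-1")

print(is_valid_utf8(legacy))     # the raw legacy bytes
print(is_valid_utf8(converted))  # the same text after conversion
```

Instead of converting the files yourself, you can also let the server do
the conversion by setting client_encoding to the encoding the data is
really in before inserting it.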

> Our migration utility does ask us to select the correct encoding for
> our database, and we select UTF8 but we still get the error. What do
> you guys think? Possibly the migration tool's fault?

If PostgreSQL says that the data is not UTF-8, we tend to believe it.

To say more, one would need more information.
Can you identify the string about which PostgreSQL complains?
What does it look like?
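
One way to find the offending strings, sketched in Python under the
assumption that the data can be exported as line-oriented text (which this
thread has not established). The function name and the sample lines are
hypothetical. It reports each line that is not well-formed UTF-8, together
with the byte offset where decoding fails.

```python
from typing import Iterable, List, Tuple


def find_invalid_utf8(lines: Iterable[bytes]) -> List[Tuple[int, int, bytes]]:
    """Return (line number, byte offset, raw line) for each line
    that cannot be decoded as UTF-8."""
    problems = []
    for lineno, raw in enumerate(lines, start=1):
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first undecodable byte.
            problems.append((lineno, exc.start, raw))
    return problems


# Hypothetical sample: line 2 holds a LATIN1 byte, line 3 is valid UTF-8.
sample = [b"plain ascii\n", b"M\xfcller\n", b"Z\xc3\xbcrich\n"]

for lineno, offset, raw in find_invalid_utf8(sample):
    print(f"line {lineno}, byte offset {offset}: {raw!r}")
```

For a real file, open it in binary mode and pass the file object to the
function, so that Python does not attempt any decoding of its own.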

> I thought we may be able to get around it using SQL_ASCII encoding -
> but it's only 7 bit, so would we lose some data? Also our conversion
> utility doesn't have the option to use SQL_ASCII.

If you use SQL_ASCII you may succeed in getting the incorrect data into
the database, but that will not make you happy because the data will
not stop being incorrect just because they are in the database.

Yours,
Laurenz Albe
