Re: encoding and LC_COLLATE

From LPlateAndy
Subject Re: encoding and LC_COLLATE
Msg-id 002601cca3b2$7622cc80$62686580$@co.uk
In response to Re: encoding and LC_COLLATE  ("Mark Watson" <mark.watson@jurisconcept.ca>)
Responses Re: encoding and LC_COLLATE  (LPlateAndy <andy@centremaps.co.uk>)
List pgsql-general

Hi Mark (and Adrian),

 

As an update, I've now found that the same data also fails on my Postgres 8, which doesn't seem to have the LC_COLLATE etc. settings and is just UTF-8, so I guess there is possibly just something about the way the data is getting passed in.

 

This is the error message from postgres 9.0 with the LC_COLLATE as previously described:

 

===============================================

 

ERROR:  invalid byte sequence for encoding "UTF8": 0xe92922
CONTEXT:  COPY pointsofinterest, line 2

 


********** Error **********

 

ERROR: invalid byte sequence for encoding "UTF8": 0xe92922
SQL state: 22021
Context: COPY pointsofinterest, line 2

===============================================

 

 

 

This is the error message from Postgres 8.1 with just UTF-8 set:

 

===============================================

 


ERROR:  invalid UTF-8 byte sequence detected near byte 0xe9
CONTEXT:  COPY pointsofinterest, line 2, column street_name: "Near Café)"

 

===============================================

 

 

Does that help? Is there an easy way to check exactly what encoding an existing piece of data is in?
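
One rough way to check, as a sketch rather than a definitive method: take the raw bytes and see which candidate encodings can decode them. The helper name `try_decodings` and the list of candidates are illustrative assumptions. The sample bytes are also an assumption: the street_name text from the 8.1 error followed by a double quote, which would match the 0xe92922 reported by 9.0.

```python
def try_decodings(raw, encodings=("utf-8", "cp1252", "cp850")):
    """Map each candidate encoding to the decoded text, or None if invalid."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Assumed sample: e acute stored as the single byte 0xe9, then ')' and '"'.
sample = b'Near Caf\xe9)"'

for name, text in try_decodings(sample).items():
    # ascii() keeps the output printable on any console.
    print(name, "->", "invalid" if text is None else ascii(text))
```

A clean decode only shows that the bytes are valid in that encoding, not that it is the right one; single-byte code pages accept almost any input, so you still have to judge which result reads sensibly.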

 

Thanks again for your help so far...

 

Andy

 

 

From: Mark Watson-12 [via PostgreSQL] [mailto:[hidden email]]
Sent: 14 November 2011 20:29
To: LPlateAndy
Subject: Re: encoding and LC_COLLATE

 


From: [hidden email]
[mailto:[hidden email]] On behalf of Adrian Klaver
>Sent: 14 November 2011 13:03
>...
>
>Second, is the data coming in actually UTF8 or some other encoding?
>... 

Hi Andy,
I have to agree with Adrian that the data may be coming in under a
different encoding. An e acute is a valid character in Windows-1252
encoding. However, if the source computer is using, for example, code
page 850, an e acute is hex(82), whereas the equivalent in 1252 is hex(e9).
UTF-8 "doesn't like" a lone hex(82), which is only valid there as a
continuation byte.
HTH,
Mark
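
The byte values Mark mentions can be checked with a short sketch. `cp850` and `cp1252` are Python's standard codec names; the helper `is_valid_utf8` is a hypothetical name used only for illustration.

```python
E_ACUTE = "\u00e9"  # the e acute character


def is_valid_utf8(raw):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


cp850_bytes = E_ACUTE.encode("cp850")
cp1252_bytes = E_ACUTE.encode("cp1252")
utf8_bytes = E_ACUTE.encode("utf-8")

print(cp850_bytes.hex(), cp1252_bytes.hex(), utf8_bytes.hex())  # 82 e9 c3a9
print(is_valid_utf8(cp850_bytes), is_valid_utf8(cp1252_bytes))  # False False
```

Neither single-byte form is valid UTF-8 on its own, so a file written in either code page would be rejected by a UTF8 database unless it is converted first.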

