Re: Problem with utf8 encoding

From: Andrew McMillan
Subject: Re: Problem with utf8 encoding
Date:
Msg-id: 1259832914.8024.823.camel@happy.home.mcmillan.net.nz
In response to: Problem with utf8 encoding  (Jorge Miranda Castañeda <jmirandac.85@gmail.com>)
Responses: Re: Problem with utf8 encoding  (Sylvain Racine <syracine@sympatico.ca>)
List: pgsql-php
On Thu, 2009-12-03 at 02:00 -0500, Jorge Miranda Castañeda wrote:
> Hello everyone!
>
>
> I'm working on a project using postgres, propel, and php.
>
>
> My development environment is:
> OS: Windows Vista Business SP2
> Postgres: Postgres v8.4
> Propel: Propel generator/runtime v1.4
> PHP: PHP v5.3
>
>
> Currently I'm struggling with a problem caused by the encoding.
> Every time I try to insert a row into the table CURRENCY, which has ID,
> DESC, and SYMBOL as its columns, I get the following error:
> Unable to execute INSERT statement. [wrapped: SQLSTATE[22021]:
> Character not in repertoire: 7 ERROR: invalid byte sequence for
> encoding "UTF8": 0x80 HINT: This error can also happen if the byte
> sequence does not match the encoding expected by the server, which is
> controlled by "client_encoding".]
>
>
> I've created the database using this statement:
> CREATE DATABASE sbs
>   WITH OWNER = sbsadmin
>        ENCODING = 'UTF8'
>        LC_COLLATE = 'Spanish_Peru.1252'
>        LC_CTYPE = 'Spanish_Peru.1252'
>        CONNECTION LIMIT = -1;

Hola Jorge,

I suspect it's the LC_COLLATE and LC_CTYPE that you have there. I don't
*know* this, but they *look* like some odd sort of collation/ctype
based on the misguided Windows-1252 encoding.  Sadly, Windows sends
data in this encoding through web forms even where the accepted
charset is supposedly only ISO-8859.

In Windows-1252 the Euro currency symbol sits in the 0x80 - 0x9f range;
it is in fact 0x80, the very byte named in your error message.
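
To see why that byte trips the server, here is a small sketch in Python
(chosen only to show the bytes; it is not part of your PHP stack, and
the variable names are just for illustration). It reads 0x80 once as
Windows-1252 and once as UTF-8:

```python
raw = b"\x80"  # the byte PostgreSQL rejected in the error message

# Read as Windows-1252, 0x80 is the Euro sign (U+20AC).
euro = raw.decode("cp1252")

# Read as UTF-8, a lone 0x80 is a continuation byte with no lead byte,
# so a strict decoder refuses it, much as the server did.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# These are the bytes a UTF8 database expects for the same character.
utf8_bytes = euro.encode("utf-8")

print(repr(euro), valid_utf8, utf8_bytes.hex())
```

So a single 1252 byte has to become the three bytes e2 82 ac before a
UTF8 database will accept it.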

I think you would be better off using a consistent locale like
es_PE.UTF-8, though if your data is 1252-encoded then you might need to
run it through iconv first.
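
As a rough illustration of that conversion step, assuming the input
really is Windows-1252 from start to finish, the whole-string re-encode
looks like this in Python (the function name is made up for the
example; this is roughly what `iconv -f CP1252 -t UTF-8` does):

```python
def win1252_to_utf8(raw: bytes) -> bytes:
    """Re-encode bytes that are entirely Windows-1252 as UTF-8."""
    # Python's cp1252 codec raises UnicodeDecodeError on the few bytes
    # that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D).
    return raw.decode("cp1252").encode("utf-8")


symbol = win1252_to_utf8(b"\x80")        # Euro sign, now 3 bytes of UTF-8
plain = win1252_to_utf8(b"S/. 10")       # pure ASCII passes through unchanged

# Caution: running it over data that is ALREADY UTF-8 double-encodes it.
twice = win1252_to_utf8(symbol).decode("utf-8")
```

The last line is the catch: `twice` is three junk characters rather
than a Euro sign, so a blanket conversion is only safe when none of the
input is UTF-8 yet.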

If you have data which is a mix of ISO-8859-1, Windows-1252 and UTF-8
then I can point you at a wee bit of PHP code I wrote which will look at
each character in a string and only iconv from 8859/1252 -> UTF-8 if it
is a high-bit byte which is not part of a valid UTF-8 character already.

The code is here:

 http://repo.or.cz/w/awl.git/blob/HEAD:/inc/AWLUtilities.php

You need both of the last two functions - call the first one during
initialisation, and use the second one to clean the strings.
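
For the idea only, here is a minimal Python sketch of that per-byte
check. It is NOT the PHP code at the link above, the function names are
hypothetical, and the fallback to ISO-8859-1 for bytes that
Windows-1252 leaves undefined is my assumption for the example:

```python
def _utf8_sequence_length(data: bytes, i: int) -> int:
    """Length of a valid UTF-8 multi-byte character at data[i], else 0."""
    first = data[i]
    if 0xC2 <= first <= 0xDF:
        length = 2
    elif 0xE0 <= first <= 0xEF:
        length = 3
    elif 0xF0 <= first <= 0xF4:
        length = 4
    else:
        return 0  # continuation byte or invalid lead byte
    try:
        # Strict decoding rejects truncated, overlong and surrogate forms.
        data[i:i + length].decode("utf-8")
    except UnicodeDecodeError:
        return 0
    return length


def clean_to_utf8(raw: bytes) -> bytes:
    """Convert only the high-bit bytes that are not already valid UTF-8."""
    out = bytearray()
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte < 0x80:  # plain ASCII, identical in all three encodings
            out.append(byte)
            i += 1
            continue
        n = _utf8_sequence_length(raw, i)
        if n:  # already a valid UTF-8 character, copy it untouched
            out += raw[i:i + n]
            i += n
            continue
        try:
            ch = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            ch = bytes([byte]).decode("latin-1")  # assumed fallback
        out += ch.encode("utf-8")
        i += 1
    return bytes(out)


# One 1252 Euro, one UTF-8 Euro, and a trailing 8859-1/1252 e-acute.
mixed = b"5 \x80 and 5 \xe2\x82\xac and caf\xe9"
cleaned = clean_to_utf8(mixed)
print(cleaned.decode("utf-8"))
```

This is a heuristic: a run of 1252 bytes that happens to form a valid
UTF-8 sequence is left alone, so it can guess wrong on unlucky input.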

Cheers,
                    Andrew McMillan.


------------------------------------------------------------------------
andrew (AT) morphoss (DOT) com                            +64(272)DEBIAN
              You will feel hungry again in another hour.
------------------------------------------------------------------------

