Обсуждение: Weird encoding behavior
Hi, I'm experiencing a weird behavior when storing latin characters to a PostgreSQL 8.3.1.876 server. The database is Latin1 encoded, and it is working since September 2008, it wasn't updated nor replaced since its first installation. The weirdness of the problem is that sometimes the characters are stored correctly, but sometimes doesn't (allways by the same program), the field type is Varchar(30), and for example the text "NUÑEZ" is stored as "NU?EZ". The data comes from an external application in an XML file (also Latin1), then, a Delphi service parses the XML and create the Insert/Update statements to store the data in the database. I'd try to reproduce the bug by sending XML files with 'Ñ' to the service, but it is stored correctly. Also, there's a front end that allows users to see/edit the data in a user friendlier way. Again, I checked by inserting records with 'Ñ' using this front-end, and also are stored correctly. Does anyone faced the same problem? any workaround? Thanks in advance, Leonardo
Hola
use Ñ for spanish N
http://webdesign.about.com/od/localization/l/blhtmlcodes-sp.htm
Saludos Cordiales desde EEUU!
Martin
______________________________________________
Disclaimer and confidentiality note
> From: martinrame@yahoo.com
> Subject: [GENERAL] Weird encoding behavior
> Date: Wed, 25 Mar 2009 15:52:55 -0300
> To: pgsql-general@postgresql.org
>
> Hi, I'm experiencing a weird behavior when storing latin characters to a
> PostgreSQL 8.3.1.876 server. The database is Latin1 encoded, and it is
> working since September 2008, it wasn't updated nor replaced since its
> first installation.
>
> The weirdness of the problem is that sometimes the characters are stored
> correctly, but sometimes doesn't (allways by the same program), the
> field type is Varchar(30), and for example the text "NUÑEZ" is stored as
> "NU?EZ".
>
> The data comes from an external application in an XML file (also
> Latin1), then, a Delphi service parses the XML and create the
> Insert/Update statements to store the data in the database. I'd try to
> reproduce the bug by sending XML files with 'Ñ' to the service, but it
> is stored correctly.
>
> Also, there's a front end that allows users to see/edit the data in a
> user friendlier way. Again, I checked by inserting records with 'Ñ'
> using this front-end, and also are stored correctly.
>
> Does anyone faced the same problem? any workaround?
>
> Thanks in advance,
> Leonardo
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Internet Explorer 8 – Now Available. Faster, safer, easier. Download FREE now!
use Ñ for spanish N
http://webdesign.about.com/od/localization/l/blhtmlcodes-sp.htm
Saludos Cordiales desde EEUU!
Martin
______________________________________________
Disclaimer and confidentiality note
This message is confidential and may be privileged. If you are not the intended recipient, we kindly ask you to please inform the sender. Any unauthorised dissemination or copying hereof is prohibited. This message serves for information purposes only and shall not have any legally binding effect. Given that e-mails can easily be subject to manipulation, we can not accept any liability for the content provided.
> From: martinrame@yahoo.com
> Subject: [GENERAL] Weird encoding behavior
> Date: Wed, 25 Mar 2009 15:52:55 -0300
> To: pgsql-general@postgresql.org
>
> Hi, I'm experiencing a weird behavior when storing latin characters to a
> PostgreSQL 8.3.1.876 server. The database is Latin1 encoded, and it is
> working since September 2008, it wasn't updated nor replaced since its
> first installation.
>
> The weirdness of the problem is that sometimes the characters are stored
> correctly, but sometimes doesn't (allways by the same program), the
> field type is Varchar(30), and for example the text "NUÑEZ" is stored as
> "NU?EZ".
>
> The data comes from an external application in an XML file (also
> Latin1), then, a Delphi service parses the XML and create the
> Insert/Update statements to store the data in the database. I'd try to
> reproduce the bug by sending XML files with 'Ñ' to the service, but it
> is stored correctly.
>
> Also, there's a front end that allows users to see/edit the data in a
> user friendlier way. Again, I checked by inserting records with 'Ñ'
> using this front-end, and also are stored correctly.
>
> Does anyone faced the same problem? any workaround?
>
> Thanks in advance,
> Leonardo
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
Internet Explorer 8 – Now Available. Faster, safer, easier. Download FREE now!
Leonardo M. Ramé wrote: > Hi, I'm experiencing a weird behavior when storing latin characters to a > PostgreSQL 8.3.1.876 server. The database is Latin1 encoded, and it is > working since September 2008, it wasn't updated nor replaced since its > first installation. > > The weirdness of the problem is that sometimes the characters are stored > correctly, but sometimes doesn't (allways by the same program), the > field type is Varchar(30), and for example the text "NUÑEZ" is stored as > "NU?EZ". > > The data comes from an external application in an XML file (also > Latin1), then, a Delphi service parses the XML and create the > Insert/Update statements to store the data in the database. I'd try to > reproduce the bug by sending XML files with 'Ñ' to the service, but it > is stored correctly. > > Also, there's a front end that allows users to see/edit the data in a > user friendlier way. Again, I checked by inserting records with 'Ñ' > using this front-end, and also are stored correctly. > > Does anyone faced the same problem? any workaround? Well, there *must* be one client that stores wrong data... As a first step, can you find out the code point of the character that is represented as "?" in your E-Mail? Something like SELECT ascii(substr('NU?EZ', 3, 1)); except that instead of the string literal you substitute the column containing the bad value. Yours, Laurenz Albe
> Well, there *must* be one client that stores wrong data... You are right, I'll ask someone in site to look at *each* client hunting for the root of the problem. It must be a Windows Regional Settings or something similar. > > As a first step, can you find out the code point of the character that > is represented as "?" in your E-Mail? > > Something like > SELECT ascii(substr('NU?EZ', 3, 1)); > except that instead of the string literal you substitute the column > containing the bad value. > > Yours, > Laurenz Albe > I did what you suggested, and it responds with a 63 when the string is "NU?NEZ" and 209 when it's "NUÑEZ". Thanks, Leonardo.
"Leonardo M. Ramé" wrote: > Hi, I'm experiencing a weird behavior when storing latin characters to a > PostgreSQL 8.3.1.876 server. The database is Latin1 encoded, and it is > working since September 2008, it wasn't updated nor replaced since its > first installation. > > The weirdness of the problem is that sometimes the characters are stored > correctly, but sometimes doesn't (allways by the same program), the > field type is Varchar(30), and for example the text "NUÑEZ" is stored as > "NU?EZ". You need to be sure that the client_encoding variable is set to whatever the client is actually using. Some client programs may be using UTF8, and thus they should do "SET client_encoding TO utf8" on connection start; others may be using Win-1252, Latin1, or other encodings. If you declare each correctly when the connection is established, the server will do the right conversion for you automatically. If you fail to do that you will have a mess of characters, just like you do now. BTW it is very advisable to update to 8.3.7. It has a lot of bug fixes since 8.3.1. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Leonardo M. Ramé wrote: > > As a first step, can you find out the code point of the character that > > is represented as "?" in your E-Mail? > > > > Something like > > SELECT ascii(substr('NU?EZ', 3, 1)); > > except that instead of the string literal you substitute the column > > containing the bad value. > > I did what you suggested, and it responds with a 63 when the string is > "NU?NEZ" and 209 when it's "NUÑEZ". 63 is indeed a question mark. Since such a conversion would not be done by PostgreSQL, "something else" must convert Ñ to ?N *before* it gets into PostgreSQL... Yours, Laurenz Albe
"Albe Laurenz" <laurenz.albe@wien.gv.at> writes: > Leonardo M. Ram� wrote: >> I did what you suggested, and it responds with a 63 when the string is >> "NU?NEZ" and 209 when it's "NU�EZ". > 63 is indeed a question mark. Since such a conversion would not be > done by PostgreSQL, "something else" must convert � to ?N *before* > it gets into PostgreSQL... Yeah, I think this destroys the theory that it's due to a wrong choice of client_encoding setting. What you'd be likely to get from that is a "character can't be translated" kind of error, not silent substitution of a question mark. The damage must be getting done on the client side. regards, tom lane