Discussion: UTF8
Hi! I get the following exception when I read a simple TXT file on Linux and try to INSERT it into psql (8.1.4):

org.postgresql.util.PSQLException: ERROR: character 0xefbfbd of encoding "UTF8" has no equivalent in "LATIN2"
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1512)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1297)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:188)

Can someone help me?

Saca
Hi, Bakos,

Bakos Sandor wrote:
> I get the following exception when I read a simple TXT file in Linux and
> try to INSERT to the psql. (8.1.4)
>
> org.postgresql.util.PSQLException: ERROR: character 0xefbfbd of encoding
> "UTF8" has no equivalent in "LATIN2"

This means that your database is encoded in the ISO LATIN2 charset, and psql is telling the server that the data it sends is UTF-8. The server tries to convert the UTF-8 data into LATIN2, but there is a character (whose UTF-8 sequence is 0xefbfbd) that does not exist in LATIN2.

Either your file is actually LATIN2 (or even some other charset), in which case you should tell psql to use the LATIN2 encoding.

Or your file really is UTF-8 and really contains characters that do not exist in LATIN2. Then you have two possibilities: edit the file and replace those characters with some transcription, or convert your database to UTF-8 encoding (this needs a dump and restore).

HTH,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS
Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
Markus Schaber wrote:
> This means that your database is encoded in ISO-LATIN2 charset, and psql
> is telling the server the data it sends is UTF-8. The server tries to
> convert the UTF-8 Data into LATIN2, but there is a character (whose
> UTF8-Sequence is 0xefbfbd) that is not contained in LATIN-2.

Actually, given that that's a Java JDBC exception, there's no 'psql' client involved at all. The JDBC driver always uses UTF8 as the client encoding, since that maps easily from the native Java string representation (UCS2) and every possible Java String can be represented in UTF8. Of course, not every possible Java String can be represented in LATIN2, which is the cause of the error.

I would guess that the problem is that the wrong encoding is being used when *reading* the text file to convert its bytes to Java Strings. If you don't use the right encoding there, the Java String you end up with will be garbage.

-O
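To make the guess above concrete, here is a minimal sketch of how a wrong input encoding can turn valid LATIN2 text into garbage. It assumes the file is ISO-8859-2 and is being decoded as UTF-8; the thread does not say which encodings were actually involved, and the class name `MisdecodeDemo` and the sample word are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MisdecodeDemo {
    // Decode raw bytes with the given charset, the same conversion a Reader performs.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    // Render a string as space-separated code points so invisible damage shows up.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the file on disk is ISO-8859-2 (LATIN2).
        Charset latin2 = Charset.forName("ISO-8859-2");

        // "k" followed by U+0151 (o with double acute), which is the single byte 0xF5 in LATIN2.
        byte[] onDisk = "k\u0151".getBytes(latin2);

        // Right charset: the original text comes back.
        System.out.println(codePoints(decode(onDisk, latin2)));

        // Wrong charset: 0xF5 is never valid in UTF-8, so the decoder
        // substitutes U+FFFD, the Unicode replacement character.
        System.out.println(codePoints(decode(onDisk, StandardCharsets.UTF_8)));
    }
}
```

Once U+FFFD is in the String, the driver sends it to the server as UTF-8, and the server cannot map it to LATIN2.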
Hi, Oliver,

Oliver Jowett wrote:
> Actually, given that that's a Java JDBC exception, there's no 'psql'
> client involved at all.

Yes, you're right. So I see two possibilities:

- The input encoding used when reading the file into Java is wrong.
- The file really contains characters that do not exist in LATIN2.

HTH,
Markus
Hi, Bakos,

Please keep the discussion on the list, so others can help or learn from reading the archives.

Bakos Sandor wrote:
> I don't understand, because we have a Java application which has worked for
> about a year.
> Yesterday we changed the psql version from 7.4 to 8.1.4 and we get this
> exception.

Ah, it is interesting that you upgraded your system; you did not tell us about this before.

I can see two possible reasons for this:

- The database encoding changed during the upgrade. (Was your old database encoded in ASCII or UTF8?)
- You updated the driver as well (newer pgjdbc drivers are stricter with respect to client encodings).

HTH,
Markus
Oliver Jowett <oliver@opencloud.com> writes:
> I would guess that the problem is probably that when *reading* the
> text file originally, the wrong encoding is being used to convert the
> bytes to Java Strings. If you don't use the right encoding here, then
> the Java String you end up with will be garbage.

Very likely, since 0xefbfbd is the UTF-8 encoding of the Unicode "replacement character" (U+FFFD):

http://www.fileformat.info/info/unicode/char/fffd/index.htm

Try printing this file from Java for debugging.
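As a rough aid for that kind of debugging, the sketch below confirms that U+FFFD encodes to the bytes `ef bf bd` in UTF-8 and locates the first replacement character in a String that has already been read. The class and method names (`ReplacementCharCheck`, `utf8Hex`, `firstReplacementChar`) are hypothetical helpers, not part of any driver or library.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharCheck {
    // Hex dump of the UTF-8 encoding of a string, for comparison with the server's error message.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Index of the first U+FFFD in the text, or -1 if the text contains none.
    static int firstReplacementChar(String s) {
        return s.indexOf('\uFFFD');
    }

    public static void main(String[] args) {
        // U+FFFD is what the error message reports as 0xefbfbd.
        System.out.println(utf8Hex("\uFFFD")); // prints "efbfbd"

        // Hypothetical line as it might look after being read with the wrong charset.
        String line = "k\uFFFD";
        int pos = firstReplacementChar(line);
        if (pos >= 0) {
            System.out.println("replacement character at index " + pos
                    + ": the input was probably decoded with the wrong charset");
        }
    }
}
```

A hit means the damage happened while decoding the file, before the data ever reached the JDBC driver.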
Hi! I set the character encoding in the InputStreamReader in my program, and it seems this has resolved my problem. Thanks for all the help.

Saca

Marc Herbert wrote:
> Very likely since 0xefbfbd is the... unicode "replacement character"
>
> Try printing this file from Java for debugging.
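For readers who land on this thread with the same error, the fix described above might look roughly like the following. This is only a sketch: the thread does not show the actual code or say which charset was chosen, so ISO-8859-2 is an assumption, and `ReadWithCharset` and `readAll` are hypothetical names. A `ByteArrayInputStream` stands in for the `FileInputStream` a real program would open.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class ReadWithCharset {
    // Read the whole stream, decoding it with an explicitly named charset
    // instead of relying on the platform default.
    static String readAll(InputStream in, Charset cs) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(new InputStreamReader(in, cs))) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            // Wrapped only to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: the text file is ISO-8859-2 (LATIN2).
        Charset latin2 = Charset.forName("ISO-8859-2");

        // Stand-in for the file contents: "k" and 0xF5 (U+0151 in LATIN2).
        byte[] fileBytes = {(byte) 0x6B, (byte) 0xF5};

        String text = readAll(new ByteArrayInputStream(fileBytes), latin2);
        System.out.println(text.indexOf('\uFFFD') == -1
                ? "decoded cleanly, safe to INSERT"
                : "still contains U+FFFD, wrong charset");
    }
}
```

Naming the charset explicitly matters because `new InputStreamReader(in)` without a charset argument uses the platform default, which can change when the application moves to a different machine or locale.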