Data corruption with BYTEA and SQL_ASCII encoding

From: Marcus Better
Date:

Hi,

I am using PostgreSQL 7.1.3 with the latest (7.2) development JDBC
driver.  My tables contain binary data in BYTEA columns.  I get
strange errors when I read the data using getBytes() if my database
has SQL_ASCII default encoding.

The data I get has the correct length, but some characters (I believe
0xa0 and higher) are replaced with 0xfd characters.

I traced the problem to the getBytes method in
org/postgresql/jdbc2/ResultSet.java in the JDBC driver:

  //Version 7.2 supports the bytea datatype for byte arrays
  if (fields[columnIndex - 1].getPGType().equals("bytea"))
  {
      return PGbytea.toBytes(getString(columnIndex));
  }


I checked the actual contents of the column that is returned from the
database, and it is a string which contains non-ASCII characters, like
this:

   \012¿_Ãeo7\223\2316#Ph©\021ê\217\212åI\217k·h:"\230ÜÔ\034ÅW

This string agrees with the contents of the database.  I also checked
that PGbytea.toBytes() translates this string correctly.  So this
leaves the call to getString().
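
For reference, the unescaping step can be sketched roughly as follows.
This is a hypothetical stand-in, not the driver's actual
PGbytea.toBytes() source: the class and method names are made up, and it
assumes the escaped text uses only three-digit octal escapes and a
doubled backslash, with every other character standing for the byte of
the same value.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch of bytea unescaping; not the JDBC driver's code.
class ByteaUnescapeSketch {
    static byte[] unescape(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length() && s.charAt(i + 1) == '\\') {
                // A doubled backslash stands for one literal backslash.
                out.write('\\');
                i += 2;
            } else if (c == '\\' && i + 3 < s.length()) {
                // Backslash plus three octal digits, e.g. \223 is byte 0x93.
                // Throws NumberFormatException if the digits are not octal.
                out.write(Integer.parseInt(s.substring(i + 1, i + 4), 8));
                i += 4;
            } else {
                // Any other character: keep only its low 8 bits.
                out.write(c & 0xFF);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = unescape("\\012\u00bf_\\223");
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

Under that assumption, a character that getString() has already replaced
reaches this step too late: the original byte value is gone before
unescaping starts.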

getString() tries to decode the string using the specified default
encoding of the database (SQL_ASCII), and this indeed gives the
erroneous results.

It seems strange that the string that is returned from the database is
not in ASCII at all.  This is the root of the problem.

Changing the encoding of the database to LATIN1 solves the problem.
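
The effect can be reproduced outside the driver.  The sketch below
assumes the driver decodes the raw bytes with an ASCII charset and later
narrows each char back to a byte; both are assumptions about the
driver's internals, and the class and method names are hypothetical.
Java's ASCII decoder replaces each byte above 0x7f with U+FFFD, whose
low byte is 0xfd, while ISO-8859-1 (LATIN1) maps every byte to the char
with the same value.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical reproduction; not taken from the JDBC driver.
class EncodingRoundTripSketch {
    // Decode bytes to a String, then narrow each char back to one byte.
    static byte[] roundTrip(byte[] raw, Charset charset) {
        String decoded = new String(raw, charset);
        byte[] out = new byte[decoded.length()];
        for (int i = 0; i < decoded.length(); i++) {
            out[i] = (byte) decoded.charAt(i); // keeps the low 8 bits only
        }
        return out;
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x0a, (byte) 0xbf, (byte) 0x5f, (byte) 0xc3};
        // Bytes above 0x7f are lost when decoded as ASCII.
        System.out.println(hex(roundTrip(raw, StandardCharsets.US_ASCII)));
        // ISO-8859-1 preserves every byte value.
        System.out.println(hex(roundTrip(raw, StandardCharsets.ISO_8859_1)));
    }
}
```

With this input the ASCII round trip yields 0a fd 5f fd, the same length
as the input but with the high bytes replaced, and the ISO-8859-1 round
trip yields the original 0a bf 5f c3.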

Does this mean that I should not use SQL_ASCII databases with binary
data?  Can anyone tell me if there is a better solution, or if I'm
doing something wrong here?

Thanks,

Marcus


Re: Data corruption with BYTEA and SQL_ASCII encoding

From: Barry Lind
Date:

I have just committed a fix for this bug.  I have also built new jar
files and placed them up on the jdbc.postgresql.org website.

thanks,
--Barry
