Обсуждение: bit datatype and getObject()
Hi, the JDBC driver seems to map columns defined as "bit" to Boolean regardless of the definition of the column. So even a bit(5)is returned as boolean value which I think is not correct. Consider the following CREATE TABLE bit_test (some_bits bit(5)); INSERT INTO bit_test VALUES ('01010'); When I run a select in psql it will return 01010 as the column's value. When running a select from within Java and using getObject() on that column a java.lang.Boolean is returned. This might be correct for a bit(1) but not for columns defined with a bigger width. It is returned correctly when using getString() on that column though. But for applications using the generic getObject() this returns a wrong value. I'm not sure what the JDBC specs requires here, but this mapping somehow is confusing. I tested this with PG 9.0 and the JDBC4 build 801 driver. Regards Thomas
Thomas Kellerer wrote: > Hi, > > the JDBC driver seems to map columns defined as "bit" to Boolean > regardless of the definition of the column. So even a bit(5) is returned > as boolean value which I think is not correct. > > Consider the following > > CREATE TABLE bit_test (some_bits bit(5)); > INSERT INTO bit_test VALUES ('01010'); > > When I run a select in psql it will return 01010 as the column's value. > > When running a select from within Java and using getObject() on that > column a java.lang.Boolean is returned. > > This might be correct for a bit(1) but not for columns defined with a > bigger width. > > It is returned correctly when using getString() on that column though. > But for applications using the generic getObject() this returns a wrong > value. > > I'm not sure what the JDBC specs requires here, but this mapping somehow > is confusing. > > I tested this with PG 9.0 and the JDBC4 build 801 driver. > > Regards > Thomas It is true that the mapping is java.lang.Boolean for the bit(x), but there is another type in PostgreSQL to define a multi-bit type. Have you tried: bit varying(5) retrieval by getObject()? Because that type does map to java.lang.Object. The PostgreSQL manual 8.10 though indicates that both are: "Bit String Types". Seems appropriate that getString() then is how to retrieve these types. Either way isn't this a server issue on the mapping, not the JDBC? I'm not sure but does not the JDBC just return the mapping from the server? Attached current data types mapping for majority of data types of server 9.0.1 and JDBC 9.0-801. danap.
Вложения
On Tue, 23 Nov 2010, Thomas Kellerer wrote: > the JDBC driver seems to map columns defined as "bit" to Boolean regardless > of the definition of the column. So even a bit(5) is returned as boolean > value which I think is not correct. > > Consider the following > > CREATE TABLE bit_test (some_bits bit(5)); > INSERT INTO bit_test VALUES ('01010'); > > When I run a select in psql it will return 01010 as the column's value. > > When running a select from within Java and using getObject() on that column a > java.lang.Boolean is returned. > > This might be correct for a bit(1) but not for columns defined with a bigger > width. > > I'm not sure what the JDBC specs requires here, but this mapping somehow is > confusing. Yeah, losing data is not good. Previously the discussion on list was to use java.util.BitSet, but it never resulted in a patch. I'm not sure if we can retain backwards compatibility for single bit length bitstrings. So we'd have to determine whether we can maintain compatibility and if not, is it still worth it to use BitSet. Kris Jurka
dmp, 24.11.2010 04:06: > It is true that the mapping is java.lang.Boolean for the bit(x), but > there is another type in PostgreSQL to define a multi-bit type. Have > you tried: bit varying(5) retrieval by getObject()? Because that type > does map to java.lang.Object. Correct. The only thing I noticed is that it returns Types.OTHER as the datatype, not Types.BIT But as I don't know if Types.BIT is meant to include "multi-bit" types I don't know if that is correct or not. > Either way isn't this a server issue on the mapping, not the JDBC? I don't think so. psql displays this correctly. Regards Thomas
On Wed, 24 Nov 2010, Kris Jurka wrote: > On Tue, 23 Nov 2010, Thomas Kellerer wrote: > >> the JDBC driver seems to map columns defined as "bit" to Boolean regardless >> of the definition of the column. So even a bit(5) is returned as boolean >> value which I think is not correct. >> > > Yeah, losing data is not good. Previously the discussion on list was to use > java.util.BitSet, but it never resulted in a patch. > I've looked at this in some more detail and I don't think BitSet matches up closely enough with bit(n)/varbit to make it work. BitSet does not store the total bit string length, only the length until the highest set bit. So if you have a BitSet object you can make a safe roundtrip to the database and back, but you can't make a safe roundtrip from the database to a BitSet and back. If you have a bit/varbit field that has a zero in the most significant bit, it will not go back to the database the same. This causes some problems on the database side, for example: CREATE TEMP TABLE bittest (a bit(3)); INSERT INTO bittest VALUES ('010'); Once you pull this into a BitSet object you can't reconstruct the original and you are left with an error like this for bit types: UPDATE bittest SET a = '10'::varbit; ERROR: bit string length 2 does not match type bit(3) For varbit strings you can store the new value, but then later operations may fail: CREATE TEMP TABLE varbittest(a varbit(3), b varbit(3)); INSERT INTO varbittest VALUES ('010', '111'); SELECT a & b FROM varbittest; UPDATE varbittest SET a = '10'::varbit; SELECT a & b FROM varbittest; ERROR: cannot AND bit strings of different sizes So if the problem was that tons of people on the Java side had BitSets they wanted to put into the database and retrieve, this might be tolerable. That's not our problem though, we're really looking at the database side of things and trying to represent that in Java, so we need another approach. At the moment, returing a String seems easiest, do other people have ideas? Kris Jurka
Kris Jurka <books@ejurka.com> wrote: > returing a String seems easiest, do other people have ideas? What would the Java String look like for SQL B'101' ? Would bool[] make any sense for an object type? I'm not sure byte[] would be out of the question, but it would need zero padding on one side or the other, and you would lose the exact length. :-( What do other drivers do? Is there any guidance in the JDBC standard? -Kevin
On Wed, 22 Dec 2010, Kevin Grittner wrote: > What would the Java String look like for SQL B'101' ? "101" > Would bool[] make any sense for an object type? Conceivably. > I'm not sure byte[] would be out of the question, but it would need > zero padding on one side or the other, and you would lose the exact > length. :-( boolean[] seems better to me than byte[]. One of the appeals of a distinct/unique type like BitSet is that you can then implement PreparedStatement.setObject and have it implement the reverse transform to pass the data to the server with type safety that you wouldn't get using a String. At the moment we don't do anything intelligent with native array types, but if we did we wouldn't know if boolean[] should turn into varbit or bit(1)[]. > What do other drivers do? Is there any guidance in the JDBC > standard? The standard says nothing useful. Section 8.3.3 of this document implies that as an application developer you should just pretend multi-bit strings don't exist... http://download.oracle.com/javase/1.5.0/docs/guide/jdbc/getstart/mapping.html Mysql's documentation shows it using byte[]. http://dev.mysql.com/doc/refman/5.5/en/connector-j-reference-type-conversions.html Oracle doesn't look like it supports it. I think IBM uses byte[], I'm not familiar with DB2 data types to understand what's the equivalent of bytea and what's varbi, but they both seem to be mapped to byte[]. http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp?topic=/com.ibm.db2.udb.apdv.java.doc/doc/rjvjdata.htm Kris Jurka
Kris Jurka <books@ejurka.com> wrote: > The standard says nothing useful. Section 8.3.3 of this document > implies that as an application developer you should just pretend > multi-bit strings don't exist... BitSet is tempting, but the fuzzy definition of size seems to be a killer. We clearly don't want to use length() instead of size(). Unless someone can make a convincing argument to the contrary, I'm inclined to agree that we should just us a Java String of '0' and '1' characters. -Kevin