Thread: Using java.lang.Character for "char" data type

Using java.lang.Character for "char" data type

From
Віталій Тимчишин
Date:
Hello.

I've tried to use Character to fill "char" column, e.g. like in the next test:
-----
con.createStatement().execute("create table tst(tcol \"char\")");
PreparedStatement stmt = con.prepareStatement("insert into tst(tcol) values(?)");
stmt.setObject(1, 'c');
stmt.executeUpdate();
-----
and got 
Exception in thread "main" org.postgresql.util.PSQLException: Can't infer the SQL type to use for an instance of java.lang.Character. Use setObject() with an explicit Types value to specify the type to use. at org.postgresql.jdbc2.AbstractJdbc2Statement.setObject(AbstractJdbc2Statement.java:1740)

with 8.3-603.jdbc4 driver.
Maybe it would be reasonable to treat a Character like a String of length 1?

--
Best regards,
Vitalii Tymchyshyn
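
A possible client-side workaround for the error above, as a minimal sketch: convert the Character to a one-character String before binding it. The class CharParams and its methods normalizeParam and bind are hypothetical names and not part of the PostgreSQL JDBC driver. The thread does not confirm whether the server accepts the resulting string parameter for a "char" column.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper, not part of the PostgreSQL JDBC driver.
final class CharParams {
    private CharParams() {}

    // Converts a Character to a one-character String; any other value
    // (including null) is returned unchanged.
    static Object normalizeParam(Object value) {
        return (value instanceof Character) ? value.toString() : value;
    }

    // Binds the value at the given 1-based parameter index, converting a
    // Character first so the driver sees a String instead.
    static void bind(PreparedStatement stmt, int index, Object value) throws SQLException {
        stmt.setObject(index, normalizeParam(value));
    }
}
```

With this helper, the failing line in the test would become CharParams.bind(stmt, 1, 'c');.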

Re: Using java.lang.Character for "char" data type

From
Kris Jurka
Date:

On Fri, 21 May 2010, Віталій Тимчишин wrote:

> I've tried to use Character to fill "char" column, e.g. like in the next
> test:
> -----
> con.createStatement().execute("create table tst(tcol \"char\")");
> PreparedStatement stmt = con.prepareStatement("insert into tst(tcol) values(?)");
> stmt.setObject(1, 'c');
> stmt.executeUpdate();
> -----
> and got
> Exception in thread "main" org.postgresql.util.PSQLException: Can't infer
> the SQL type to use for an instance of java.lang.Character. Use setObject()
> with an explicit Types value to specify the type to use.
> at org.postgresql.jdbc2.AbstractJdbc2Statement.setObject(AbstractJdbc2Statement.java:1740)
>
> with 8.3-603.jdbc4 driver.
> May be it would be reasonable to treat Character like String of size 1?
>

It would be possible to support setObject for a Character, but be aware
that "char" is not a character long, it's a byte long, so it will fail on
multibyte characters.

Kris Jurka
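
To illustrate the one-byte limit described above, here is a small sketch that checks whether a Java char would occupy a single byte. It assumes the database encoding is UTF-8, which the thread does not state; in a single-byte encoding such as LATIN1 the answer would differ. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class QuotedCharCheck {
    private QuotedCharCheck() {}

    // True if c encodes to exactly one byte in UTF-8 (the assumed database
    // encoding). A lone surrogate is not a complete character, so it is
    // rejected up front.
    static boolean fitsInOneByte(char c) {
        if (Character.isSurrogate(c)) {
            return false;
        }
        return String.valueOf(c).getBytes(StandardCharsets.UTF_8).length == 1;
    }
}
```

Under that assumption, 'c' passes the check, while '\u00e9' (e with acute accent) and '\u20ac' (the euro sign) do not, because UTF-8 needs two and three bytes for them.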

Re: Using java.lang.Character for "char" data type

From
Craig Ringer
Date:
On 22/05/2010 6:16 AM, Kris Jurka wrote:

> It would be possible to support setObject for a Character, but be aware
> that "char" is not a character long, it's a byte long, so it will fail
> on multibyte characters.

So really the appropriate SQL type for Java 'Character' is varchar(1) ?
And a PostgreSQL 'char' (bpchar, not character(1)) maps best to a Java
'Byte' ?

--
Craig Ringer

Re: Using java.lang.Character for "char" data type

From
Kris Jurka
Date:

On Sat, 22 May 2010, Craig Ringer wrote:

> So really the appropriate SQL type for Java 'Character' is varchar(1) ?

Yes.

>  And a PostgreSQL 'char' (bpchar, not character(1)) maps best to a Java
> 'Byte' ?

I'm not so sure about that.  "char" is still intended to represent a
textual identifier, not a numeric value as Byte does.  Additionally you
can't store 0 in a "char".

When you're talking about a mapping there's two directions, setObject and
getObject, and they're not always symmetric.  If we allowed setObject with
a Character to map to varchar, then I would not expect getObject to return
a Character, I'd expect a String.

Kris Jurka
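
The asymmetry described above means application code that wants a Character back has to convert the String itself. A minimal sketch follows, assuming getObject returns a String for such a column as the post expects; the class CharResults and method toCharacter are hypothetical names.

```java
// Hypothetical helper for illustration only.
final class CharResults {
    private CharResults() {}

    // Converts a value read from the database into a Character.
    // Returns null for SQL NULL and rejects anything that is not exactly
    // one Java char long.
    static Character toCharacter(Object value) {
        if (value == null) {
            return null;
        }
        String s = value.toString();
        if (s.length() != 1) {
            throw new IllegalArgumentException(
                    "expected exactly one char, got length " + s.length());
        }
        return s.charAt(0);
    }
}
```

A call such as CharResults.toCharacter(rs.getObject(1)) would then give back the Character that was originally bound.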

Re: Using java.lang.Character for "char" data type

From
Lew
Date:
Kris Jurka wrote:
>> It would be possible to support setObject for a Character, but be aware
>> that "char" is not a character long, it's a byte long, so it will fail
>> on multibyte characters.

Craig Ringer wrote:
> So really the appropriate SQL type for Java 'Character' is varchar(1) ?
> And a PostgreSQL 'char' (bpchar, not character(1)) maps best to a Java
> 'Byte' ?

Not according to the PG documentation.

According to that, SQL type "CHAR" should be just fine for a Java 'char'.
Mostly.  (I worry about code points that take more than 2 bytes.)

I would not use "VARCHAR" to represent a Java 'char'.

--
Lew

Re: Using java.lang.Character for "char" data type

From
Lew
Date:
Kris Jurka wrote:
> It would be possible to support setObject for a Character, but be aware
> that "char" is not a character long, it's a byte long, so it will fail
> on multibyte characters.

I'm a little confused.  When you say "char" is a byte long, are you referring
to the SQL type or the Java type?  I'm used to seeing the Java type expressed
in lower case and the SQL type in upper case, so please pardon my confusion.

The Java 'char' type is 16 bits wide.

Doesn't the width of the SQL "CHAR" depend on the encoding?

Otherwise how does it handle, say, UTF-8 when you tell the DB to use that?

To put it another way, suppose I enter a String that contains, say, 24 UTF-8
characters, some of which require multibyte encodings, and try to jam it into
a "CHAR(24)" column or a "VARCHAR(24)" column.  Will that cause trouble?

The documentation for CHAR and VARCHAR at
<http://www.postgresql.org/docs/8.4/interactive/datatype-character.html>
says
"SQL defines two primary character types: character varying(n) and
character(n), where n is a positive integer. Both of these types can store
strings up to n characters (not bytes) in length."

That seems to contradict what you said.

--
Lew
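
The three lengths in play here (Java chars, characters, and encoded bytes) can be compared directly on the Java side. The sketch below assumes a UTF-8 database, where the "characters" counted by the quoted documentation correspond to Unicode code points; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class LengthDemo {
    private LengthDemo() {}

    // Number of Unicode code points in s; a character outside the Basic
    // Multilingual Plane counts once here but as two Java chars in length().
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes s occupies when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

For the string "na\u00efve \uD83D\uDE00" (the word "naive" with a diaeresis, a space, and one emoji), length() is 8, codePoints is 7 and utf8Bytes is 11. Going by the documentation quoted above, it is the code point count that would be compared against n in CHAR(n) or VARCHAR(n).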

Re: Using java.lang.Character for "char" data type

From
"Kevin Grittner"
Date:
Lew <noone@lewscanon.com> wrote:

> I'm a little confused.  When you say "char" is a byte long, are
> you referring to the SQL type or the Java type?  I'm used to
> seeing the Java type expressed in lower case and the SQL type in
> upper case, so please pardon my confusion.

PostgreSQL supports the standard CHAR(n) including the special case
of CHAR to mean one character, but also supports a type of "char"
(notice the lower-case letters and the quotes), which is distinct,
and is a one-byte type.  "char" is not intended to be used by
application code, generally.  It's a micro-optimization used for
system tables, and thus not very well documented.  Use at your own
risk.  Or use CHAR or VARCHAR (without quotes).

-Kevin

Re: Using java.lang.Character for "char" data type

From
Radosław Smogura
Date:
On Saturday, 22 May 2010 at 03:11:52, Lew wrote:

> I'm a little confused.  When you say "char" is a byte long, are you
>  referring to the SQL type or the Java type?  I'm used to seeing the Java
>  type expressed in lower case and the SQL type in upper case, so please
>  pardon my confusion.
>
> The Java 'char' type is 16 bits wide.
>
> Doesn't the width of the SQL "CHAR" depend on the encoding?
>
> Otherwise how does it handle, say, UTF-8 when you tell the DB to use that?
>
> To put it another way, suppose I enter a String that contains, say, 24
>  UTF-8 characters, some of which require multibyte encodings, and try to
>  jam it into a "CHAR(24)" column or a "VARCHAR(24)" column.  Will that
>  cause trouble?
>
> The documentation for CHAR and VARCHAR at
> <http://www.postgresql.org/docs/8.4/interactive/datatype-character.html>
> says
> "SQL defines two primary character types: character varying(n) and
> character(n), where n is a positive integer. Both of these types can store
> strings up to n characters (not bytes) in length."
>
> That seems to contradict what you said.
>
As was said, a Java char is 16 bits long and can represent all characters
available in Java (internally it's UTF-16 with a small modification). An SQL
CHAR represents one character out of all possible SQL characters; nothing
special, it's a truism :)

The set of possible SQL characters is determined by the DB encoding. If you
use an 8-bit encoding (ASCII, ISO, etc.), you can put about 256 different
characters in one slot (e.g. in a DB with ASCII encoding you can't store
ISO-8859-2 letters). The DB encoding is used only to describe the possible
characters and how they are stored on disk in binary form (it's the same as
the system encoding when you read and write text files, or the internal
encoding a file system entry has).

Of course, CHAR(1) should store any character from the set of characters the
DB allows (determined by the encoding).

The difference between CHAR and VARCHAR is the storage strategy: a VARCHAR
string is variable length, but a CHAR(n) string always has length n. If you
put only "A" into a CHAR(2), the select will return "A " (space at the end).

So the answer to your question is: "You should not care about the bit length
as long as your DB encoding supports all your characters, unless your task is
to optimize or predict the size of the DB files." You should leave the bit
length of SQL characters to the DB engine. I use a UTF-8 database and I see no
problem with this. If you don't believe me, you can check it yourself.
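
The blank-padding of CHAR(n) described above matters when comparing values on the Java side. The sketch below mimics the padding purely for illustration (the database does this itself, and padTo counts Java chars rather than database characters) and strips it again before comparison. The class and method names are hypothetical.

```java
// Hypothetical helper for illustration only.
final class CharPadding {
    private CharPadding() {}

    // Mimics how a CHAR(n) value is padded with spaces up to length n.
    // Values already n chars or longer are returned unchanged.
    static String padTo(String s, int n) {
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < n) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // Removes trailing spaces, e.g. from a value read out of a CHAR(n) column.
    static String stripPadding(String s) {
        int end = s.length();
        while (end > 0 && s.charAt(end - 1) == ' ') {
            end--;
        }
        return s.substring(0, end);
    }
}
```

For example, padTo("A", 2) gives "A " as in the CHAR(2) case above, and stripPadding("A ") gives "A" back.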