Re: Selecting on non ASCII varchars

From: Vadim Nasardinov
Subject: Re: Selecting on non ASCII varchars
Msg-id: 200510041714.01768.vadimn@redhat.com
In reply to: Re: Selecting on non ASCII varchars  (Jeremy LaCivita <jlacivita@broadrelay.com>)
List: pgsql-jdbc

On Tuesday 04 October 2005 16:16, Jeremy LaCivita wrote:
> Hmmm
>
> so it turns out if i take all my Strings and do this:
>
> str = new String(str.getBytes(), "utf-8");
>
> then it works.
>
> Correct me if i'm wrong, but that says to me that the Strings were
> in UTF-8 already, but Java didn't know it, so it couldn't send them
> to postgres properly.

It's meaningless to ask what encoding a String has.  Strings are
sequences of chars -- they don't have an encoding.  The notion of
"encoding" comes into play only when you have to represent a String as
a sequence of bytes.
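
A minimal Java sketch of this point, using the same test string as the
sessions further down: one String, encoded under two different charsets,
yields two different byte sequences. The class name EncodingDemo and its
helper methods are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are not from the original message.
class EncodingDemo {
    // Represent the String as bytes using UTF-8.
    static byte[] asUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Represent the same String as bytes using ISO-8859-1.
    static byte[] asLatin1(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String str = "Funny char: \u00e8";
        System.out.println(str.length());          // number of chars in the String
        System.out.println(asUtf8(str).length);    // number of bytes under UTF-8
        System.out.println(asLatin1(str).length);  // number of bytes under ISO-8859-1
    }
}
```

The char \u00e8 becomes two bytes (0xC3 0xA8) in UTF-8 and one byte (0xE8)
in ISO-8859-1, so the byte counts differ although the String is the same.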

So, if this returns true for you:

   str.equals(new String(str.getBytes(), "utf-8"));

that means your default encoding is either utf-8 or an encoding that
produces the same bytes as utf-8 (US-ASCII, for example), at least for
the characters found in str.

String#getBytes() uses the default encoding, which may be specified via
the environment variable LANG on Unix-like systems.
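
To see which default a given JVM has picked up, something along these
lines should work. It is only a sketch: the class name DefaultCharsetDemo
and the helper noArgUsesDefault are hypothetical, while
Charset.defaultCharset() and the file.encoding property are standard.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

// Hypothetical demo class; the names are not from the original message.
class DefaultCharsetDemo {
    // True when the no-argument getBytes() produces the same bytes as an
    // explicit encode with the JVM's default charset.
    static boolean noArgUsesDefault(String s) {
        return Arrays.equals(s.getBytes(), s.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        System.out.println(Charset.defaultCharset());             // charset used by getBytes()
        System.out.println(System.getProperty("file.encoding"));  // property printed in the sessions below
        System.out.println(noArgUsesDefault("Funny char: \u00e8"));
    }
}
```

On JDK 18 and later the default charset is UTF-8 unless it is overridden
with -Dfile.encoding, so LANG only influences this on older JVMs or when
-Dfile.encoding=COMPAT is set.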

So, if my default encoding is UTF-8, I get this:

| $ echo $LANG
| en_US.UTF-8
| $ bsh2
| BeanShell 2.0-0.b1.7jpp - by Pat Niemeyer (pat@pat.net)
| bsh % print(System.getProperty("file.encoding"));
| UTF-8
| bsh % str = "Funny char: \u00e8";
| bsh % print(str);
| Funny char: è
| bsh % print(str.equals(new String(str.getBytes(), "utf-8")));
| true
| bsh %

If I change the default encoding to ISO-8859-1, I get this:

| $ env LANG=en_US.iso88591 bsh2
| BeanShell 2.0-0.b1.7jpp - by Pat Niemeyer (pat@pat.net)
| bsh % print(System.getProperty("file.encoding"));
| ISO-8859-1
| bsh % str = "Funny char: \u00e8";
| bsh % print(str);
| Funny char: è
| bsh % print(str.equals(new String(str.getBytes(), "utf-8")));
| false
| bsh %
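
Both sessions can be reproduced without touching LANG by naming the
encoding step explicitly. The following is a sketch under that
assumption; RoundTripDemo and survivesUtf8Decode are made-up names, and
the charset passed in stands in for whatever the default encoding would
have been.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are not from the original message.
class RoundTripDemo {
    // Same check as str.equals(new String(str.getBytes(), "utf-8")), but with
    // the encoding step made explicit instead of relying on the JVM default.
    static boolean survivesUtf8Decode(String s, Charset encodeWith) {
        byte[] bytes = s.getBytes(encodeWith);
        // Bytes that are not valid UTF-8 are replaced during decoding,
        // so the decoded String no longer equals the original.
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return s.equals(decoded);
    }

    public static void main(String[] args) {
        String str = "Funny char: \u00e8";
        System.out.println(survivesUtf8Decode(str, StandardCharsets.UTF_8));       // true
        System.out.println(survivesUtf8Decode(str, StandardCharsets.ISO_8859_1));  // false
        // ASCII-only text encodes to the same bytes in both charsets.
        System.out.println(survivesUtf8Decode("Funny char", StandardCharsets.ISO_8859_1));  // true
    }
}
```

In the ISO-8859-1 case the single byte 0xE8 is not a complete UTF-8
sequence, which is why the comparison fails for that string but still
succeeds for text containing only ASCII characters.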
