Re: Bug #659: lower()/upper() bug on ->multibyte<- DB

From Enke, Michael
Subject Re: Bug #659: lower()/upper() bug on ->multibyte<- DB
Date
Msg-id 3CD8EC9B.7519ECD6@wincor-nixdorf.com
In reply to Bug #659: lower()/upper() bug on ->multibyte<- DB  (pgsql-bugs@postgresql.org)
List pgsql-bugs
Hello,

> This is not a bug but an expected behavior. Locale support expects an
> input string is encoded in ISO-8859-1 (because you set locale to
> de_DE) while you supply UTF-8.

What is the difference between inserting a string and calling a function with a string argument?
Insert works well, and so does output; only the functions lower(), upper() and initcap() cause problems.
This also works: select a from a where a = 'X'; -- X is German umlaut a, lowercase / German umlaut A, capital
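
One way to picture the difference asked about here is that storing and comparing a string only needs its bytes, while changing case needs to know which characters those bytes encode. The following is a minimal Python sketch of that idea, not PostgreSQL code; `lower_as_latin1` is a hypothetical helper that assumes a case mapper treating every byte as one ISO-8859-1 character.

```python
# Illustration only: plain Python, not PostgreSQL internals.

# German umlaut A, capital (U+00C4), as a UTF-8 database would store it.
stored = "\u00c4".encode("utf-8")  # two bytes: 0xC3 0x84


def equals(a: bytes, b: bytes) -> bool:
    """Byte-for-byte comparison; needs no knowledge of the encoding."""
    return a == b


def lower_as_latin1(data: bytes) -> bytes:
    """Hypothetical mapper that assumes one byte == one ISO-8859-1 character."""
    return data.decode("latin-1").lower().encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the bytes still form well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Comparison succeeds because both sides are the same two bytes.
same = equals(stored, "\u00c4".encode("utf-8"))

# Case mapping under the wrong assumption changes only the first byte
# (0xC3 -> 0xE3), which leaves a byte sequence that is no longer valid UTF-8.
mangled = lower_as_latin1(stored)
```

Whether the server of that era failed in exactly this way is not established here; the sketch only shows why byte-wise operations can succeed where case mapping does not.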

> Try an explicit encoding conversion function:
>
> select lower(convert('D'), 'LATIN1');

I tried: select lower(convert('X'), 'LATIN1'); -- X is german umlaut A, capital
but the result was the same:
ERROR: Could not convert UTF-8 to ISO8859-1

I then compiled postgres without locale support. I created a DB with -E UTF-8.
I created a table and inserted UTF-8 char "0x00C4" (German umlaut A, capital).
I called "select lower(a) from a;"
Now, without locale support, I didn't get the error, but I also didn't get
the right result. The right result would be UTF-8 char "0x00E4" (German umlaut a, lower case)
!independent of the locale!
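
To make the expected result concrete: "0x00C4" and "0x00E4" are the Unicode code points U+00C4 and U+00E4, which UTF-8 encodes as the byte pairs C3 84 and C3 A4. A minimal Python sketch of a locale-independent mapping follows; `lower_utf8` is a hypothetical helper for illustration, and Python's own Unicode tables stand in for whatever the server would use.

```python
# Illustration only: Python's Unicode case tables, not PostgreSQL code.

def lower_utf8(data: bytes) -> bytes:
    """Decode UTF-8, lowercase by Unicode rules, and encode back to UTF-8."""
    return data.decode("utf-8").lower().encode("utf-8")


upper_bytes = "\u00c4".encode("utf-8")   # b'\xc3\x84'
lower_bytes = lower_utf8(upper_bytes)    # b'\xc3\xa4', i.e. U+00E4

# An ASCII-only, byte-wise lower() leaves both bytes untouched: no error,
# but no case change either.
ascii_only = upper_bytes.lower()
```

The last line mirrors the symptom described above (no error, unchanged output), though whether the build without locale support behaves this way internally is an assumption.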

Regards,
Michael Enke

Tatsuo Ishii wrote:
>
> > Short Description
> > lower()/upper() bug on ->multibyte<- DB
> >
> > Long Description
> > OS: Linux Kernel 2.4.4, PostgreSQL version 7.2.1
> > lower() and upper() don't work as expected for multibyte
> > databases. They work fine for one-byte encodings.
> > The behaviour can be reproduced as follows:
> > at initdb: LC_CTYPE was set to de_DE
> > createdb -E UTF-8 name
> > export PGCLIENTENCODING=LATIN1
> > psql -U name
> > --------------------------------------------------
> > => select lower('D');  -- german umlaut A, capital
> > ERROR: Could not convert UTF-8 to ISO8859-1
> > -- I expected to see: d german umlaut a, lower case
>
> This is not a bug but an expected behavior. Locale support expects an
> input string is encoded in ISO-8859-1 (because you set locale to
> de_DE) while you supply UTF-8. Try an explicit encoding conversion
> function:
>
> select lower(convert('D'), 'LATIN1');
>
> Note that '\304' must be an actual German umlaut A, capital character,
> not an octal escaped notation.
> --
> Tatsuo Ishii
