Thread: BUG #2261: ILIKE seems to be buggy on koi8 input

BUG #2261: ILIKE seems to be buggy on koi8 input

From: "Evgeny Gridasov"
The following bug has been logged online:

Bug reference:      2261
Logged by:          Evgeny Gridasov
Email address:      eugrid@fpm.kubsu.ru
PostgreSQL version: 8.1.2
Operating system:   Debian Linux
Description:        ILIKE seems to be buggy on koi8 input
Details:

My terminal is RU_ru.KOI8-R,
template1's encoding is UTF8.
ILIKE seems to be buggy when comparing Russian strings,
while UPPER/LOWER work OK.

template1=# \encoding koi8;

Try to get the uppercase of some Russian letters:
template1=# select upper('фыва');
 upper
-------
 ФЫВА
(1 row)

result is OK!

Next, try to compare uppercase and lowercase using ILIKE:
template1=# select true where 'фыва' ilike 'ФЫВА';
 bool
------
(0 rows)

Oops! No rows were returned. But why?

Try the same, but with Latin letters:

template1=# select true where 'asdf' ilike 'ASDF';
 bool
------
 t
(1 row)

Try to compare lowercase with lowercase (Russian):

template1=# select true where 'фыва' ilike 'фыва';
 bool
------
 t
(1 row)

It works.

Re: BUG #2261: ILIKE seems to be buggy on koi8 input

From: Tom Lane
"Evgeny Gridasov" <eugrid@fpm.kubsu.ru> writes:
> My terminal is RU_ru.KOI8-R,
> template1's encoding is UTF8.
> ILIKE seems to be buggy when comparing Russian strings,
> while UPPER/LOWER work OK.

I'll bet that the database's locale setting is expecting some encoding
other than UTF8 :-(.  You need to have compatible locale and encoding
settings inside the database.  You didn't say exactly what the database
LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
match UTF8.

            regards, tom lane
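
To illustrate the encoding mismatch described above, here is a minimal Python sketch (not part of the original thread). It shows that the same Russian string is a different byte sequence in KOI8-R and in UTF-8, so a locale that expects one encoding cannot correctly interpret text stored in the other. The variable names are illustrative only.

```python
# Illustrative sketch: the same text has different bytes in KOI8-R and UTF-8.
text = "фыва"

koi8_bytes = text.encode("koi8_r")  # single-byte encoding: one byte per letter
utf8_bytes = text.encode("utf-8")   # multibyte encoding: two bytes per Cyrillic letter

# The byte lengths differ, so the two encodings cannot be swapped freely.
print(len(koi8_bytes), len(utf8_bytes))

# Reading UTF-8 bytes as if they were KOI8-R does not give back the original text.
misread = utf8_bytes.decode("koi8_r", errors="replace")
print(misread == text)
```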

Re: BUG #2261: ILIKE seems to be buggy on koi8 input

From: Tom Lane
Evgeny Gridasov <eugrid@fpm.kubsu.ru> writes:
> The PostgreSQL server starts with this environment:
> LC_COLLATE=en_US.UTF-8
> LC_ALL=en_US.UTF-8
> LANG=en_US.UTF-8

Well, that setting shouldn't translate much except A-Z/a-z.  If you want
Cyrillic upper/lower case conversions, you need the database's LC_CTYPE
to be ru_RU.something.

            regards, tom lane
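
The effect of case folding that only covers A-Z/a-z can be sketched in Python. This is an illustration only, not PostgreSQL's implementation of ILIKE: the helper names are hypothetical, and plain equality stands in for pattern matching. Under that assumption, the sketch reproduces the three results from the bug report.

```python
def ascii_only_lower(s: str) -> str:
    """Fold only A-Z to a-z; every other character is left untouched."""
    return "".join(
        chr(ord(ch) + 32) if "A" <= ch <= "Z" else ch
        for ch in s
    )


def ascii_only_equal_ignore_case(left: str, right: str) -> bool:
    """Compare two strings after ASCII-only case folding (hypothetical helper)."""
    return ascii_only_lower(left) == ascii_only_lower(right)


# Latin letters fold as expected, so this comparison succeeds.
print(ascii_only_equal_ignore_case("asdf", "ASDF"))
# Cyrillic letters are not folded, so upper and lower case do not match.
print(ascii_only_equal_ignore_case("фыва", "ФЫВА"))
# Identical strings match regardless of folding.
print(ascii_only_equal_ignore_case("фыва", "фыва"))
```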

Re: BUG #2261: ILIKE seems to be buggy on koi8 input

From: Evgeny Gridasov
The PostgreSQL server starts with this environment:

LC_COLLATE=en_US.UTF-8
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8

I've tried different LC_COLLATE/LC_ALL/LANG settings,
but that did not help.

I've also tried changing my psql input to Unicode Russian, but that did not help either.

'show all' says lc_collate and the other lc_* settings are en_US.UTF-8.
initdb was run with this locale.
It cannot be changed by setting it in postgresql.conf (is it fixed at database creation?).
Should I reinit the database to get this working, or what?
If I should reinit the db, what locale should I choose?

BTW, the ~* operator also does not work with upper/lower case Russian letters,
while upper()/lower() still work OK.

On Wed, 15 Feb 2006 12:44:18 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

> "Evgeny Gridasov" <eugrid@fpm.kubsu.ru> writes:
> > My terminal is RU_ru.KOI8-R,
> > template1's encoding is UTF8.
> > ILIKE seems to be buggy when comparing Russian strings,
> > while UPPER/LOWER work OK.
>
> I'll bet that the database's locale setting is expecting some encoding
> other than UTF8 :-(.  You need to have compatible locale and encoding
> settings inside the database.  You didn't say exactly what the database
> LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
> match UTF8.
>
>             regards, tom lane


--
Evgeny Gridasov
Software Engineer
I-Free, Russia

Re: BUG #2261: ILIKE seems to be buggy on koi8 input

From: Peter Eisentraut
Evgeny Gridasov wrote:
> It cannot be changed by setting it in postgresql.conf (is it fixed at
> database creation?). Should I reinit the database to get this working, or what?

Yes.

> If I should reinit db, what locale should I choose?

Something like ru_RU.utf8.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
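
As a rough sketch of the advice above, the following Python helper builds an initdb command line that selects a Russian UTF-8 locale. The helper name and the data directory path are hypothetical placeholders. The sketch assumes the locale is installed on the operating system, and it does not cover dumping and restoring existing data.

```python
from typing import List


def build_initdb_command(data_dir: str, locale_name: str = "ru_RU.utf8") -> List[str]:
    """Return an initdb argument list for a new cluster with the given locale.

    data_dir is a placeholder for the new, empty data directory.
    """
    return ["initdb", f"--locale={locale_name}", "-D", data_dir]


# Print the command instead of running it, so it can be reviewed first.
print(" ".join(build_initdb_command("/path/to/new/data")))
```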