Discussion: reproducible bug in I don't know what component
bug=# select * from example_objects where name = 'Модемы';
object_id | name
-----------+--------
2 | Мебель
2 | Модемы
(2 rows)
bug=# select version();
version
---------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 7.4.2 on i386-redhat-linux-gnu, compiled by GCC i386-redhat-linux-gcc (GCC) 3.3.3 20040216 (Red Hat Linux
3.3.3-2.1)
(1 row)
Do the following in an installation initdb'd in ru_RU.KOI8-R (It doesn't
happen if you initdb'd with UTF-8). You need to run psql in a locale
that is capable of Russian letters, namely a UTF-8 locale, or a KOI8-R
locale. Then:
CREATE DATABASE bug WITH ENCODING='unicode';
\c bug
\i dump.sql
-- here you have to set client_encoding if you chose ru_RU.KOI8-R as the
-- locale for psql
-- set client_encoding to koi8r;
select * from example_objects where name = 'Модемы';
dump.sql is attached, the select statement is included in UTF-8.
Let me know if anything is missing.
--
Markus Bertheau <twanger@bluetwanger.de>
On Friday, 23 July 2004 11:49, Markus Bertheau wrote:
> Do the following in an installation initdb'd in ru_RU.KOI8-R (It doesn't
> happen if you initdb'd with UTF-8). You need to run psql in a locale
> that is capable of Russian letters, namely a UTF-8 locale, or a KOI8-R
> locale. Then:
>
> CREATE DATABASE bug WITH ENCODING='unicode';

That's your problem. Your locale doesn't match your encoding. You need to
use a compatible combination.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
On Fri, 23.07.2004 at 14:02, Peter Eisentraut wrote:
> On Friday, 23 July 2004 11:49, Markus Bertheau wrote:
> > Do the following in an installation initdb'd in ru_RU.KOI8-R (It doesn't
> > happen if you initdb'd with UTF-8). You need to run psql in a locale
> > that is capable of Russian letters, namely a UTF-8 locale, or a KOI8-R
> > locale. Then:
> >
> > CREATE DATABASE bug WITH ENCODING='unicode';
>
> That's your problem. Your locale doesn't match your encoding. You need to
> use a compatible combination.

What is happening in the server that this is required?

--
Markus Bertheau <twanger@bluetwanger.de>
Markus Bertheau <twanger@bluetwanger.de> writes:
> Do the following in an installation initdb'd in ru_RU.KOI8-R (It doesn't
> happen if you initdb'd with UTF-8).
If this is a bug, it's a bug in the ru_RU.KOI8-R locale definition.
You can prove that the locale considers the strings equal without
Postgres at all:
[tgl@rh1 tgl]$ cat ru_data
root
root
Мебель
Модемы
[tgl@rh1 tgl]$ sort -u ru_data
root
Мебель
Модемы
[tgl@rh1 tgl]$ LC_ALL=ru_RU.KOI8-R sort -u ru_data
root
Мебель
[tgl@rh1 tgl]$
(The above is on an RHL 8.0 platform.)
regards, tom lane
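
The same check can be sketched without sort, by calling the C library's strcoll() on the raw UTF-8 bytes. This is only an illustration, not anything from PostgreSQL: the helper name bytewise_strcoll is hypothetical, it assumes a POSIX system, and whether ru_RU.KOI8-R is installed (and whether it still reports the two strings as equal) depends on the platform and libc version.

```python
import ctypes
import ctypes.util
import locale


def bytewise_strcoll(a: bytes, b: bytes, locale_name: str):
    """Compare two byte strings with C strcoll() under locale_name.

    Returns the strcoll() result (0 means "equal" to the locale),
    or None if the requested locale is not installed.
    """
    # Assumption: a POSIX libc; CDLL(None) is the fallback if lookup fails.
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    libc.strcoll.argtypes = [ctypes.c_char_p, ctypes.c_char_p]
    libc.strcoll.restype = ctypes.c_int

    saved = locale.setlocale(locale.LC_COLLATE)  # remember current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return libc.strcoll(a, b)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)


# The two names from the report, as the UTF-8 bytes stored in the database.
a = "Мебель".encode("utf-8")
b = "Модемы".encode("utf-8")
print("C:", bytewise_strcoll(a, b, "C"))
print("ru_RU.KOI8-R:", bytewise_strcoll(a, b, "ru_RU.KOI8-R"))
```

A result of 0 under ru_RU.KOI8-R would correspond to the sort -u output above, where the two lines collapse into one. In the C locale the comparison is bytewise, so the result is nonzero.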
On Friday, 23 July 2004 15:30, Markus Bertheau wrote:
> > That's your problem. Your locale doesn't match your encoding. You need
> > to use a compatible combination.
>
> What is happening in the server that this is required?

When you ask locale-aware functions to compare strings, convert to
lower-case, or whatever the case may be, these functions expect the strings
to have a certain encoding (after all they just receive a stream of bytes,
so they cannot check the encoding themselves). So if the function thinks
it's comparing two KOI8-R strings and you are actually passing UTF-8
strings, the results are going to be close to comparing garbage.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/
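
To make the "stream of bytes" point concrete, here is a minimal Python sketch of what a KOI8-R-minded function would see when handed the UTF-8 bytes of the two names from the report. The helper name misread_as_koi8r is made up for this illustration; it only shows the byte-level misreading and does not reproduce the locale's collation rules.

```python
def misread_as_koi8r(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as if they were KOI8-R."""
    return text.encode("utf-8").decode("koi8_r")


for word in ["Мебель", "Модемы"]:
    raw = word.encode("utf-8")
    hex_bytes = " ".join(f"{byte:02x}" for byte in raw)
    # Each Cyrillic letter is two bytes in UTF-8, so a KOI8-R reader
    # sees twice as many "characters", none of them the intended ones.
    print(word, "|", hex_bytes, "|", repr(misread_as_koi8r(word)))
```

The bytes of the two words are still different, but a collation table written for KOI8-R has no meaningful ordering for the characters they get mistaken for, which is how two distinct names can end up comparing as equal.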