Обсуждение: hu_HU.UTF8 case insensitive search fail to return values

Поиск
Список
Период
Сортировка

hu_HU.UTF8 case insensitive search fail to return values

От
"László-Róbert, Albert"
Дата:
Hi<br /> I have a problem related to the encoding. Maybe this is not the right place to ask for help, in this case
pleasetell me where to ask for help.<br /><br /> database version: PostgreSQL 8.1.2 on i686-pc-linux-gnu, compiled by
GCCgcc (GCC) 3.3.5 (Debian 1:3.3.5-13)<br /><br /> The problem is the following:<br /> I have created the database with
UTF8encoding (createdb -O postgres -E UTF8 dbname). The database uses the following encodings/locales:<br /><font
size="3"><fontcolor="#000000">lc_collate | hu_HU.UTF8 | Shows the collation order locale. <br /> lc_ctype | hu_HU.UTF8
|Shows the character classification and case conversion locale. <br /> lc_messages | C | Sets the language in which
messagesare displayed. <br /> lc_monetary | C | Sets the locale for formatting monetary amounts. <br /> lc_numeric | C
|Sets the locale for formatting numbers. <br /> lc_time | C | Sets the locale for formatting date and time values.<br
/><br/> Lats say we have a table named some_table with the following datas:<br /> create table some_table(id integer,
strvarchar(100));<br /><br /> insert into some_table values(1, 'alma a fa alatt');<br /></font></font><font
size="3"><fontcolor="#000000">insert into some_table values(2, 'nyári piros alma');<br /></font></font><font
size="3"><fontcolor="#000000">insert into some_table values(3, 'ez egy állomány');<br /></font></font><font
size="3"><fontcolor="#000000">insert into some_table values(4, 'egy másik ÁLLOMÁNY');</font></font><br /><font
size="3"><fontcolor="#000000"><br /> (note that I used a acute and A acute intentionally; these are some Hungarian, not
toomeaningful words)<br /><br /> When I execute the following query, the following results are returned:<br /><br />
SELECT* FROM some_table WHERE str ILIKE '%ál%"<br /><br /></font></font><table><tbody><tr><th class="data">id</th><th
class="data">str</th></tr><tr><tdclass="data1" nowrap><div align="right">3</div></td><td class="data1" nowrap>ez egy
állomány</td></tr></tbody></table><br/> for the following: <br /><font size="3"><font color="#000000">SELECT * FROM
some_tableWHERE str ILIKE '%Ál%"<br /> i get<br /></font></font><table><tbody><tr><th class="data">id</th><th
class="data">str</th></tr><tr><tdclass="data1" nowrap><div align="right">4</div></td><td class="data1" nowrap>egy másik
ÁLLOMÁNY</td></tr></tbody></table><br/><font size="3"><font color="#000000">instead of<br /></font></font><font
size="3"><fontcolor="#000000"><br /></font></font><table><tbody><tr><th class="data">id</th><th
class="data">str</th></tr><tr><tdclass="data1" nowrap><div align="right">3</div></td><td class="data1" nowrap>ez egy
állomány</td></tr><tr><tdclass="data1" nowrap><div align="right">4</div></td><td class="data1" nowrap>egy másik
ÁLLOMÁNY</td></tr></tbody></table><br/> for <br /> select * from some_table where str ilike '%á%'<br /> i get<br
/><table><tbody><tr><thclass="data">id</th><th class="data">str</th></tr><tr><td class="data1" nowrap><div
align="right">2</div></td><tdclass="data1" nowrap>nyári piros alma</td></tr><tr><td class="data2" nowrap><div
align="right">3</div></td><tdclass="data2" nowrap>ez egy állomány</td></tr><tr><td class="data1" nowrap><div
align="right">4</div></td><tdclass="data1" nowrap>egy másik ÁLLOMÁNY</td></tr></tbody></table><br /> but for <br />
select* from some_table where str ilike '%Á%'<br /> i get:<br /><table><tbody><tr><th class="data">id</th><th
class="data">str</th></tr><tr><tdclass="data1" nowrap><div align="right">4</div></td><td class="data1" nowrap>egy másik
ÁLLOMÁNY</td></tr></tbody></table><br/><font size="3"><font color="#000000"><br /> UPPER, LOWER and LIKE is working
perfectly.The problem appears to be just with the ILIKE operator.<br /><br /> NOTE: the following bug report, that had
notget any answer is similar to my problem: <a class="moz-txt-link-freetext"
href="http://svr5.postgresql.org/pgsql-bugs/2005-07/msg00157.php">http://svr5.postgresql.org/pgsql-bugs/2005-07/msg00157.php</a><br
/></font></font><divclass="moz-signature">-- <br /></div><table border="0" cellpadding="2" cellspacing="0"
style="text-align:left; width: 100%;"><tbody><tr><td colspan="1" rowspan="2" style="width: 200px;"><div
style="text-align:center;"><a href="mailto:mailto:rsoftsolutions@aim.com"><img alt="RSoft Solutions Ltd"
src="cid:part1.01060607.06090600@gmail.com"style="border: 0px solid ; width: 200px; height: 132px;" /></a><br
/></div><divstyle="text-align: center;">tel: +40-(0)-365-410-540</div></td><td style="width: 763px;"><big><big><span
style="font-weight:bold;">László-Róbert, </span></big></big><big><big><span style="font-weight:
bold;">Albert</span></big></big><br/><big><big><span style="font-weight: bold;"> <small><small>tel:
+40-(0)-742-874-854</small></small></span></big></big></td></tr><tr><tdstyle="width: 763px;">e-mail: <a
href="mailto:mailto:albertlaszlorobert@gmail.com">albertlaszlorobert@gmail.com</a><br/> e-mail: <a
href="mailto:mailto:albert_laszlorobert@yahoo.com">albert_laszlorobert@yahoo.com</a></td></tr></tbody></table><br/> 

Re: hu_HU.UTF8 case insensitive search fail to return values

От
Tom Lane
Дата:
"László-Róbert, Albert" <albertlaszlorobert@gmail.com> writes:
> [ ILIKE fails to match case-insensitively in 8.1.2 ]

The ILIKE code doesn't work well at all in multibyte encodings in 8.1
and before.  You should get reasonable results in one of the LATINn
encodings (and a matching locale of course).  If you need to use UTF8,
I recommend updating to 8.2 which handles multibyte characters better.
        regards, tom lane


Re: hu_HU.UTF8 case insensitive search fail to return values

От
Albert László-Róbert
Дата:
Tom Lane wrote:
> "László-Róbert, Albert" <albertlaszlorobert@gmail.com> writes:
>
>> [ ILIKE fails to match case-insensitively in 8.1.2 ]
>>
>
> The ILIKE code doesn't work well at all in multibyte encodings in 8.1
> and before.  You should get reasonable results in one of the LATINn
> encodings (and a matching locale of course).  If you need to use UTF8,
> I recommend updating to 8.2 which handles multibyte characters better.
>
>             regards, tom lane
>
>
the administrator finally configured properly the postgres 8.2, and the
reported problem looks to be fixed. i ran successfully the ILIKE test on
hungarian language.
thanks for the tip.
regards, laszlo-robert, albert