unicode searches failing that use % and LIKE operators

Поиск
Список
Период
Сортировка
От Benjamin Weaver
Тема unicode searches failing that use % and LIKE operators
Дата
Msg-id 20071022180038.CE858EB04D@webmail221.herald.ox.ac.uk
обсуждение исходный текст
Ответы Re: unicode searches failing that use % and LIKE operators  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
Dear all,

I have the following problem:  a compound search, involving 2 wildcarded
character search terms, in which one search term consists of Latin characters
and the other, of UTF-8 unicode Greek characters, fails.  This is strange,
because similar searches in which both terms are either unicode Greek or Latin
characters succeed.

Both terms query a column of type text.  Searches of this kind fail both via
JDBC and the PSQL interface via an xterm window (which handles unicode
properly).  The JDBC search will have put both terms in UTF-8.  The column is
called metadatafulltext.  The search uses the LIKE operator in conjunction with
the wildcard character:

For example, the following search fails:

SELECT ..

FROM metadatafulltext...

WHERE metadatafulltext LIKE '%Jones%' AND metadatafulltext LIKE '%ALPHABETA%'
(where ALPHABETA is actually a unicode Greek string (\u03b1\u03b2).


whereas searches using all Greek characters succeed:

WHERE metadatafulltext LIKE '%BETAEPSILONDELTA%' AND metadatafulltext LIKE
'%ALPHABETA%'


and equally, all-Latin searches also succeed:
WHERE metadatafulltext LIKE '%Jones%' AND metadatafulltext LIKE '%Smith%'



What must I do to ensure that mixed-term searches of the first kind succeed?


Thanks in advance,

Ben Weaver

--
Benjamin Weaver
Faculty Research Associate, Imaging Papyri Projects, Herculaneum Society, Oxford
email:  benjamin.weaver@classics.ox.ac.uk
phone:  (0)1865 610236


В списке pgsql-general по дате отправления:

Предыдущее
От: Martin Marques
Дата:
Сообщение: Bitmap Heap scan 8.1/8.2
Следующее
От: "Pavel Stehule"
Дата:
Сообщение: Re: Bitmap Heap scan 8.1/8.2