Re: Tsearch2 and Unicode?

From: Markus Wollny
Subject: Re: Tsearch2 and Unicode?
Date:
Msg-id: 2266D0630E43BB4290742247C89105750680F7C4@dozer.computec.de
In reply to: Tsearch2 and Unicode?  (Dawid Kuroczko <qnex42@gmail.com>)
Responses: Re: Tsearch2 and Unicode?
List: pgsql-general
Hi!

Oleg, what exactly do you mean by "tsearch2 doesn't support unicode yet"?

It seems to work fine in my database:

./pg_controldata [mycluster] gives me
pg_control version number:            72
[...]
LC_COLLATE:                           de_DE.UTF-8
LC_CTYPE:                             de_DE.UTF-8

community_unicode=# SELECT pg_encoding_to_char(encoding) AS encoding FROM pg_database WHERE datname='community_unicode';
 encoding
----------
 UNICODE
(1 row)

community_unicode=# select to_tsvector('default_german', 'Ich fände, daß das Fehlen von Umlauten ein Ärgernis wäre.');
                           to_tsvector
------------------------------------------------------------------
 'daß':3 'wäre':10 'fehlen':5 'fände':2 'umlauten':7 'Ärgernis':9
(1 row)

community_unicode=# SELECT  message_id
community_unicode-#  , rank(idxfti, to_tsquery('default_german', 'Könige|Söldner'),0) as rank
community_unicode-#  FROM ct_com_board_message
community_unicode-#  WHERE idxfti @@ to_tsquery('default_german', 'Könige|Söldner')
community_unicode-#  order by rank desc
community_unicode-#  limit 10;
 message_id |   rank
------------+----------
    3191632 | 0.686189
    2803233 | 0.686189
    2935325 | 0.686189
    2882337 | 0.686189
    2842006 | 0.686189
    2854329 | 0.686189
    2841962 | 0.686189
    2999851 | 0.651322
    2869839 | 0.651322
    2999799 |  0.61258
(10 rows)

These results look all right to me, so I cannot reproduce the disappearing special characters in a
Unicode database. Dawid, are you sure you ran initdb on your cluster with the correct locale settings?
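
A quick way to check the same settings from within psql, rather than via pg_controldata, might look like the sketch below. It assumes a server version that exposes lc_collate and lc_ctype as read-only settings; if yours does not, pg_controldata as shown above remains the fallback.

```sql
-- Locale the cluster was initialized with (fixed at initdb time).
SHOW lc_collate;
SHOW lc_ctype;

-- Encoding of the current database and of this client session.
SHOW server_encoding;
SHOW client_encoding;
```

If the locale comes back as C or a non-UTF-8 locale while the database encoding is UNICODE, that mismatch would be the first thing I'd look at.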

Kind regards

   Markus

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org] On Behalf Of
> Oleg Bartunov
> Sent: Wednesday, 17 November 2004 17:32
> To: Dawid Kuroczko
> Cc: Pgsql General
> Subject: Re: [GENERAL] Tsearch2 and Unicode?
>
> Dawid,
>
> unfortunately, tsearch2 doesn't support unicode yet.
> If you keep tsvector separately from data then you'll need
> one more join.
>
>      Oleg
>
