Re: to_ascii, or some other form of magic transliteration

From: Mike Rylander
Subject: Re: to_ascii, or some other form of magic transliteration
Msg-id: b918cf3d050910053029faae73@mail.gmail.com
In reply to: to_ascii, or some other form of magic transliteration  (Ben <bench@silentmedia.com>)
Responses: Re: to_ascii, or some other form of magic transliteration  (Ben <bench@silentmedia.com>)
List: pgsql-general
On 9/9/05, Ben <bench@silentmedia.com> wrote:
> I'm working on a problem that I imagine others have had, which basically
> boils down to having nice unicode display text that users are going to
> want to search against without typing it correctly.... e.g. let a search
> for "sma" match "små". It seems like the best way to do this is to find
> a magic unicode transliteration mapping function, and then save the
> ASCII transliterations for searching against.
>

The simplest solution to this that I've found is to maintain a
separate column holding an ASCII-ized version of your text.  The
conversion can be done automatically by a trigger, and I have one in
PL/PerlU that I use.  It basically boils down to:

1) transform the Unicode text to Normalization Form D (NFD)
2) strip the combining marks

In modern Perls that looks like:

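#--------------
use Unicode::Normalize;      # provides NFD()
my $txt = NFD(shift());      # 1) decompose, e.g. "å" becomes "a" + U+030A
$txt =~ s/\pM//g;            # 2) strip combining marks (Unicode category M)
return $txt;
#--------------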
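
For illustration, here is a minimal sketch of how those two steps might be wrapped in a trigger that keeps the ASCII-ized column up to date. It is not the actual function I run. The table `items`, its columns `title` and `title_ascii`, and the function and trigger names are all hypothetical. It assumes the plperlu language is installed and that PL/Perl hands the function decoded character strings; older releases may need an explicit utf8::decode on input and utf8::encode on output.

```sql
-- Hypothetical schema: items(title text, title_ascii text)
CREATE OR REPLACE FUNCTION items_strip_marks() RETURNS trigger AS $$
    use Unicode::Normalize;

    my $txt = $_TD->{new}{title};        # value of the display column in the new row
    if (defined $txt) {
        # Assumption: $txt is already a decoded character string.
        $txt = NFD($txt);                # 1) decompose to normal form D
        $txt =~ s/\pM//g;                # 2) strip combining marks
    }
    $_TD->{new}{title_ascii} = $txt;     # stays NULL when title is NULL
    return "MODIFY";                     # tell PostgreSQL the row was changed
$$ LANGUAGE plperlu;

CREATE TRIGGER items_title_ascii
    BEFORE INSERT OR UPDATE ON items
    FOR EACH ROW EXECUTE PROCEDURE items_strip_marks();

-- A search for "sma" then runs against the stripped column:
-- SELECT title FROM items WHERE title_ascii LIKE 'sma%';
```

Note that letters with no canonical decomposition, such as "ø" or "ß", pass through NFD unchanged, so they would need an explicit mapping if you want them folded as well.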

Hope that helps!

> I see there's a function to_ascii, which sounds hopeful. However, when I
> try to use it, I get back:
>
> ERROR:  encoding conversion from UNICODE to ASCII not supported
>
> What is this function for, if not to convert other encodings to ASCII?
> Is there some other way to do what I'm asking for?


--
Mike Rylander
mrylander@gmail.com
GPLS -- PINES Development
Database Developer
http://open-ils.org
