Re: something better than pgtrgm?

From: Andrew Sullivan
Subject: Re: something better than pgtrgm?
Msg-id 20121009132323.GF594@crankycanuck.ca
In reply to: Re: something better than pgtrgm?  (Willy-Bas Loos <willybas@gmail.com>)
Replies: Re: something better than pgtrgm?  (Willy-Bas Loos <willybas@gmail.com>)
List: pgsql-general
On Tue, Oct 09, 2012 at 03:10:31PM +0200, Willy-Bas Loos wrote:
> >
> We're mixing species names of birds in Greek and Latin (scientific names),
> and all languages spoken in Africa, Europe and western Asia.

Yikes.

> I'm not very knowledgeable about scripts around the world, but I am afraid
> that the above list does include scripts that read from right to left.

It's much worse than that.

It includes at least two variations of Arabic keyboard: depending on
which language you are using, you get a different Unicode code point
for the character YEH, which in some languages is about as frequent as
the letter a in English.  You also have endless problems with dots
versus no dots in Arabic-script spellings (not every use of Arabic the
script is Arabic the language).  And you run smack into the problem of
correct syllable formation in Brahmi-derived scripts.

If you're going to do something with this sort of language-agnostic
"did you mean" work, you will need to be extremely rigorous about
normalizing spellings on the way in.  Is that a possibility?  If so, I
can almost imagine a way this could work.  If not, well,
"internationalization is hard."  :-/

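To make "normalizing on the way in" a bit more concrete, here is a
minimal sketch in Python.  Everything in it is an assumption for
illustration: the function name normalize_name is made up, and the
folding table (which YEH and KAF variants get merged, and dropping the
short-vowel marks) is a guess at what a given data set might need.
Some of these folds are lossy and wrong for languages that treat the
letters as distinct, so a real table would have to be chosen per
language.  It does nothing for Brahmi-derived scripts.

```python
import unicodedata

# Hypothetical folding table; which variants to merge is an assumption,
# not something settled in this thread.
_FOLD = {
    0x06CC: "\u064A",  # ARABIC LETTER FARSI YEH    -> ARABIC LETTER YEH
    0x0649: "\u064A",  # ARABIC LETTER ALEF MAKSURA -> ARABIC LETTER YEH (lossy)
    0x06A9: "\u0643",  # ARABIC LETTER KEHEH        -> ARABIC LETTER KAF
    0x0640: None,      # ARABIC TATWEEL: pure elongation, drop it
}
# Drop the Arabic short-vowel and related marks, U+064B through U+0652.
_FOLD.update({cp: None for cp in range(0x064B, 0x0653)})


def normalize_name(raw: str) -> str:
    """Return a canonical key for a species name (illustrative only)."""
    # NFKC also maps Arabic presentation forms back to the base letters.
    text = unicodedata.normalize("NFKC", raw).casefold()
    text = text.translate(_FOLD)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(text.split())


if __name__ == "__main__":
    persian_keyboard = "\u0639\u0644\u06CC"  # ends in FARSI YEH
    arabic_keyboard = "\u0639\u0644\u064A"   # ends in YEH
    print(normalize_name(persian_keyboard) == normalize_name(arabic_keyboard))
```

The idea would be to store the normalized key next to the original
spelling and run the similarity search against the key only, with
every query passed through the same function first.
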
A

--
Andrew Sullivan
ajs@crankycanuck.ca

