Re: Latin vs non-Latin words in text search parsing

From	Tom Lane
Subject	Re: Latin vs non-Latin words in text search parsing
Date	21 October 2007, 22:46:47
Msg-id	17599.1193006798@sss.pgh.pa.us
In response to	Re: Latin vs non-Latin words in text search parsing (Alvaro Herrera <alvherre@commandprompt.com>)
List	pgsql-hackers

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> ISTM that perhaps a more generally useful definition would be
>> 
>> lword        Only ASCII letters
>> nlword        Entirely letters per iswalpha(), but not lword
>> word        Entirely alphanumeric per iswalnum(), but not nlword

> ... how about

> lword        Entirely letters per iswalpha, with at least one ASCII
> nlword        Entirely letters per iswalpha
> word        Entirely alphanumeric per iswalnum, but not nlword

Hmm.  Then we have no category for "entirely ASCII", which is an
interesting category at least from the English standpoint, and I think
also in a lot of computer-oriented contexts.  I think you may be putting
too much emphasis on the "Latin" aspect of the category name, which I
find to be a bit historical.  I'm not sure if it's too late to consider
renaming the categories; if we were willing to do that I'd propose
categories "aword", "naword", "word", defined as above.
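
For illustration, here is a minimal sketch of the first set of definitions (lword / nlword / word) quoted above. It is not the parser's actual code. The function name classify_first_proposal is hypothetical, and Python's str.isalpha() and str.isalnum() are used as stand-ins for iswalpha() and iswalnum(); the Python methods follow Unicode character properties, whereas the C functions depend on the current locale, so results for a given token may differ.

```python
from typing import Optional


def classify_first_proposal(token: str) -> Optional[str]:
    """Classify a token as lword, nlword, or word (hypothetical helper).

    lword  : only ASCII letters
    nlword : entirely letters, but not lword
    word   : entirely alphanumeric, but not nlword
    Returns None for empty tokens or tokens with non-alphanumeric characters.
    """
    if not token:
        return None
    # Assumption: str.isalpha()/isalnum() approximate iswalpha()/iswalnum().
    if all(c.isascii() and c.isalpha() for c in token):
        return "lword"
    if all(c.isalpha() for c in token):
        return "nlword"
    if all(c.isalnum() for c in token):
        return "word"
    return None


if __name__ == "__main__":
    for tok in ["hello", "naïve", "привет", "abc123", "a-b"]:
        print(repr(tok), classify_first_proposal(tok))
```

Under these rules an all-ASCII token such as "hello" gets its own category, while "naïve" and "привет" both fall into nlword.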

Another thing that bothers me about your suggestion is that (at least in
some locales) iswalpha will return true for things that are neither
ASCII letters nor accented versions of them, e.g. Cyrillic letters.
So I'm not sure the surprise factor is any less with your approach
than mine: you could still get "lword" for something decidedly not
Latin-derived.
        regards, tom lane
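
The two objections in this message can be made concrete with a small sketch of the alternative definitions (lword means entirely letters with at least one ASCII letter). This is only an illustration under stated assumptions: the function name classify_alternative is hypothetical, and Python's str.isalpha() and str.isalnum() stand in for the locale-dependent iswalpha() and iswalnum().

```python
from typing import Optional


def classify_alternative(token: str) -> Optional[str]:
    """Classify a token under the alternative definitions (hypothetical helper).

    lword  : entirely letters, with at least one ASCII letter
    nlword : entirely letters (and no ASCII letter)
    word   : entirely alphanumeric, but not nlword
    Returns None for empty tokens or tokens with non-alphanumeric characters.
    """
    if not token:
        return None
    # Assumption: str.isalpha()/isalnum() approximate iswalpha()/iswalnum().
    if all(c.isalpha() for c in token):
        # Every character is a letter here, so any ASCII character
        # found in the token is necessarily an ASCII letter.
        if any(c.isascii() for c in token):
            return "lword"
        return "nlword"
    if all(c.isalnum() for c in token):
        return "word"
    return None


if __name__ == "__main__":
    mixed = "привет" + "a"  # Cyrillic letters followed by one ASCII 'a'
    for tok in ["hello", "héllo", "привет", mixed, "abc123"]:
        print(repr(tok), classify_alternative(tok))
```

Here "hello" and "héllo" both come out as lword, so no category isolates entirely-ASCII tokens, and the mostly Cyrillic mixed token is also reported as lword because it contains a single ASCII letter.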
