Re: tsearch in core patch

From Euler Taveira de Oliveira
Subject Re: tsearch in core patch
Date
Msg-id 467D4496.4000208@timbira.com
In response to Re: tsearch in core patch  (Alvaro Herrera <alvherre@commandprompt.com>)
Responses Re: tsearch in core patch  (Oleg Bartunov <oleg@sai.msu.su>)
List pgsql-hackers
Alvaro Herrera wrote:

> What I was really suggesting was having a table mapping locale names
> into "tsearch languages".  Then the configuration could be made based on
> the language, not on the locale name.  So the stopword list is for
> "russian", regardless of whether the locale is Russian_Russia or ru_RU.
> 
Agreed. But I'm afraid we couldn't map all of the locale names
correctly. Man, it's a large list. ;)
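
To make the idea concrete, here is a minimal sketch of such a mapping. It is only an illustration: the table contents, the function names, and the fallback language "simple" are my assumptions, not part of the patch, and a real table would need many more entries.

```python
# Hypothetical locale-to-language table; the entries are illustrative only.
LOCALE_TO_LANGUAGE = {
    "ru_ru": "russian",
    "russian_russia": "russian",      # Windows-style locale name
    "pt_br": "portuguese_brazil",
    "pt_pt": "portuguese",
    "es_es": "spanish",
    "es_mx": "spanish",               # one language for all Spanish locales
}


def normalize_locale(locale_name):
    """Drop the encoding and modifier parts and lowercase the rest,
    e.g. "ru_RU.UTF-8" becomes "ru_ru"."""
    base = locale_name.split(".", 1)[0].split("@", 1)[0]
    return base.strip().lower()


def tsearch_language(locale_name, default="simple"):
    """Return the language for a locale, or the assumed fallback
    when the locale is not in the table."""
    return LOCALE_TO_LANGUAGE.get(normalize_locale(locale_name), default)
```

With this, both "Russian_Russia.1251" and "ru_RU.UTF-8" resolve to "russian", and an unlisted locale falls back to the default instead of failing.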

> Is this only for the stopword list, or does it also affect selecting a
> stemmer?
> 
Both.

> Note: it's possible that the stopword list is different for Brazilian
> Portuguese than Portuguese Portuguese, which is why I was suggesting
> using a language "portuguese_brazil" and not just "portuguese".  Whereas
> you need a single stopword list for all the countries speaking Spanish,
> which is why you need only one language called spanish.
> 
Indeed it's possible for Portuguese, because we have some words that are
written in different ways, e.g.,
pt_BR     pt_PT     English
Mônica    Mónica    Monica
ação      acção     action
Irã       Irão      Iran
...

Will it be possible to disable stemming or stopword removal? I'm asking
because sometimes stemming doesn't lead to good results and/or
stopwords are relevant. Maybe these could be GUC variables
('enable_stemming' and 'enable_stopwords').
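
A rough sketch of the behaviour I have in mind, assuming the two switches are simple booleans. The function names and the toy stemmer below are hypothetical and only stand in for whatever the real dictionaries do:

```python
def toy_stem(word):
    """Stand-in stemmer for illustration: strips a trailing "s"."""
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def normalize_tokens(tokens, stopwords, stem=toy_stem,
                     enable_stemming=True, enable_stopwords=True):
    """Lowercase tokens, then optionally drop stopwords and stem.

    enable_stopwords=True means stopword removal is active;
    enable_stemming=True means the stemmer is applied.
    """
    result = []
    for token in tokens:
        word = token.lower()
        if enable_stopwords and word in stopwords:
            continue  # stopword removed
        result.append(stem(word) if enable_stemming else word)
    return result
```

Turning off 'enable_stemming' keeps the words as written, and turning off 'enable_stopwords' keeps the stopwords in the result.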


--  Euler Taveira de Oliveira http://www.timbira.com/


