Thread: tsvector stemmer issue

tsvector stemmer issue

From
Jeff Trout
Date:
ran into an interesting issue - and I’m not sure if anything can be done about it - the snowball stemmer treats
“severance” and “several” as the same, which for me is a big, big issue.

even quoting it doesn’t help.
indie=> select to_tsvector('severance several');
 to_tsvector
-------------
 'sever':1,2
(1 row)

indie=> select to_tsvector('"severance" several');
 to_tsvector
-------------
 'sever':1,2
(1 row)

the Perl library Lingua::Stem::Snowball yields the same results (as expected, since both use Snowball).

am I SOL here?

--
Jeff Trout <jeff@jefftrout.com>



Re: tsvector stemmer issue

From
Kevin Grittner
Date:
Jeff Trout <threshar@real.jefftrout.com> wrote:

> ran into an interesting issue - and I’m not sure if anything can
> be done about it - the snowball stemmer treats “severance” and
> “several” as the same, which for me is a big, big issue.

You can create a custom dictionary chain.  The only type I worked
with was thesaurus, but it was pretty easy once I read the relevant
docs.  It is only custom *parsers* that are a pain, but it doesn't
sound like you need that.
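
A minimal sketch of such a dictionary chain, for illustration only. It uses
the built-in synonym template rather than the thesaurus dictionary mentioned
above, because a synonym entry that maps a word to itself is a simple way to
stop that word from reaching the stemmer. The names stem_exceptions and
english_exceptions are hypothetical, and the sketch assumes a file
stem_exceptions.syn already exists in $SHAREDIR/tsearch_data on the server
with the single line "severance severance".

```sql
-- Assumption: $SHAREDIR/tsearch_data/stem_exceptions.syn exists and contains:
--   severance severance
-- The dictionary and configuration names below are made up for this example.

-- A synonym dictionary recognizes only the words listed in its file and
-- returns the replacement as the lexeme; all other words pass through.
CREATE TEXT SEARCH DICTIONARY stem_exceptions (
    TEMPLATE = synonym,
    SYNONYMS = stem_exceptions
);

-- Start from a copy of the stock English configuration.
CREATE TEXT SEARCH CONFIGURATION english_exceptions (
    COPY = pg_catalog.english
);

-- Consult the exception list first; words it does not recognize fall
-- through to the regular Snowball stemmer.
ALTER TEXT SEARCH CONFIGURATION english_exceptions
    ALTER MAPPING FOR asciiword
    WITH stem_exceptions, english_stem;

-- With this chain, "severance" should be kept as its own lexeme while
-- "several" is still stemmed by english_stem.
SELECT to_tsvector('english_exceptions', 'severance several');
```

Queries need to name the same configuration, for example
to_tsquery('english_exceptions', 'severance'), or it can be made the default
through default_text_search_config. Existing tsvector columns and indexes
built with the old configuration would have to be regenerated.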

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company