Re: Flexible configuration for full-text search

From Tom Lane
Subject Re: Flexible configuration for full-text search
Date
Msg-id 26006.1536679481@sss.pgh.pa.us
In response to Re: Flexible configuration for full-text search  (Aleksandr Parfenov <a.parfenov@postgrespro.ru>)
Responses Re: Flexible configuration for full-text search  (Dmitry Dolgov <9erthalion6@gmail.com>)
List pgsql-hackers
Aleksandr Parfenov <a.parfenov@postgrespro.ru> writes:
> As I wrote a few weeks ago, there is an issue with stopword processing in
> the proposed syntax for full-text configurations. I want to separate word
> normalization and stopword detection into two separate dictionaries. The
> problem is how to configure the stopword detection dictionary.

> The cause of the problem is that stopwords are counted, but no lexemes
> are produced for them. However, do we have to count stopwords during word
> counting, or can we ignore them like unknown words? The problem I see is
> backward compatibility, since we would have to regenerate all queries and
> vectors. But is it a real problem, or can we change the behavior in this
> way?

I think there should be a pretty high bar for forcing people to regenerate
all that data when they haven't made any change of their own choice.

Also, I'm not very clear on exactly what you're proposing here, but it
sounds like it'd have the effect of changing whether stopwords count in
phrase distances ('a <N> b').  I think that's right out --- whether or not
you feel the current distance behavior is ideal, asking people to *both*
rebuild all their derived data *and* change their applications will cause
a revolt.  It's not sufficiently obviously broken that we can change it.
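
To make the distance concern concrete, here is a rough sketch in Python. It
is a toy model only: the stopword set, the whitespace tokenizer, and the
function names positions() and phrase_distance() are all assumptions made for
illustration, not PostgreSQL's parser, dictionaries, or API. It compares the
case where a stopword still consumes a position with the case where it is
skipped like an unknown word, and shows how the N in 'a <N> b' would shift.

```python
# Toy model only: a tiny hypothetical stopword set and whitespace
# tokenization, not PostgreSQL's actual parser or dictionaries.
STOPWORDS = {"a", "an", "the", "on", "of"}


def positions(text, count_stopwords=True):
    """Return {lexeme: [positions]}; stopwords never appear in the output.

    With count_stopwords=True a stopword still consumes a position.
    With count_stopwords=False it is skipped entirely, the way an
    unknown word would be.
    """
    result = {}
    pos = 0
    for word in text.lower().split():
        is_stop = word in STOPWORDS
        if is_stop and not count_stopwords:
            continue  # stopword does not advance the position counter
        pos += 1
        if not is_stop:
            result.setdefault(word, []).append(pos)
    return result


def phrase_distance(text, first, second, count_stopwords=True):
    """Distance N between the first occurrences of two lexemes."""
    p = positions(text, count_stopwords)
    return p[second][0] - p[first][0]


text = "cat on the mat"
print(positions(text, True))                       # {'cat': [1], 'mat': [4]}
print(positions(text, False))                      # {'cat': [1], 'mat': [2]}
print(phrase_distance(text, "cat", "mat", True))   # 3
print(phrase_distance(text, "cat", "mat", False))  # 1
```

In this model, a stored vector built under one rule and a query built under
the other would disagree on the distance (3 versus 1), which is why both the
derived data and any application that hard-codes a distance would have to
change together.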

            regards, tom lane

