Re: Text search prefix matching and stop words

From Pavel Borisov
Subject Re: Text search prefix matching and stop words
Msg-id CALT9ZEG-i0prBw5N7pMAPqL_Kj=g_xK-oKjumE6-q0TVvOfB4A@mail.gmail.com
In response to Text search prefix matching and stop words  ("Matthew Nelson" <mnelson@binarykeep.com>)
Responses Re: Text search prefix matching and stop words  (Pavel Borisov <pashkin.elfe@gmail.com>)
Re: Text search prefix matching and stop words  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Text search prefix matching and stop words  (Artur Zakirov <zaartur@gmail.com>)
List pgsql-bugs
Prefix matching should not omit stop words, as matching lexemes may legitimately begin with stop words.

# select to_tsquery('english', 'over:*') @@ to_tsvector('english', 'overhaul');
NOTICE:  text-search query contains only stop words or doesn't contain lexemes, ignored
 ?column?
----------
 f
(1 row)

I noticed this after implementing interactive, incremental search in an application. As the user typed "overhaul," with each successive character executing a search, "ove" and "overh" matched a particular document, but "over" did not.

Many thanks for the report!

I am not sure that this is a bug. I think it is simply how the to_tsquery conversion works: stop words are filtered first, and the prefix template (:*) is processed afterwards.

If you want to run a search as each successive character is typed, you can cast the input directly to the tsquery type while the input is still unfinished:

'over:*'::tsquery;

and once the user finishes typing, process the final input via to_tsquery, which applies the stop-word filter.
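
A minimal sketch of that two-stage approach, assuming the 'english' configuration and the 'overhaul' document from the report. Note that a direct cast to tsquery performs no normalization (no lowercasing, stemming, or stop-word removal), so the application is assumed to lowercase and escape the partial input itself:

-- While the user is still typing: the cast keeps 'over' as a prefix lexeme,
-- because no stop-word dictionary is consulted.
select 'over:*'::tsquery @@ to_tsvector('english', 'overhaul');  -- t

-- Once input is complete: normal processing through the text search configuration.
select to_tsquery('english', 'overhaul') @@ to_tsvector('english', 'overhaul');  -- t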

If to_tsquery worked the way you describe, I expect it would never apply the stop-word filter to templated (prefix) input, since a prefix cannot be meaningfully compared against stop words.

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com
