Re: [GENERAL] FTS query, statistics and planner estimations…

From Francisco Olarte
Subject Re: [GENERAL] FTS query, statistics and planner estimations…
Msg-id CA+bJJbx0yviM9KE25AEyZmx6aGRXuQZVyHd2dHezSPf0JXRqrA@mail.gmail.com
In reply to Re: FTS query, statistics and planner estimations…  (Pierre Ducroquet <pierre.ducroquet@people-doc.com>)
List pgsql-general
On Wed, Nov 9, 2016 at 11:19 AM, Pierre Ducroquet
<pierre.ducroquet@people-doc.com> wrote:
> Indeed the words in the query are correlated, but I do hope that the FTS
> indexing is able to cope with that.

If the query returns correct results in reasonable time, it can. OTOH
the planner and the statistics system are another beast. Correlation
info in FTS is HUGE, and the planner is supposed to work with a
smallish summary of the index.
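
To make the point about correlation concrete, here is a toy sketch in Python. It is not PostgreSQL's estimator; it only assumes that a small summary keeps one frequency per word and that an AND of two words is estimated as if the words were independent. The corpus and the function names are made up for illustration.

```python
# Toy corpus (hypothetical): "new" and "york" always appear together.
docs = [
    {"new", "york", "city"},
    {"new", "york", "times"},
    {"new", "york"},
    {"postgres", "planner"},
    {"postgres", "index"},
    {"full", "text", "search"},
    {"index", "search"},
    {"planner", "statistics"},
]


def frequency(word):
    """Fraction of documents containing the word (one number per word)."""
    return sum(word in d for d in docs) / len(docs)


def estimated_and(w1, w2):
    """Estimate for 'w1 AND w2', assuming the two words are independent."""
    return frequency(w1) * frequency(w2)


def actual_and(w1, w2):
    """Real fraction of documents containing both words."""
    return sum(w1 in d and w2 in d for d in docs) / len(docs)


est = estimated_and("new", "york")
act = actual_and("new", "york")
print(f"estimated rows: {est * len(docs):.2f}, actual rows: {act * len(docs):.0f}")
```

With per-word numbers only, the estimate for the correlated pair comes out well below the real count, and nothing in the summary can tell the estimator that the two words travel together.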

> Otherwise it makes it far less usable than
> what one would expect since real world queries will often contain sentences or
> related words.

Well, I concur it would be great to have it, but having written FTS
engines I suspect it would be difficult to have it AND maintain it. I
have built an FTS system whose index was a list of
(stemmed-word, document, position) entries, which I then compressed. The
information for word-word correlation would be huge, as its
cardinality could grow with n^2 in the number of distinct words,
especially if you have to keep it in an updatable format. And it would
not help you for the three-word, four-word, etc. cases. And even then,
the optimizer might spend a lot of time reading and processing it, as it
would not fit easily in the cache.
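
A minimal sketch of that kind of index, in Python, with the compression step left out. The names `build_index` and `cooccurring_pairs` are hypothetical, tokens are split on whitespace and treated as already stemmed, and the pair counter only illustrates why pairwise correlation data grows so fast: n distinct words allow up to n(n-1)/2 pairs.

```python
from collections import defaultdict
from itertools import combinations


def build_index(docs):
    """Map word -> {doc_id: [positions]}, i.e. (word, document, position) entries."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(text.lower().split()):
            index[word][doc_id].append(pos)
    return index


def cooccurring_pairs(docs):
    """Set of unordered word pairs that share at least one document."""
    pairs = set()
    for text in docs:
        words = sorted(set(text.lower().split()))
        pairs.update(combinations(words, 2))
    return pairs


docs = ["the quick fox", "the lazy dog"]
index = build_index(docs)
pairs = cooccurring_pairs(docs)
vocabulary = len(index)
max_pairs = vocabulary * (vocabulary - 1) // 2

print(f"{vocabulary} distinct words, {len(pairs)} pairs seen, {max_pairs} pairs possible")
```

Even this tiny corpus already tracks more pairs than words, and triples or longer combinations are not covered at all.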

> Also, PostgreSQL 9.6 introduced phrase search in FTS, and I
> don't see how that would work without a working multi-words query.

Queries work; it is just that they are not as fast as you want/expect
them to be. Phrase search is normally done by locating documents with
all the words and then filtering, either using just the index, if it
includes word positions, or by reading the docs. In general, in FTS, you
need to use selective terms for fast queries.
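
The two steps of phrase search described above can be sketched in Python as follows. This is an illustration over a small in-memory positional index, not how PostgreSQL implements it; the function names and the sample documents are hypothetical.

```python
def build_index(docs):
    """Map word -> {doc_id: [positions]} for whitespace-separated tokens."""
    index = {}
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(text.lower().split()):
            index.setdefault(word, {}).setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return ids of documents containing the words of phrase consecutively."""
    words = phrase.lower().split()
    if not words or any(w not in index for w in words):
        return []

    # Step 1: locate documents that contain all the words.
    candidates = set(index[words[0]])
    for w in words[1:]:
        candidates &= set(index[w])

    # Step 2: filter the candidates using the stored word positions.
    hits = []
    for doc_id in sorted(candidates):
        for start in index[words[0]][doc_id]:
            if all(start + i in index[w][doc_id] for i, w in enumerate(words)):
                hits.append(doc_id)
                break
    return hits


docs = ["new york city", "york is new", "a new day in york"]
index = build_index(docs)
print(phrase_search(index, "new york"))
```

All three sample documents survive step 1, so the position filter does all the work here. When the words are common, step 1 returns many candidates, which is why selective terms matter for speed.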

Francisco Olarte.

