Re: Use of multi-column gin index

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: Use of multi-column gin index
Дата	3 июня 2019 г. 22:36:06
Msg-id	24354.1559601366@sss.pgh.pa.us обсуждение
Ответ на	Use of multi-column gin index (Jared Rulison <jared@affinity.co>)
Список	pgsql-hackers

Дерево обсуждения

Jared Rulison <jared@affinity.co> writes:
> Hello, we have some confusion over the planner's use of an index.
> ...
> 1. How is postgres able to use parades_city_id_description_tsv_index in the
> first explain result without any filter on "city_id"?

GIN indexes don't have any particular bias towards earlier or later
columns (unlike btrees).  So this isn't any harder than if you'd
put the index columns in the other order.

> 2. Why does the planner in the first query decide not to simply use
> parades_city_id_description_tsv_index (as in the second explain result)
> when the cardinality of the set of "city_id"s is high?

[ shrug... ]  It thinks it's cheaper.  Whether it's correct is impossible
to say from the given data, but there is a moderately complex cost model
in there.  The comments for gincost_scalararrayopexpr note

 * A ScalarArrayOpExpr will give rise to N separate indexscans at runtime,
 * each of which involves one value from the RHS array, plus all the
 * non-array quals (if any).

I haven't checked the actual execution code, but this seems to be saying
that the GIN indexscan executor always does ANDs before ORs.  That means
that doing everything in the same GIN indexscan would require executing
the to_tsvector part 50 times, so I can definitely believe that shoving
the IN part to a different index and AND'ing afterwards is a better idea.
(Whether the GIN executor should be made smarter to avoid that is a
separate question ;-))

            regards, tom lane

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Use of multi-column gin index