Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters

From	Cosmin Prund
Subject	Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters
Date	16 January 2020, 18:15:09
Msg-id	CAGU4dz-jq678=qccivjxJsrfG+jb0H-tLuHnPoJ=6BszDKJikQ@mail.gmail.com
In reply to	Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters (Michael Lewis <mlewis@entrata.com>)
Responses	Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
List	pgsql-performance

Hello Michael, and hello again Tom. Sorry for mailing you directly: I just hit Reply in Gmail, expecting the emails to carry a Reply-To header pointing at the pgsql list. Apparently they do not.

Running the same query with a different "Ver" produces a proper plan. Here's a non-redacted example (Ver=91):

EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from "LucrareBugetDate" where ("LucrareBugetVersiuneId" = 91) and ("LucrareBugetDateId" in (10,11));
                                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on "LucrareBugetDate" (cost=0.56..4.95 rows=2 width=13) (actual time=3.617..3.631 rows=2 loops=1)
   Index Cond: (("LucrareBugetVersiuneId" = 91) AND ("LucrareBugetDateId" = ANY ('{10,11}'::integer[])))
   Buffers: shared hit=9 read=3
 Planning time: 0.223 ms
 Execution time: 3.663 ms
(5 rows)

I have reindexed everything, not just this index.

"reltuples" for this table is 41712436.
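For reference, this figure can be read from the pg_class catalog. A minimal sketch, assuming the table is visible on the current search_path (the schema is not stated in this thread):

```sql
-- reltuples is the planner's row-count estimate, refreshed by VACUUM/ANALYZE;
-- it is not an exact count.
SELECT relname,
       reltuples::bigint AS estimated_rows,
       relpages
FROM pg_class
WHERE oid = '"LucrareBugetDate"'::regclass;
```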

> I'd be curious if the fraction in the MCVs frequency list in stats indicates that rows with Ver = 92 are rare and therefore the index on only Ver column is sufficient to find the rows quickly.

There are 25 valid values for "Ver" in this database. I ran the query for all of them, and the only one misbehaving is "92". I also ran the query with random (invalid) values for Ver; the plan always uses the index with both columns in the index condition.

I looked into "most_common_vals" in pg_stats; this value (92) is not in that list.
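A check like this can be sketched as a query against the pg_stats view. It assumes "Ver" stands for the "LucrareBugetVersiuneId" column, as in the query above, and that only one schema contains a table with this name:

```sql
-- most_common_vals and most_common_freqs are parallel arrays:
-- the i-th frequency belongs to the i-th value.
SELECT attname,
       n_distinct,
       most_common_vals,
       most_common_freqs
FROM pg_stats
WHERE tablename = 'LucrareBugetDate'
  AND attname   = 'LucrareBugetVersiuneId';
```

A value missing from most_common_vals is estimated from the remaining, non-MCV fraction of the table rather than from its own measured frequency.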

Finally, I ran "ANALYZE" again and the problem went away: running the query with Ver=92 now uses the proper plan. I'm not happy with this, because I know I haven't solved the problem (I've run ANALYZE multiple times before).
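To make this step repeatable, the sequence can be sketched as below. The Ver=92 query is assumed to have the same shape as the Ver=91 example above; the redacted original is not reproduced here, so the id list (10,11) is only a placeholder.

```sql
-- Refresh the planner statistics for this table only.
ANALYZE VERBOSE "LucrareBugetDate";

-- Confirm when statistics were last gathered, manually or by autovacuum.
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'LucrareBugetDate';

-- Re-check the plan for the problematic value (placeholder id list).
EXPLAIN (ANALYZE, BUFFERS)
SELECT "IdRand", "IdColoana", "Valoare"
FROM "LucrareBugetDate"
WHERE "LucrareBugetVersiuneId" = 92
  AND "LucrareBugetDateId" IN (10, 11);
```

Because ANALYZE works from a random sample of the table, successive runs can produce slightly different statistics, which is one possible reason a plan changes from one run to the next.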

On Thu, 16 Jan 2020 at 19:00, Michael Lewis <mlewis@entrata.com> wrote:

> Does the behavior change with different values of Ver column? I'd be curious if the fraction in the MCVs frequency list in stats indicates that rows with Ver = 92 are rare and therefore the index on only Ver column is sufficient to find the rows quickly. What is reltuples for this table by the way?

> I also wonder if the situation may be helped by re-indexing the "index on both columns" to remove any chance of issues on bloat in the index. Which order are the columns by the way? If Ver is first, is there also an index on only id column? Since you aren't on v12, you don't get to re-index concurrently but I assume you know the workaround of create concurrently (different name), drop concurrently (old one), and finally rename new index.
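The create/drop/rename workaround mentioned in the quoted message can be sketched as follows. This is only an illustration: the column order is inferred from the index name, the temporary name "IX_LucrareBugetDate_tmp" is hypothetical, and the index is assumed to be a plain (non-unique) index that does not back a constraint.

```sql
-- Each CONCURRENTLY statement must run outside a transaction block.

-- 1. Build a fresh copy of the index under a temporary (hypothetical) name.
CREATE INDEX CONCURRENTLY "IX_LucrareBugetDate_tmp"
    ON "LucrareBugetDate" ("LucrareBugetVersiuneId", "LucrareBugetDateId");

-- 2. Drop the old, possibly bloated index without blocking writers.
DROP INDEX CONCURRENTLY "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId";

-- 3. Give the new index the original name.
ALTER INDEX "IX_LucrareBugetDate_tmp"
    RENAME TO "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId";
```

If CREATE INDEX CONCURRENTLY fails partway, it leaves an invalid index behind, which has to be dropped before retrying.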
