Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters

Поиск
Список
Период
Сортировка
От Cosmin Prund
Тема Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters
Дата
Msg-id CAGU4dz-jq678=qccivjxJsrfG+jb0H-tLuHnPoJ=6BszDKJikQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters  (Michael Lewis <mlewis@entrata.com>)
Ответы Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters  (Michael Lewis <mlewis@entrata.com>)
Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
Hello Michael and hello again Tom, sorry for mailing you directly. I just hit Reply in gmail - I expected the emails to have a reply-to=Pgsql. Apparently they do not.

Running the same query with a different "Ver" produces a proper plan. Here's a non-redacted example (Ver=91):

EXPLAIN (ANALYZE, BUFFERS) select "IdRand", "IdColoana", "Valoare" from "LucrareBugetDate" where ("LucrareBugetVersiuneId" = 91) and ("LucrareBugetDateId" in (10,11));
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using "IX_LucrareBugetDate_LucrareBugetVersiuneId_LucrareBugetDateId" on "LucrareBugetDate"  (cost=0.56..4.95 rows=2 width=13) (actual time=3.617..3.631 rows=2 loops=1)
   Index Cond: (("LucrareBugetVersiuneId" = 91) AND ("LucrareBugetDateId" = ANY ('{10,11}'::integer[])))
   Buffers: shared hit=9 read=3
 Planning time: 0.223 ms
 Execution time: 3.663 ms
(5 rows)

I have reindex everything, not just this INDEX.

"reltuples" for this table is 41712436.

> I'd be curious of the fraction in the MCVs frequency list in stats indicates that rows with Ver = 92 are rare and therefore the index on only Ver column is sufficient to find the rows quickly.

There are 25 valid values for "Ver" in this database. I ran the query for all of them. The only one miss-behaving is "92". I ran the query with random values for Ver (invalid values), the query plan always attempts to use the index using both values.
I looked into "most_common_values" in pg_stats, this value (92) is not in that list.
Finally I ran "ANALYZE" again and now the problem went away. Running the query with Ver=92 uses the proper plan. I'm not happy with this - I know I haven't solved the problem (I've ran ANALYZE multiple times before).


On Thu, 16 Jan 2020 at 19:00, Michael Lewis <mlewis@entrata.com> wrote:
Does the behavior change with different values of Ver column? I'd be curious of the fraction in the MCVs frequency list in stats indicates that rows with Ver = 92 are rare and therefore the index on only Ver column is sufficient to find the rows quickly. What is reltuples for this table by the way?

I also wonder if the situation may be helped by re-indexing the "index on both columns" to remove any chance of issues on bloat in the index. Which order are the columns by the way? If Ver is first, is there also an index on only id column?. Since you aren't on v12, you don't get to re-index concurrently but I assume you know the work around of create concurrently (different name), drop concurrently (old one), and finally rename new index.

В списке pgsql-performance по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Bad query plan decision when using multiple column index - postgresql uses only first column then filters
Следующее
От: Laurenz Albe
Дата:
Сообщение: Re: Bad query plan decision when using multiple column index -postgresql uses only first column then filters