Re: autoanalyze creates bad plan, manual analyze fixes it?

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: autoanalyze creates bad plan, manual analyze fixes it?
Дата
Msg-id 20200914234132.GF18552@telsasoft.com
обсуждение исходный текст
Ответ на autoanalyze creates bad plan, manual analyze fixes it?  (Robert Treat <rob@xzilla.net>)
Ответы Re: autoanalyze creates bad plan, manual analyze fixes it?  (Robert Treat <rob@xzilla.net>)
Список pgsql-performance
On Mon, Sep 14, 2020 at 07:11:12PM -0400, Robert Treat wrote:
> Howdy folks,
> 
> Recently i've run into a problem where autoanalyze is causing a query
> plan to flip over to using an index which is about 10x slower, and the
> problem is fixed by running an alayze manually. some relevant info:

I think it's because 1) the costs and scan rowcounts are similar ; and, 2) the
stats are probably near some threshold which causes the plan to change.  I'm
guessing if you run a manual ANALYZE 100 times, you'll sometimes get the bad
plan.  Maybe depending on the data visible at the time analyze is invoked.

> UPDATE sleeping_intents SET
> raptor_after='2020-09-14T19:21:03.581106'::timestamp,
> status='requires_capture',
> updated_at='2020-09-14T16:21:03.581104+00:00'::timestamptz WHERE
> sleeping_intents.id = 'r2d2dcc0-8a44-4d19-c3p0-28522233b836'::uuid AND
> sleeping_intents.status = 'finna' RETURNING *;

Do you mean status='init' ??

> The plan generated after autoanalyze is:
> 
> Update on sleeping_intents  (cost=0.70..7.11 rows=1 width=272) (actual time=57.945..57.945 rows=0 loops=1)
>    Buffers: shared hit=43942
>    ->  Index Scan using sleeping_intents_status_created_at_raptor_after_idx on sleeping_intents  (cost=0.70..7.11
rows=1width=272) (actual time=57.943..57.943 rows=0 loops=1)
 
>          Index Cond: (status = 'init'::text)
>          Filter: (id = 'r2d2dcc0-8a44-4d19-c3p0-28522233b836'::uuid)
>          Rows Removed by Filter: 1262
>          Buffers: shared hit=43942
>  Planning time: 0.145 ms
>  Execution time: 57.981 ms
> 
> after i run analyze manually, the query plan is changed to this:
> 
> Update on sleeping_intents  (cost=0.57..8.59 rows=1 width=272) (actual time=0.023..0.023 rows=0 loops=1)
>    Buffers: shared hit=7
>    ->  Index Scan using sleeping_intents_pkey on sleeping_intents (cost=0.57..8.59 rows=1 width=272) (actual
time=0.022..0.022rows=0 loops=1)
 
>          Index Cond: (id = 'r2d2dcc0-8a44-4d19-c3p0-28522233b836'::uuid)
>          Filter: (status = 'init'::text)
>          Rows Removed by Filter: 1
>          Buffers: shared hit=7
>  Planning time: 0.092 ms
>  Execution time: 0.066 ms
> 
> Note that in the second query, it switches back to using the primary
> key index, which does seem like the logically better choice, even
> though it shows a higher estimated cost than the "bad" case
> (understanding the cost must change somewhere in the process, but
> there no way to see it afaict).

If you SET enable_indexscan=off you can try to get an bitmap index scan, which
will reveal how much of the cost is attributed to the index component and how
much to the heap.  That might help to refine costs, which may help.

> Statistics after manual analyze:
>        tablename    |    attname    | null_frac | avg_width |
> n_distinct | correlation |                   most_common_freqs
>
-----------------+---------------+-----------+-----------+------------+-------------+--------------------------------------------------------
>  sleeping_intents | id            |         0 |        16 |         -1
> | -0.00133045 | [null]
>  sleeping_intents | status        |         0 |         9 |          6
> |    0.848468 | {0.918343,0.0543667,0.0267567,0.000513333,1e-05,1e-05}
>  sleeping_intents | created_at    |         0 |         8 |         -1
> |    0.993599 | [null]
>  sleeping_intents | raptor_after | 0.0663433 |         8 |  -0.933657
> |     0.99392 | [null]
> 
> In a previous go around with this table, I also increased the
> statistics target for the id column to 1000, vs 100 which is the
> database default.

What about status ?
I wonder if sometimes the sample doesn't include *any* rows for the 1e-5
statuses.  So the planner would estimate the rowcount based on ndistinct and
the other frequencies.  But if you rerun analyze, then it thinks it'll get one
row based on the sampled frequency of status.  

What postgres version, and what non-default settings ?
Maybe you can run explain(settings,...).

-- 
Justin



В списке pgsql-performance по дате отправления:

Предыдущее
От: Robert Treat
Дата:
Сообщение: autoanalyze creates bad plan, manual analyze fixes it?
Следующее
От: Pavel Stehule
Дата:
Сообщение: Re: autoanalyze creates bad plan, manual analyze fixes it?