Re: Plans with bad estimates/paths

From Joe Wildish
Subject Re: Plans with bad estimates/paths
Date
Msg-id 56fa1c0c-feb5-441f-af48-792c4164ad06@www.fastmail.com
In reply to Re: Plans with bad estimates/paths  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-general
Hi Tomas,

On Tue, 16 Nov 2021, at 22:03, Tomas Vondra wrote:

> It sure smells like a case of correlated data for some clients but not
> others, but who knows ... Hard to say without seeing the nodes below the
> join. If the lower nodes are estimated accurately, then it's the join
> selectivity that is estimated poorly, and there's not much we can do
> about it :-(

Here is a link to a segment of a plan where the estimations degrade.
I've also pasted the plan segment at the end of this message.


https://gist.githubusercontent.com/joewildish/8eb66815d728399687b24647df941746/raw/ef8c62455dd34d76807feff5d164e6d8857014e2/gistfile1.txt

The nodes beneath the Merge Join seem to be estimated accurately (within 2x of the actual)? But the Merge Join itself is a 10x under-estimate, and the ones above that are even further out. You can see in the plan that this particular execution is for multiple clients (144, 145, 146 & 155). In my experiments, I am getting better results with a single client, although I don't know if that is down to the actual data size being smaller, or if the estimation for multiple values is inherently more inaccurate. Anyway, is this an example of a join selectivity problem?

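(For what it's worth, the single- vs multi-client comparison I've been running is roughly the following; the query shape is simplified and count(*) is just a stand-in for the real select list:)

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM order_items oi
JOIN clients c ON c.id = oi.client_id
WHERE c.id = 144;                                   -- single client

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM order_items oi
JOIN clients c ON c.id = oi.client_id
WHERE c.id = ANY ('{144,145,146,155}'::integer[]);  -- multiple clients
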
> Do the "good" plans have the same underestimate? Maybe there's just much
> less data for those clients, and the "poor" plan ends up being fast anyway?

I think that may be happening but haven't been able to capture a plan yet that confirms it.
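One thing I may try is auto_explain, to catch the plan of one of these queries when it goes slow in production; something along these lines, with the threshold purely illustrative:

-- Load per-session (superuser); in production it would normally go in
-- session_preload_libraries instead:
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '30s';  -- log the plan of anything slower than 30s
SET auto_explain.log_analyze = on;          -- include actual rows/timings (adds overhead)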

>> I am assuming this underestimation is the source of the planner
>> choosing the "wrong" path; in production, we have had to resort to
>> setting the join and from collapse limits to 1 to force a naive plan
>> to be generated.  This is giving us execution times in the 10/20
>> second range vs. >45m in some cases.
>
> That may be happening, yes. So is it the join order that ends up being
> wrong, or the join methods?

I've seen both. For example, in the worst-performing plans, the root node has its first input estimated to produce 1 row and its second input estimated to produce c. 40,000 rows. The second input is a SeqScan, presumably because of the single-row estimate of its sibling. Of course, the estimate of 1 turns out to be wildly inaccurate, the join produces c. 2BN rows, and most are then filtered out.

In other (better) plans, the troublesome SeqScan doesn't exist: the relation in question gets joined lower down the tree, and it is not traversed by a SeqScan.

> Have you tried increasing the collapse limit
> instead? Although, if it works for some queries but not others, that's
> likely not going to help.

Yes, but it often creates poorer plans rather than better plans. In fact, I increase those limits locally when testing, to get the planner to consistently produce what it thinks is the best plan. (I find that without this, sub-paths can be materialised, presumably because one of the collapse limits has been hit. Obviously I'd rather remove this particular variability when trying to debug.) The settings in question are shown below.

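For reference (the value of 1 is what we use in production to force the naive plan; the higher values are what I use locally and are illustrative rather than recommendations):

-- Force the planner to keep the join order as written in the query (the "naive" plan):
SET join_collapse_limit = 1;
SET from_collapse_limit = 1;

-- Locally, raise them above the number of relations in the query so the
-- planner always searches the full join-order space:
SET join_collapse_limit = 32;
SET from_collapse_limit = 32;
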
> The first thing I'd do is reduce the query size as much as possible. In
> this case I'd try removing as many joins as possible until the issue
> disappears. The simpler the query, the easier it is to investigate.
>
> And yes, replacing parts of a query with a temporary table is a common
> solution, because it's possible to collect statistics on it, build
> indexes etc. That usually solves estimation issues in multi-tenancy.
> Sometimes even a CTE with materialization is enough.

Thank you. It seems, then, that the solution lies in simplifying the queries such that the chances of poor estimation are reduced or removed. (I have had some success with this today: one of the queries was bringing in a view which resulted in needless self-joins.) However, such a solution raises the question: which bits of the query should be pre-computed? And will such work survive further changes in the underlying data distributions?

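(The pre-computation I've started experimenting with is along these lines; tmp_client_items is a made-up name and the column list is simplified, but order_items and its keys are as in the plan below:)

-- Pre-compute the per-client slice, then give the planner real statistics for it:
CREATE TEMP TABLE tmp_client_items AS
SELECT client_id, order_item_id
FROM order_items
WHERE client_id = ANY ('{144,145,146,155}'::integer[]);

ANALYZE tmp_client_items;
CREATE INDEX ON tmp_client_items (client_id, order_item_id);

-- Sometimes forcing materialisation of the sub-select is enough on its own:
WITH client_items AS MATERIALIZED (
    SELECT client_id, order_item_id
    FROM order_items
    WHERE client_id = ANY ('{144,145,146,155}'::integer[])
)
SELECT count(*) FROM client_items;
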
Thanks,
-Joe

Plan segment:

Nested Loop Left Join  (cost=2.29..348095.52 rows=3 width=93) (actual time=828.619..3368.362 rows=517367 loops=1)
  ->  Nested Loop  (cost=1.86..348094.00 rows=3 width=81) (actual time=828.609..2655.136 rows=517367 loops=1)
        Join Filter: (clients.id = order_items.client_id)
        ->  Nested Loop  (cost=1.43..347875.73 rows=400 width=60) (actual time=828.603..1890.900 rows=517367 loops=1)
              Join Filter: (clients.id = order_items_1.client_id)
              ->  Merge Join  (cost=1.00..322712.24 rows=50370 width=48) (actual time=828.584..1224.993 rows=517367 loops=1)
                    Merge Cond: (order_items_2.client_id = clients.id)
                    ->  GroupAggregate  (cost=0.85..290718.67 rows=2518498 width=44) (actual time=0.040..1126.298 rows=1856351 loops=1)
                          Group Key: order_items_2.client_id, order_items_2.order_item_id
                          ->  Merge Left Join  (cost=0.85..240348.71 rows=2518498 width=18) (actual time=0.033..535.466 rows=1858076 loops=1)
                                Merge Cond: ((order_items_2.client_id = discount_allocations.client_id) AND (order_items_2.order_item_id = discount_allocations.order_item_id))
                                ->  Index Only Scan using pk_order_items on order_items order_items_2  (cost=0.43..131145.90 rows=2518498 width=12) (actual time=0.022..150.068 rows=1856352 loops=1)
                                      Heap Fetches: 0
                                ->  Index Scan using pk_discount_allocations on discount_allocations  (cost=0.42..89823.14 rows=931889 width=18) (actual time=0.008..110.223 rows=677531 loops=1)
                    ->  Index Only Scan using pk_clients on clients  (cost=0.14..8.64 rows=4 width=4) (actual time=0.003..0.011 rows=4 loops=1)
                          Index Cond: (id = ANY ('{144,145,146,155}'::integer[]))
                          Heap Fetches: 0
              ->  Index Only Scan using pk_order_items on order_items order_items_1  (cost=0.43..0.49 rows=1 width=12) (actual time=0.001..0.001 rows=1 loops=517367)
                    Index Cond: ((client_id = order_items_2.client_id) AND (order_item_id = order_items_2.order_item_id))
                    Heap Fetches: 0
        ->  Index Scan using pk_order_items on order_items  (cost=0.43..0.53 rows=1 width=29) (actual time=0.001..0.001 rows=1 loops=517367)
              Index Cond: ((client_id = order_items_1.client_id) AND (order_item_id = order_items_1.order_item_id))
  ->  Index Scan using pk_order_item_variants on order_item_variants  (cost=0.43..0.51 rows=1 width=20) (actual time=0.001..0.001 rows=1 loops=517367)
        Index Cond: ((client_id = order_items_1.client_id) AND (order_item_id = order_items_1.order_item_id))


