Plans with bad estimates/paths

From: Joe Wildish
Subject: Plans with bad estimates/paths
Date:
Msg-id: a0ea1dc3-bb13-4a77-a462-17f5d99185b0@www.fastmail.com
Replies: Re: Plans with bad estimates/paths
List: pgsql-general
Hello,

I am struggling with a problem that appears to be planning-related. I'm hoping folk here may be able to advise on how best to tackle the issue.
 

We have a system that ingests JSON messages containing order data. The data from these messages is inserted into a normalised table structure: "order", "order_item", "order_discount", etc. Orders can have multiple items, items can have multiple discounts, and so forth. Referential constraints exist between the tables in the expected way. Fields in the tables are all declared NOT NULL. If the schema of the JSON is such that a field is optional, we reflect that optionality in another "order_<X>" table, and make "order_<X>" be a subset of "order". Moreover, we ingest messages for different clients/customers; therefore each message-related table carries with it a client identifier, which forms part of the primary key on the table. For example, "order" has a key of "(client_id, order_id)".
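
For concreteness, the shape of the schema is roughly as follows (a minimal sketch; names, types and columns here are illustrative rather than our actual DDL):

  -- Core table; the client identifier forms part of every key.
  CREATE TABLE "order" (
      client_id integer NOT NULL,
      order_id  text    NOT NULL,
      channel   text    NOT NULL,
      PRIMARY KEY (client_id, order_id)
  );

  CREATE TABLE order_item (
      client_id integer NOT NULL,
      order_id  text    NOT NULL,
      item_id   text    NOT NULL,
      quantity  integer NOT NULL,
      PRIMARY KEY (client_id, order_id, item_id),
      FOREIGN KEY (client_id, order_id) REFERENCES "order"
  );

  -- Items can have multiple discounts.
  CREATE TABLE order_discount (
      client_id   integer NOT NULL,
      order_id    text    NOT NULL,
      item_id     text    NOT NULL,
      discount_id text    NOT NULL,
      amount      numeric NOT NULL,
      PRIMARY KEY (client_id, order_id, item_id, discount_id),
      FOREIGN KEY (client_id, order_id, item_id) REFERENCES order_item
  );

  -- An optional JSON field is reflected as a subset of "order".
  CREATE TABLE order_shipping_note (
      client_id integer NOT NULL,
      order_id  text    NOT NULL,
      note      text    NOT NULL,
      PRIMARY KEY (client_id, order_id),
      FOREIGN KEY (client_id, order_id) REFERENCES "order"
  );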
 

We have written transformations that calculate various facts about orders. For example, one of the transforms emits order item data where we have calculated the overall discount for the item, have "filled in" some of the optional fields with defaults, and have categorised the order item on the basis of some aspect of the order (e.g. "this is an e-commerce order", "this is a retail order"). These transforms are typically per client (e.g. `WHERE client_id = 123`), although in some cases we transform over multiple clients (e.g. `WHERE client_id = ANY (SELECT client_id FROM clients WHERE ...)`).
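
To give a flavour, a simplified sketch of such a transform over the tables sketched above (again, purely illustrative):

  SELECT i.client_id,
         i.order_id,
         i.item_id,
         COALESCE(n.note, '')       AS shipping_note,  -- optional field filled with a default
         COALESCE(SUM(d.amount), 0) AS total_discount, -- overall discount for the item
         CASE WHEN o.channel = 'web'
              THEN 'e-commerce'
              ELSE 'retail'
         END                        AS category        -- categorisation of the item
  FROM   order_item i
  JOIN   "order" o                  USING (client_id, order_id)
  LEFT   JOIN order_shipping_note n USING (client_id, order_id)
  LEFT   JOIN order_discount d      USING (client_id, order_id, item_id)
  WHERE  i.client_id = 123
  GROUP  BY i.client_id, i.order_id, i.item_id, n.note, o.channel;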
 

The issue is that for some clients, or combinations of clients, the planner is choosing a path that takes substantially longer to evaluate than the plan it typically chooses for other clients. The number of tables being joined is in the region of 15. There is an extended statistics object in place to help the one aggregation that occurs (defined on the basis of the `GROUP BY` columns) to try and get a better estimate of the likely number of rows emitted. However, what I am often seeing in the explain plan is that the estimated row count is small and the actuals are significantly larger, e.g.
 

  Merge Join  (cost=1.14..253250.32 rows=1099 width=69) (actual time=1268.587..2400.353 rows=4282355 loops=1)
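
For reference, the statistics object is something along these lines (an ndistinct statistic over the grouping columns; names illustrative):

  CREATE STATISTICS order_item_group_stats (ndistinct)
      ON client_id, order_id, item_id
      FROM order_item;

  ANALYZE order_item;  -- populates the extended statistics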

I am assuming this underestimation is the source of the planner choosing the "wrong" path; in production, we have had to resort to setting the join and from collapse limits to 1 to force a naive plan to be generated. This is giving us execution times in the 10-20 second range vs. >45 minutes in some cases.
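
Concretely, the workaround amounts to:

  SET join_collapse_limit = 1;  -- follow the syntactic join order
  SET from_collapse_limit = 1;  -- do not flatten subqueries into the outer query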
 

(a) Do you have any suggestions on a general approach to tackling the problem? For example, one option might be to pre-compute some of the subqueries that occur in the transforms, write the results into their own tables, and substitute those tables in place of the subqueries in the main transform. Is this something people typically do in this situation?
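
For example, something like this, supposing the per-item discount aggregate were one of the pre-computed subqueries (hypothetical table name):

  CREATE TABLE item_discount_totals AS
  SELECT client_id, order_id, item_id, SUM(amount) AS total_discount
  FROM   order_discount
  GROUP  BY client_id, order_id, item_id;

  ANALYZE item_discount_totals;  -- so the planner has real statistics for it

The main transform would then join item_discount_totals rather than evaluating the aggregate inline.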

(b) Do I need to provide a schema and explain plans to get any concrete advice on how to proceed?

Any advice/suggestions would be much appreciated.

Thanks,
-Joe


