Re: BUG #16053: Query planner performance regression in sql querywith multiple CTE in v12
От | Federico |
---|---|
Тема | Re: BUG #16053: Query planner performance regression in sql querywith multiple CTE in v12 |
Дата | |
Msg-id | CAN19dyfAuC+86gKFemQmBrAP0GO9g8VSkk66VrBVnVZnuWGv+w@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #16053: Query planner performance regression in sql query with multiple CTE in v12 (Andrew Gierth <andrew@tao11.riddles.org.uk>) |
Список | pgsql-bugs |
Sorry for the late reply and for not including the query in the first message.
Is there a way to avoid the mis-estimate other than changing the query?
Following is the query
--explain analyse
WITH cte_A AS
(SELECT inv_y.cid AS cid,
inv_y.pid AS pid,
sum(inv_y.value) AS value
FROM inv_y
WHERE inv_y.y = 1
GROUP BY inv_y.cid,
inv_y.pid),
cte_B AS
(SELECT inv_y.cid AS cid,
inv_y.pid AS pid
FROM inv_y
WHERE inv_y.y = 1),
cte_C AS
(SELECT sum(inv.value) AS value,
inv.cid AS cid,
inv.pid AS pid
FROM inv
JOIN cte_B ON cte_B.cid = inv.cid
AND cte_B.pid = inv.pid
WHERE inv.invd >= '2017-01-01'
AND inv.invd <= '2018-12-31'
GROUP BY inv.cid,
inv.pid),
cte_D AS
(SELECT cte_C.cid AS cid,
cte_C.pid AS pid,
sum(cte_C.value) AS value
FROM cte_C
GROUP BY cte_C.cid,
cte_C.pid),
cte_E AS
(SELECT sum(inv.value) AS value,
inv.cid AS cid,
inv.pid AS pid
FROM inv
JOIN cte_B ON cte_B.cid = inv.cid
AND cte_B.pid = inv.pid
WHERE inv.invd >= '2018-01-01'
AND inv.invd <= '2018-06-30'
GROUP BY inv.cid,
inv.pid),
cte_F AS
(SELECT cte_B.cid AS cid,
cte_B.pid AS pid,
coalesce(sum(cte_E.value), 0) AS value
FROM cte_B
LEFT OUTER JOIN cte_E ON cte_E.cid = cte_B.cid
AND cte_E.pid = cte_B.pid
GROUP BY cte_B.cid,
cte_B.pid),
cte_G AS
(SELECT sum(expe.value) AS value,
expe.cid AS cid,
expe.pid AS pid
FROM expe
JOIN cte_B ON cte_B.cid = expe.cid
AND cte_B.pid = expe.pid
WHERE expe.invd >= '2018-07-01'
AND expe.invd <= '2018-12-31'
GROUP BY expe.cid,
expe.pid),
cte_H AS
(SELECT cte_B.cid AS cid,
cte_B.pid AS pid,
coalesce(sum(cte_G.value), 0) AS value
FROM cte_B
LEFT OUTER JOIN cte_G ON cte_G.cid = cte_B.cid
AND cte_G.pid = cte_B.pid
GROUP BY cte_B.cid,
cte_B.pid)
SELECT cte_D.value AS ref_value,
cte_D.value - (cte_F.value + cte_H.value) AS rem_value
FROM cte_A
JOIN cte_D ON cte_A.cid = cte_D.cid
AND cte_A.pid = cte_D.pid
JOIN cte_F ON cte_A.cid = cte_F.cid
AND cte_A.pid = cte_F.pid
JOIN cte_H ON cte_A.cid = cte_H.cid
AND cte_A.pid = cte_H.pid
--join cte_B ON cte_A.cid = cte_B.cid
--AND cte_A.pid = cte_B.pid
WITH cte_A AS
(SELECT inv_y.cid AS cid,
inv_y.pid AS pid,
sum(inv_y.value) AS value
FROM inv_y
WHERE inv_y.y = 1
GROUP BY inv_y.cid,
inv_y.pid),
cte_B AS
(SELECT inv_y.cid AS cid,
inv_y.pid AS pid
FROM inv_y
WHERE inv_y.y = 1),
cte_C AS
(SELECT sum(inv.value) AS value,
inv.cid AS cid,
inv.pid AS pid
FROM inv
JOIN cte_B ON cte_B.cid = inv.cid
AND cte_B.pid = inv.pid
WHERE inv.invd >= '2017-01-01'
AND inv.invd <= '2018-12-31'
GROUP BY inv.cid,
inv.pid),
cte_D AS
(SELECT cte_C.cid AS cid,
cte_C.pid AS pid,
sum(cte_C.value) AS value
FROM cte_C
GROUP BY cte_C.cid,
cte_C.pid),
cte_E AS
(SELECT sum(inv.value) AS value,
inv.cid AS cid,
inv.pid AS pid
FROM inv
JOIN cte_B ON cte_B.cid = inv.cid
AND cte_B.pid = inv.pid
WHERE inv.invd >= '2018-01-01'
AND inv.invd <= '2018-06-30'
GROUP BY inv.cid,
inv.pid),
cte_F AS
(SELECT cte_B.cid AS cid,
cte_B.pid AS pid,
coalesce(sum(cte_E.value), 0) AS value
FROM cte_B
LEFT OUTER JOIN cte_E ON cte_E.cid = cte_B.cid
AND cte_E.pid = cte_B.pid
GROUP BY cte_B.cid,
cte_B.pid),
cte_G AS
(SELECT sum(expe.value) AS value,
expe.cid AS cid,
expe.pid AS pid
FROM expe
JOIN cte_B ON cte_B.cid = expe.cid
AND cte_B.pid = expe.pid
WHERE expe.invd >= '2018-07-01'
AND expe.invd <= '2018-12-31'
GROUP BY expe.cid,
expe.pid),
cte_H AS
(SELECT cte_B.cid AS cid,
cte_B.pid AS pid,
coalesce(sum(cte_G.value), 0) AS value
FROM cte_B
LEFT OUTER JOIN cte_G ON cte_G.cid = cte_B.cid
AND cte_G.pid = cte_B.pid
GROUP BY cte_B.cid,
cte_B.pid)
SELECT cte_D.value AS ref_value,
cte_D.value - (cte_F.value + cte_H.value) AS rem_value
FROM cte_A
JOIN cte_D ON cte_A.cid = cte_D.cid
AND cte_A.pid = cte_D.pid
JOIN cte_F ON cte_A.cid = cte_F.cid
AND cte_A.pid = cte_F.pid
JOIN cte_H ON cte_A.cid = cte_H.cid
AND cte_A.pid = cte_H.pid
--join cte_B ON cte_A.cid = cte_B.cid
--AND cte_A.pid = cte_B.pid
On Sat, 12 Oct 2019 at 19:17, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>>>>> "PG" == PG Bug reporting form <noreply@postgresql.org> writes:
PG> After upgrading to v12 from v11.4 I've noticed one of the queries
PG> run by an application has a significant performance degradation.
PG> I've managed to create a toy db that illustrates the issue.
PG> Following is a link of the db and the query in question (there
PG> seems to be no way to attach a file from the web form. If I should
PG> upload them somewhere else let me know)
PG> https://www.dropbox.com/s/6n3ctdn2zohoann/toy-db_and_query.zip?dl=0
You could at least have included the query in the message.
What seems to be happening here looks to me more down to chance than to
any bug. In v11, the join order is constrained by the CTEs, whereas in
v12 it is not; the misplan is the result of a mis-estimated 1-row result
from a subpath that may not have been considered under v11.
--
Andrew (irc:RhodiumToad)
В списке pgsql-bugs по дате отправления: