No partition pruning when initializing query plan with LATERAL JOINand aggregates

Поиск

Список

Период

Сортировка

От	Marcin Barczyński
Тема	No partition pruning when initializing query plan with LATERAL JOINand aggregates
Дата	13 января 2020 г. 18:42:18
Msg-id	CAP3o3PdcYTbHHX9sz7K6EsWnf2L64omM_de6r1eana0=go3YEw@mail.gmail.com обсуждение исходный текст
Список	pgsql-general

Дерево обсуждения

I wonder why partition pruning doesn't work with LATERAL JOIN and aggregates.
Below is my example tested on PostgreSQL 12.1 (Ubuntu 12.1-1.pgdg18.04+1):

CREATE TABLE demo(key BIGINT) PARTITION BY LIST (key);
CREATE TABLE demo_key_1 PARTITION OF demo FOR VALUES IN (1);
CREATE TABLE demo_key_2 PARTITION OF demo FOR VALUES IN (2);
INSERT INTO demo(key) VALUES (1), (2);
ANALYZE demo;

CREATE TABLE demo2(key BIGINT) PARTITION BY LIST (key);
CREATE TABLE demo2_key_1 PARTITION OF demo2 FOR VALUES IN (1);
CREATE TABLE demo2_key_2 PARTITION OF demo2 FOR VALUES IN (2);
INSERT INTO demo2(key) VALUES (1), (2);
ANALYZE demo2;

Now, if there are no aggregates in SELECT under LATERAL JOIN, everything works as expected - only a single partition of each table is scanned:

EXPLAIN ANALYZE
SELECT * FROM demo
JOIN LATERAL (
   SELECT key AS key2
   FROM demo2
   WHERE demo2.key = demo.key
) d ON TRUE
WHERE demo.key = 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2.03 rows=1 width=16) (actual time=0.007..0.008 rows=1 loops=1)
-> Seq Scan on demo_key_1 (cost=0.00..1.01 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=1)
Filter: (key = 1)
-> Seq Scan on demo2_key_1 (cost=0.00..1.01 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=1)
Filter: (key = 1)
Planning Time: 0.191 ms
Execution Time: 0.025 ms
(7 rows)

However, when I try a very similar query that contains an aggregate function, partitions of demo2 are not pruned from the query plan:

EXPLAIN ANALYZE
SELECT * FROM demo
JOIN LATERAL (
   SELECT sum(demo2.key) AS sum2
   FROM demo2
   WHERE demo2.key = demo.key
) d ON TRUE
WHERE demo.key = 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=2.03..3.07 rows=1 width=40) (actual time=0.018..0.018 rows=1 loops=1)
-> Seq Scan on demo_key_1 (cost=0.00..1.01 rows=1 width=8) (actual time=0.005..0.005 rows=1 loops=1)
Filter: (key = 1)
-> Aggregate (cost=2.03..2.04 rows=1 width=32) (actual time=0.011..0.011 rows=1 loops=1)
-> Append (cost=0.00..2.03 rows=2 width=8) (actual time=0.004..0.005 rows=1 loops=1)
-> Seq Scan on demo2_key_1 (cost=0.00..1.01 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
Filter: (key = demo_key_1.key)
-> Seq Scan on demo2_key_2 (cost=0.00..1.01 rows=1 width=8) (never executed)
Filter: (key = demo_key_1.key)
Planning Time: 0.174 ms
Execution Time: 0.082 ms
(11 rows)

Of course, Seq Scan on demo2_key_2 was never executed, but why wasn't it pruned from the query plan? More complex queries with hundreds of partitions are affected badly by that.

The workaround is to add a redundant condition to the subquery:

EXPLAIN ANALYZE
SELECT * FROM demo
JOIN LATERAL (
   SELECT sum(demo2.key) AS sum2
   FROM demo2
   WHERE demo2.key = 1 AND demo2.key = demo.key
) d ON TRUE
WHERE demo.key = 1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=1.01..2.05 rows=1 width=40) (actual time=0.011..0.011 rows=1 loops=1)
-> Seq Scan on demo_key_1 (cost=0.00..1.01 rows=1 width=8) (actual time=0.004..0.004 rows=1 loops=1)
Filter: (key = 1)
-> Aggregate (cost=1.01..1.02 rows=1 width=32) (actual time=0.006..0.006 rows=1 loops=1)
-> Result (cost=0.00..1.01 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
One-Time Filter: (demo_key_1.key = 1)
-> Seq Scan on demo2_key_1 (cost=0.00..1.01 rows=1 width=8) (actual time=0.001..0.002 rows=1 loops=1)
Filter: (key = 1)
Planning Time: 0.079 ms
Execution Time: 0.031 ms
(10 rows)

--
M.B.

В списке pgsql-general по дате отправления:

Предыдущее

От: dagamier
Дата: 13 января 2020 г., 18:41:06
Сообщение: Re: pg_repack in cluster

Следующее

От: Adrian Klaver
Дата: 13 января 2020 г., 21:36:17
Сообщение: Re: pg_repack in cluster

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

No partition pruning when initializing query plan with LATERAL JOINand aggregates

Предыдущее

Следующее