grouping pushdown

Поиск

Список

Период

Сортировка

От	Spring Zhong
Тема	grouping pushdown
Дата	4 января 2023 г. 13:21:30
Msg-id	SH0PR01MB068246A28146D0AD82D0EDF496F59@SH0PR01MB0682.CHNPR01.prod.partner.outlook.cn обсуждение исходный текст
Ответы	Re: grouping pushdown
Список	pgsql-hackers

Дерево обсуждения

Hi hackers,

I came across a problem on how to improve the performance of queries with GROUP BY clause when the grouping columns have much duplicate data. For example:

create table t1(i1) as select 1 from generate_series(1,10000);

create table t2(i2) as select 2 from generate_series(1,10000);

select i1,i2 from t1, t2 group by i1,i2;

i1 | i2

----+----

1 | 2

QUERY PLAN

-----------------------------------

HashAggregate

Group Key: t1.i1, t2.i2

Batches: 1 Memory Usage: 24kB

-> Nested Loop

-> Seq Scan on t1

-> Materialize

-> Seq Scan on t2

Planning Time: 0.067 ms

Execution Time: 15864.585 ms

The plan is apparently inefficient, since the hash aggregate goes after the Cartesian product. We could expect the query's performance get much improved if the HashAggregate node can be pushed down to the SCAN node. For example, the plan may looks like:

expected QUERY PLAN

----------------------------------------

Group

Group Key: t1.i1, t2.i2

-> Sort

Sort Key: t1.i1, t2.i2

-> Nested Loop

-> HashAggregate

Group Key: t1.i1

-> Seq Scan on t1

-> HashAggregate

Group Key: t2.i2

-> Seq Scan on t2

Moreover, queries with expressions as GROUP BY columns may also take advantage of this feature, e.g.

select i1+i2 from t1, t2 group by i1+i2;

?column?

----------

expected QUERY PLAN

----------------------------------------

Group

Group Key: ((t1.i1 + t2.i2))

-> Sort

Sort Key: ((t1.i1 + t2.i2))

-> Nested Loop

-> HashAggregate

Group Key: t1.i1

-> Seq Scan on t1

-> HashAggregate

Group Key: t2.i2

-> Seq Scan on t2

Is someone has suggestions on this?

В списке pgsql-hackers по дате отправления:

Предыдущее

От: vignesh C
Дата: 04 января 2023 г., 13:17:32
Сообщение: Re: on placeholder entries in view rule action query's range table

Следующее

От: "houzj.fnst@fujitsu.com"
Дата: 04 января 2023 г., 13:55:34
Сообщение: RE: Perform streaming logical transactions by background workers and parallel apply

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

grouping pushdown

Предыдущее

Следующее