Обсуждение: About reducing EXISTS sublink

Поиск

Список

Период

Сортировка

About reducing EXISTS sublink

От

Richard Guo

Дата:

22 мая 2020 г., 08:26:20

Hi hackers,

For EXISTS SubLink, in some cases the subquery can be reduced to
constant TRUE or FALSE, based on the knowledge that it's being used in
EXISTS(). One such case is when the subquery has aggregates without
GROUP BY or HAVING, and we know its result is exactly one row, unless
that row is discarded by LIMIT/OFFSET. (Greenplum does this.)

For example:

# explain (costs off) select * from a where exists
(select avg(i) from b where a.i = b.i);
QUERY PLAN
-----------------------------------
Seq Scan on a
Filter: (SubPlan 1)
SubPlan 1
-> Aggregate
-> Seq Scan on b
Filter: (a.i = i)
(6 rows)

This query can be reduced to:

# explain (costs off) select * from a where exists
(select avg(i) from b where a.i = b.i);
QUERY PLAN
---------------
Seq Scan on a
(1 row)

And likewise, for this query below:

# explain (costs off) select * from a where exists
(select avg(i) from b where a.i = b.i offset 1);
QUERY PLAN
-----------------------------------------
Seq Scan on a
Filter: (SubPlan 1)
SubPlan 1
-> Limit
-> Aggregate
-> Seq Scan on b
Filter: (a.i = i)
(7 rows)

It can be reduced to:

# explain (costs off) select * from a where exists
(select avg(i) from b where a.i = b.i offset 1);
QUERY PLAN
--------------------------
Result
One-Time Filter: false
(2 rows)

Is it worthwhile to add some codes for such optimization? If so, I can
try to propose a patch.

Thanks

Richard

Re: About reducing EXISTS sublink

От

"David G. Johnston"

Дата:

22 мая 2020 г., 14:59:51

On Friday, May 22, 2020, Richard Guo <guofenglinux@gmail.com> wrote:

Hi hackers,

For EXISTS SubLink, in some cases the subquery can be reduced to
constant TRUE or FALSE, based on the knowledge that it's being used in
EXISTS(). One such case is when the subquery has aggregates without
GROUP BY or HAVING, and we know its result is exactly one row, unless
that row is discarded by LIMIT/OFFSET. (Greenplum does this.)

Is it worthwhile to add some codes for such optimization? If so, I can
try to propose a patch

While the examples clearly demonstrate what you are saying they don’t provide any motivation to do anything about it - adding aggregates and offset to an exists subquery seems like poor query design that should be fixed by the query writer not by spending hacker cycles optimizing it.

David J.

Re: About reducing EXISTS sublink

От

Richard Guo

Дата:

26 мая 2020 г., 02:23:22

On Fri, May 22, 2020 at 10:59 PM David G. Johnston <david.g.johnston@gmail.com> wrote:

On Friday, May 22, 2020, Richard Guo <guofenglinux@gmail.com> wrote:
Hi hackers,

For EXISTS SubLink, in some cases the subquery can be reduced to
constant TRUE or FALSE, based on the knowledge that it's being used in
EXISTS(). One such case is when the subquery has aggregates without
GROUP BY or HAVING, and we know its result is exactly one row, unless
that row is discarded by LIMIT/OFFSET. (Greenplum does this.)

Is it worthwhile to add some codes for such optimization? If so, I can
try to propose a patch

While the examples clearly demonstrate what you are saying they don’t provide any motivation to do anything about it - adding aggregates and offset to an exists subquery seems like poor query design that should be fixed by the query writer not by spending hacker cycles optimizing it.

Thanks David. I have the same concern that this case is not common

enough to worth hacker cycles.

Thanks

Richard

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: About reducing EXISTS sublink

About reducing EXISTS sublink

Re: About reducing EXISTS sublink

Re: About reducing EXISTS sublink