Обсуждение: distinct vs group by
Should the expressions
select distinct x from t
and
select x from t group by x
have the same effect?
It seems the optimizer sometimes chooses different plans for those
expressions. Could the select distinct have used the slightly
faster hash aggregate?
Thanks.
fli=# explain select distinct zip from sa_zips;
QUERY PLAN
--------------------------------------------------------------------
Unique (cost=3.65..3.98 rows=66 width=8)
-> Sort (cost=3.65..3.82 rows=66 width=8)
Sort Key: zip
-> Seq Scan on sa_zips (cost=0.00..1.66 rows=66 width=8)
(4 rows)
fli=# explain select zip from sa_zips group by zip;
QUERY PLAN
--------------------------------------------------------------
HashAggregate (cost=1.82..1.82 rows=66 width=8)
-> Seq Scan on sa_zips (cost=0.00..1.66 rows=66 width=8)
(2 rows)
On Fri, Jan 28, 2005 at 02:35:21 -0800, Ron Mayer <rm_pg@cheapcomplexdevices.com> wrote: > > Should the expressions > select distinct x from t > and > select x from t group by x > have the same effect? > > It seems the optimizer sometimes chooses different plans for those > expressions. Could the select distinct have used the slightly > faster hash aggregate? I don't think select distinct can use hash aggregate plans.
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
> Should the expressions
> select distinct x from t
> and
> select x from t group by x
> have the same effect?
Offhand I think they are theoretically equivalent.
> It seems the optimizer sometimes chooses different plans for those
> expressions. Could the select distinct have used the slightly
> faster hash aggregate?
This is partly historical (the DISTINCT code hasn't been rewritten in a
long time) and partly intentional --- you can choose one phrasing or the
other to control what plan you get.
regards, tom lane