Обсуждение: DISTINCT -> GROUP BY

Поиск
Список
Период
Сортировка

DISTINCT -> GROUP BY

От
David Fetter
Дата:
Folks,

I've noticed that queries of the form

SELECT DISTNCT foo, bar, baz
FROM quux
WHERE ...

perform significantly worse than the equivalent using GROUP BY.

SELECT foo, bar, baz
FROM quux
WHERE ...
GROUP BY foo, bar, baz

Where would I start looking in order to make them actually equivalent
from the planner's point of view?  Also, would back-patching this make
sense?  It doesn't change any APIs, but it does make some queries go
faster.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: DISTINCT -> GROUP BY

От
"Pavel Stehule"
Дата:
Hello David

http://www.postgresql.org/docs/faqs.TODO.html

Consider using hash buckets to do DISTINCT, rather than sorting
This would be beneficial when there are few distinct values. This is
already used by GROUP BY.

Regards
Pavel Stehule

2008/6/3 David Fetter <david@fetter.org>:
> Folks,
>
> I've noticed that queries of the form
>
> SELECT DISTNCT foo, bar, baz
> FROM quux
> WHERE ...
>
> perform significantly worse than the equivalent using GROUP BY.
>
> SELECT foo, bar, baz
> FROM quux
> WHERE ...
> GROUP BY foo, bar, baz
>
> Where would I start looking in order to make them actually equivalent
> from the planner's point of view?  Also, would back-patching this make
> sense?  It doesn't change any APIs, but it does make some queries go
> faster.
>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: DISTINCT -> GROUP BY

От
David Fetter
Дата:
On Tue, Jun 03, 2008 at 03:36:44PM +0200, Pavel Stehule wrote:
> Hello David
> 
> http://www.postgresql.org/docs/faqs.TODO.html
> 
> Consider using hash buckets to do DISTINCT, rather than sorting This
> would be beneficial when there are few distinct values. This is
> already used by GROUP BY.

It's nice to see that this is kinda on the TODO, but it doesn't
address the question I asked, which is, "how would I get the planner
to rewrite DISTINCTs as the equivalent GROUP BYs?" :)

Any hints?

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: DISTINCT -> GROUP BY

От
"Pavel Stehule"
Дата:
2008/6/3 David Fetter <david@fetter.org>:
> On Tue, Jun 03, 2008 at 03:36:44PM +0200, Pavel Stehule wrote:
>> Hello David
>>
>> http://www.postgresql.org/docs/faqs.TODO.html
>>
>> Consider using hash buckets to do DISTINCT, rather than sorting This
>> would be beneficial when there are few distinct values. This is
>> already used by GROUP BY.
>
> It's nice to see that this is kinda on the TODO, but it doesn't
> address the question I asked, which is, "how would I get the planner
> to rewrite DISTINCTs as the equivalent GROUP BYs?" :)

you can't to do it :(

Pavel

>
> Any hints?
>
> Cheers,
> David.
> --
> David Fetter <david@fetter.org> http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fetter@gmail.com
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


Re: DISTINCT -> GROUP BY

От
Decibel!
Дата:
On Jun 3, 2008, at 9:03 AM, David Fetter wrote:
> On Tue, Jun 03, 2008 at 03:36:44PM +0200, Pavel Stehule wrote:
>> Hello David
>>
>> http://www.postgresql.org/docs/faqs.TODO.html
>>
>> Consider using hash buckets to do DISTINCT, rather than sorting This
>> would be beneficial when there are few distinct values. This is
>> already used by GROUP BY.
>
> It's nice to see that this is kinda on the TODO, but it doesn't
> address the question I asked, which is, "how would I get the planner
> to rewrite DISTINCTs as the equivalent GROUP BYs?" :)

IIRC, the patch to allow GROUP BY to do hashing is fairly recent...  
you could probably find it in archives.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828