Wei Weng wrote:
> Josh Berkus wrote:
>
>>Wei Wang,
>>
>>
>>
>>>How exactly slow is DISTINCT being processed in SQL engines? (not
>>>limited to postgresql, though comments on postgresql would be most
>>>relevant)
>>>
>>>
>>I can only give you a relative result, based exlusively on my anecdotal
>> experience with 7.1:
>>
>>Fast: SELECT ...
>>Slower: SELECT ... GROUP BY x,y,z
>> or: SELECT DISCTINCT ON (x) ... (Postgres non-standard extension)
>>SLowest: SELECT DISTINCT ...
>>
>>The reason for this is that SELECT DISTINCT is effectively a GROUP BY
>> on all result fields of the query, and if a few of the aren't indexed
>> that requires a seq scan.
>>
> What if I do thing like
>
> SELECT DISTINCT table1.tid, table1.name, table1.description FROM ...
>
> Does it equal to the scenario 2 or 3?
>
> I am thinking SELECT DISTINCT table1.tid is just a variation (or the
> other way around) of SELECT DISTINCT ON (table1.tid), is that right?
>
> Thanks
>
>
Of course, a more basic question is, why so much repeated data?
Perhaps you should look at the design in your tables. I'm not assuming
they are wrong, only suggesting that you look.