Michael Glaesemann <grzm@seespotcode.net> writes:
> On Feb 15, 2007, at 22:35 , Richard Huxton wrote:
>> Walter Cruz wrote:
>>> The error: ERROR: for SELECT DISTINCT, ORDER BY expressions must
>>> appear in select list is due to a standart implementarion or a design
>>> decision of postgres?
>>
>> I think ORDER BY is defined to take place after DISTINCT, which
>> means there is no meaningful "number" for it to order by. You could
>> arbitrarily choose the first number encountered, but I can't see
>> what sense it would make to order by them.
> I believe the reason is that DISTINCT depends on a sort to determine
> uniqueness (distinctness), so it's a implementation detail that if
> you're going to include an ORDER BY, you also need to include the
> same columns in the ORDER BY in the DISTINCT clause.
No, there's actually a definitional reason for it. Consider
SELECT DISTINCT x FROM tab ORDER BY y;
For any particular x-value in the table there might be many different y
values. Which one will you use to sort that x-value in the output?
Back in SQL92 they avoided this problem by specifying that ORDER BY
entries had to reference output columns. SQL99 has some messy verbiage
that I think comes out at the same place as our restriction:
A) If K(i) is not equivalent to a <value expression> immediately contained in any
<derivedcolumn> in the <select list> SL of <query specification> QS contained
inQE, then:
I) T shall not be a grouped table.
II) QS shall not specify the <set quantifier> DISTINCT or directly contain
oneor more <set function specification>s.
regards, tom lane