Обсуждение: Column names: where, group by, having inconsistent behaviour?

Поиск
Список
Период
Сортировка

Column names: where, group by, having inconsistent behaviour?

От
Jan Grant
Дата:
Apologies for raising this - I've found a message in the
archives from about a month ago; but...

I can't find the relevant sections in the SQL spec; nevertheless, the
docs for the SELECT command lead me to believe that this should work,
since:

http://developer.postgresql.org/docs/postgres/sql-select.html
[[
 In the SQL-92 standard, an ORDER BY clause may only use result column
names or numbers, while a GROUP BY clause may only use expressions based
on input column names. PostgreSQL extends each of these clauses to allow
the other choice as well (but it uses the standard's interpretation if
there is ambiguity). PostgreSQL also allows both clauses to specify
arbitrary expressions. Note that names appearing in an expression will
always be taken as input-column names, not as result-column names.
]]

Sure enough:

select col1 as x, col2 as y from table1 order by x;

...works, as does:

select col1 as x, count(col2) as y from table1
    group by x
    having count(col2) = 1;

"Having" operates after "group by"; and I'd argue that

[[
Each column referenced in condition must unambiguously reference a
grouping column, unless the reference appears within an aggregate
function.
]]

so since we can "group by x", one would imagine that

select col1 as x, count(col2) as y from table1
    group by x
    having x = 1;

should work - it's hardly ambiguous.

I appreciate the distinction between input and output columns;
nevertheless, the loosening of the behaviour of "group by" doesn't seem
to sit with the restriction on "having". I also appreciate the need to
support "standard" sql - but in the absence of ambiguities, shouldn't
this expression "do what I mean"?

Cheers,
jan

--
jan grant, ILRT, University of Bristol. http://www.ilrt.bris.ac.uk/
Tel +44(0)117 9287864 or +44 (0)117 9287088 http://ioctl.org/jan/
You see what happens when you have fun with a stranger in the Alps?

Re: Column names: where, group by, having inconsistent behaviour?

От
Tom Lane
Дата:
Jan Grant <Jan.Grant@bristol.ac.uk> writes:
> http://developer.postgresql.org/docs/postgres/sql-select.html
>> ... PostgreSQL also allows both clauses to specify
>> arbitrary expressions. Note that names appearing in an expression will
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> always be taken as input-column names, not as result-column names.
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

> select col1 as x, count(col2) as y from table1
>     group by x
>     having x = 1;
> should work - it's hardly ambiguous.

It violates the above-underlined restriction though.

In retrospect, the fact that we allow "GROUP BY output-column-name"
was a mistake; it extended an ugly-but-defensible frammish of ORDER BY
into a place where it should never have gone.  ORDER BY is the only one
of the SELECT modifiers that is logically executed after forming the
SELECT output columns, and so it's the only one that should legitimately
have access to their values.

We're kinda stuck with this mistake now on backwards-compatibility
grounds, but I don't much want to spread the mistake any further.

            regards, tom lane