Обсуждение: Why can't I select un-grouped columns when grouping by a (non-primary) unique key?

Поиск
Список
Период
Сортировка

Why can't I select un-grouped columns when grouping by a (non-primary) unique key?

От
Daniel Lenski
Дата:
If I include the primary key of a table in my GROUP BY clause, PG 9.3 allows me to refer to other columns of that table without explicit GROUP BY:

    CREATE TABLE A (id SERIAL PRIMARY KEY, name TEXT UNIQUE NOT NULL, document JSON);

    -- this works fine
    SELECT A.document
    FROM A
    GROUP BY A.primary_key

Why doesn't the same thing work with a non-NULL unique constraint?

    -- ERROR: column "A.document" must appear in the GROUP BY clause or be used in an aggregate function
    SELECT A.document
    FROM A
    GROUP BY A.name

I got thinking about this distinction because I wrote some very ugly SQL in a few cases, to get around the lack of JSON comparison operators in PG 9.3, before I discovered that it would work if I used the PRIMARY KEY instead:

    -- this works but it's ugly
    SELECT A.document::text::json
    FROM table
    GROUP BY A.non_null_unique_key, A.document::text

The manual refers to this situation (http://www.postgresql.org/docs/9.3/static/sql-select.html#SQL-GROUPBY), but I don't understand whether there's a specific reason to distinguish primary keys from non-NULL unique constraints.

Thanks,
Dan Lenski

Re: Why can't I select un-grouped columns when grouping by a (non-primary) unique key?

От
Alberto Cabello Sánchez
Дата:
On Wed, 24 Sep 2014 09:04:21 -0700
Daniel Lenski <dlenski@gmail.com> wrote:

> If I include the primary key of a table in my GROUP BY clause, PG 9.3
> allows me to refer to other columns of that table without explicit GROUP BY:
>
> Why doesn't the same thing work with a non-NULL unique constraint?

At first sight, primary key means no grouping at all, as there are no
duplicated A.primary_key values:

  SELECT A.document
    FROM A
    GROUP BY A.primary_key

is the same as

  SELECT A.document
    FROM A


--
Alberto Cabello Sánchez
<alberto@unex.es>


Re: Why can't I select un-grouped columns when grouping by a (non-primary) unique key?

От
Geoff Montee
Дата:


On Wed, Sep 24, 2014 at 1:37 PM, Alberto Cabello Sánchez <alberto@unex.es> wrote:
>
> On Wed, 24 Sep 2014 09:04:21 -0700
> Daniel Lenski <dlenski@gmail.com> wrote:
>
> > If I include the primary key of a table in my GROUP BY clause, PG 9.3
> > allows me to refer to other columns of that table without explicit GROUP BY:
> >
> > Why doesn't the same thing work with a non-NULL unique constraint?
>
> At first sight, primary key means no grouping at all, as there are no
> duplicated A.primary_key values:
>
>   SELECT A.document
>     FROM A
>     GROUP BY A.primary_key
>
> is the same as
>
>   SELECT A.document
>     FROM A

I believe this blog post contains better examples of the feature he's referring to:

http://www.depesz.com/2010/08/08/waiting-for-9-1-recognize-functional-dependency-on-primary-keys/

For example:

SELECT
    p.id,
    p.firstname,
    p.lastname,
    count(*)
FROM
    people p
    JOIN visits v on p.id = v.person_id
GROUP BY p.id;


Geoff

Re: Why can't I select un-grouped columns when grouping by a (non-primary) unique key?

От
Daniel Lenski
Дата:
On Wed, Sep 24, 2014 at 10:46 AM, Geoff Montee <geoff.montee@gmail.com> wrote:
>
> I believe this blog post contains better examples of the feature he's
> referring to:
>
> http://www.depesz.com/2010/08/08/waiting-for-9-1-recognize-functional-dependency-on-primary-keys/
>
> For example:
>
> SELECT
>     p.id,
>     p.firstname,
>     p.lastname,
>     count(*)
> FROM
>     people p
>     JOIN visits v on p.id = v.person_id
> GROUP BY p.id;
>

Geoff, that's exactly the feature I'm referring to. I see that the
inclusion of UNIQUE NOT NULL constraints was recognized as a logical
next step when this feature was introduced.

Now that I understand PG's current behavior, it doesn't seem like a
huge limitation... but I'm curious about what is preventing the UNIQUE
NOT NULL constraints from being allowed as well. Is there something
different about the internal representation of UNIQUE NOT NULL
constraints compared to PRIMARY KEY constraints?

Thanks,
Dan


Re: Why can't I select un-grouped columns when grouping by a (non-primary) unique key?

От
Daniel Lenski
Дата:
On Wed, Sep 24, 2014 at 10:37 AM, Alberto Cabello Sánchez
<alberto@unex.es> wrote:
> At first sight, primary key means no grouping at all, as there are no
> duplicated A.primary_key values:
>
>   SELECT A.document
>     FROM A
>     GROUP BY A.primary_key
>
> is the same as
>
>   SELECT A.document
>     FROM A
>

Yes, my example is oversimplified, because GROUP BY primary_key has no
semantic effect on a single-table query, although it still illustrates
the different behavior for PRIMARY KEY vs. UNIQUE NOT NULL, which
should be logically equivalent.

The examples that Geoff Montee gave are better because (GROUP BY
primary_key) does change the meaning of the query when there's a join
to another table.

Dan


Re: Why can't I select un-grouped columns when grouping by a (non-primary) unique key?

От
Tom Lane
Дата:
Daniel Lenski <dlenski@gmail.com> writes:
> Now that I understand PG's current behavior, it doesn't seem like a
> huge limitation... but I'm curious about what is preventing the UNIQUE
> NOT NULL constraints from being allowed as well. Is there something
> different about the internal representation of UNIQUE NOT NULL
> constraints compared to PRIMARY KEY constraints?

The comments for check_functional_grouping() explain where the holdup is:

 * Determine whether a relation can be proven functionally dependent on
 * a set of grouping columns.  If so, return TRUE and add the pg_constraint
 * OIDs of the constraints needed for the proof to the *constraintDeps list.
 *
 * grouping_columns is a list of grouping expressions, in which columns of
 * the rel of interest are Vars with the indicated varno/varlevelsup.
 *
 * Currently we only check to see if the rel has a primary key that is a
 * subset of the grouping_columns.  We could also use plain unique constraints
 * if all their columns are known not null, but there's a problem: we need
 * to be able to represent the not-null-ness as part of the constraints added
 * to *constraintDeps.  FIXME whenever not-null constraints get represented
 * in pg_constraint.

            regards, tom lane