Re: SQL question re aggregates & joins

Поиск

Список

Период

Сортировка

От	Dann Corbit
Тема	Re: SQL question re aggregates & joins
Дата	28 января 2010 г. 21:32:05
Msg-id	D425483C2C5C9F49B5B7A41F8944154702962754@postal.corporate.connx.com обсуждение исходный текст
Ответ на	Re: SQL question re aggregates & joins (Scott Ribe <scott_ribe@killerbytes.com>)
Список	pgsql-general

Дерево обсуждения

> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Scott Ribe
> Sent: Thursday, January 28, 2010 2:10 PM
> To: Thom Brown
> Cc: pgsql-general
> Subject: Re: [GENERAL] SQL question re aggregates & joins
>
> > You can't include an aggregate in the select if you don't group by
> > non-aggregates, so it should be:
> >
> > select max(t1."When"), t1."Pt_Id", t2."DateOfBirth"
> > from "PtStaffAccess" t1, "Person" t2
> > where t1."Pt_Id" = t2.id
> > group by t1."Pt_Id", t2."DateOfBirth";
>
> I was aware that I could alternatively group by all the columns, but
> that
> actually just highlights the redundancy even more--consider your second
> example:
>
> select max(t1."When"), t2.id , t2."DateOfBirth"
> from "PtStaffAccess" t1, "Person" t2
> where t1."Pt_Id" = t2.id
> group by t2.id, t2."DateOfBirth";
>
> Given that t2.id is the primary key, grouping by any other column of t2
> is
> really redundant. I know *what* SQL won't allow me to do, I'm
> interested in
> knowing if there's some reason *why* other than historical...

Having a group by on a primary key in a single table is of no value.  After all, the group by accomplishes nothing at
all.

Having a group by on the primary key of one table in a join does not guarantee uniqueness of the join result.

Consider the simplified schema
Table Customers:
Custname varchar(80)
Custaddr varchar(80)
Custid int primary key
Data:
{'Joe', 'Paris France', 1}
{'Fred', 'Seattle USA', 2}

Table Orders:
OrderID int
Custid int
OrderItem int
OrderQty int
OrderPrice Numeric 12.2
Primary key: OrderID, CustID, OrderItem
Foreign key: CustiID references Customers(CustID)
Data:
{1, 1, 17, 2, 12.92},
{1, 1, 12, 5, 14.75},
{2, 2, 9, 1, 27.45},
{2, 2, 2, 8, 11.23},
{2, 2, 1, 1, 123.45}

SELECT c.Custid, c.CustName, c.Custaddr, o.OrderID, o.OrderItem, sum(o.OrderQty * o.OrderPrice) as dollars
From Customers c, Orders o
WHERE c.Custid = o.Custid
GROUP BY c.Custid

Now, c.Custid is the primary key for Customers and also a foreign key for Orders and yet the query is utter nonsense.
Had all of the primary key columns for the second table been included, then the group by is superfluous.  Having one or
moreof the primary key columns left out from the child table but added to the group by list changes the meaning and
valueof the result. 

I have always thought this way for aggregate queries:
1. Add an aggregate function to each column that you are aggregating in some way.
2. Add a 'group by' for every column that is not being aggregated.
Anything else means that the query has not been thought through carefully.

IMO-YMMV

В списке pgsql-general по дате отправления:

Предыдущее

От: Adrian Klaver
Дата: 28 января 2010 г., 21:29:14
Сообщение: Re: Is there a builtin function for formatting time values?

Следующее

От: Tom Lane
Дата: 28 января 2010 г., 21:36:54
Сообщение: Re: SQL question re aggregates & joins

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: SQL question re aggregates & joins

Предыдущее

Следующее