Обсуждение: What's wrong with this group by clause?
Hi everyone. Below you can find a simplified example of a real case. I don't understand why I'm getting the "john" record twice. I know there's no point in using constants in the group by, but makes sense in the real much bigger query. What I see is that when I issue an EXPLAIN with this query, it tells me that the SORT KEY is "0". When I remove one of the constant fields and try again the EXPLAIN, it says that the SORT KEY is "0, name" (and the group by works, it shows only one record for each person in the table). Thanks anyone. /*EXAMPLE*/ CREATE TABLE people ( name TEXT ); INSERT INTO people VALUES ('john'); INSERT INTO people VALUES ('john'); INSERT INTO people VALUES ('pete'); INSERT INTO people VALUES ('pete'); INSERT INTO people VALUES ('ernest'); INSERT INTO people VALUES ('john'); SELECT 0 AS field1, 0 AS field2, name FROM people GROUP BY field1, field2, name; field1 | field2 | name --------+--------+-------- 0 | 0 | john 0 | 0 | pete 0 | 0 | ernest 0 | 0 | john (4 rows)
> Hi everyone.=20 > > Below you can find a simplified example of a real case.=20 > I don't understand why I'm getting the "john" record twice.=20 > I know there's no point in using constants in the group by, but makes sense= > in=20 > the real much bigger query.=20 > What I see is that when I issue an EXPLAIN with this query, it tells me tha= > t=20 > the SORT KEY is "0".=20 > When I remove one of the constant fields and try again the EXPLAIN, it says= > =20 > that the SORT KEY is "0, name" (and the group by works, it shows only one= > =20 > record for each person in the table). > > Thanks anyone. > > /*EXAMPLE*/ > CREATE TABLE people > ( > name TEXT > ); > INSERT INTO people VALUES ('john'); > INSERT INTO people VALUES ('john'); > INSERT INTO people VALUES ('pete'); > INSERT INTO people VALUES ('pete'); > INSERT INTO people VALUES ('ernest'); > INSERT INTO people VALUES ('john'); > =20=20=20 > SELECT > 0 AS field1, > 0 AS field2,=20 > name > FROM > people > GROUP BY > field1, > field2, > name; > I did your example and did not get the "john" record twice. Instead I got what you expected.field1 | field2 | name --------+--------+-------- 0 | 0 | ernest 0 | 0 | john 0 | 0 | pete (3 rows) What version are you using? Regards, Christoph
[forwarding to -hackers] On Tue, 4 Nov 2003 18:28:12 -0300, Franco Bruno Borghesi <franco@akyasociados.com.ar> wrote: >Below you can find a simplified example of a real case. >I don't understand why I'm getting the "john" record twice. ISTM you have found a Postgres 7.3 bug. I get one john withPostgreSQL 7.1.3 on i686-pc-cygwin, compiled by GCC 2.95.3-5 andPostgreSQL 7.2 on i686-pc-linux-gnu, compiled by GCC 2.7.2.1 but two johns withPostgreSQL 7.3.2 on i686-pc-linux-gnu, compiled by GCC 2.7.2.1 >/*EXAMPLE*/ >CREATE TABLE people >( > name TEXT >); >INSERT INTO people VALUES ('john'); >INSERT INTO people VALUES ('john'); >INSERT INTO people VALUES ('pete'); >INSERT INTO people VALUES ('pete'); >INSERT INTO people VALUES ('ernest'); >INSERT INTO people VALUES ('john'); > >SELECT > 0 AS field1, > 0 AS field2, > name >FROM > people >GROUP BY > field1, > field2, > name; > > field1 | field2 | name >--------+--------+-------- > 0 | 0 | john > 0 | 0 | pete > 0 | 0 | ernest > 0 | 0 | john >(4 rows) Same forSELECT 0 AS field1, 0 AS field2, name FROM people GROUP BY 1, 2, name; ServusManfred
Manfred Koizar <mkoi-pg@aon.at> writes: > ISTM you have found a Postgres 7.3 bug. Yeah. Actually, the planner bug has been there a long time, but it was only latent until the parser stopped suppressing duplicate GROUP BY items: 2002-08-18 14:46 tgl * src/backend/parser/parse_clause.c: Remove optimization wherebyparser would make only one sort-list entry when two equal()targetlistitems were to be added to an ORDER BY or DISTINCT list. Although indeed this would make sorting fractionallyfaster bysometimes saving a comparison, it confuses the heck out of laterstages of processing, because it makesit look like the user wroteDISTINCT ON rather than DISTINCT. Bug reported byjoe@piscitella.com. 7.3 patch is attached if you need it. regards, tom lane *** src/backend/optimizer/plan/planner.c.orig Wed Mar 5 13:38:26 2003 --- src/backend/optimizer/plan/planner.c Thu Mar 13 11:21:16 2003 *************** *** 1498,1510 **** * are just dummies with no extra execution cost.) */ List *sort_tlist= new_unsorted_tlist(subplan->targetlist); int keyno = 0; List *gl; foreach(gl, groupClause) { GroupClause *grpcl = (GroupClause *) lfirst(gl); ! TargetEntry *te = nth(grpColIdx[keyno] - 1, sort_tlist); Resdom *resdom = te->resdom; /* --- 1498,1511 ---- * are just dummies with no extra execution cost.) */ List *sort_tlist= new_unsorted_tlist(subplan->targetlist); + int grpno = 0; int keyno = 0; List *gl; foreach(gl, groupClause) { GroupClause *grpcl = (GroupClause *) lfirst(gl); ! TargetEntry *te = nth(grpColIdx[grpno] - 1, sort_tlist); Resdom *resdom = te->resdom; /* *************** *** 1518,1523 **** --- 1519,1525 ---- resdom->reskey = ++keyno; resdom->reskeyop = grpcl->sortop; } + grpno++; } Assert(keyno > 0);
From: Franco Bruno Borghesi <franco@akyasociados.com.ar> >SELECT> 0 AS field1, > 0 AS field2, >name >FROM> people >GROUP BY> field1, > field2, >name; I think the problem is that you don't have a column to group on. Try adding SELECT ....,count(*).... so that there is an aggregate of some kind. Alternatively, you could use DISTINCT ON (field1,field2) field1,field2,name FROM ... although this is a Postgres specific extension of the SQL spec. Len Morgan
On Thu, 13 Mar 2003 01:34:34 -0600, "Len Morgan" <len-morgan@crcom.net> wrote: >>GROUP BY > > field1, > > field2, > >name; >I think the problem is that you don't have a column to group on. field1, field2, and name are the grouping columns. >Try adding >SELECT ....,count(*).... so that there is an aggregate of some kind. You don't need an aggregate in a GROUP BY query. A SELECT ... GROUP BY without any aggregate behaves like SELECT DISTINCT. There's nothing wrong with it. Performance might be a different story. BTW, Franco's problem has been recognised as a bug and a patch has already been published (cf. Tom Lane's mail in this thread). ServusManfred
Thanks Tom, I applied the patch and it works perfect now. Thanks to you all. On Thursday 13 March 2003 14:02, Tom Lane wrote: > Manfred Koizar <mkoi-pg@aon.at> writes: > > ISTM you have found a Postgres 7.3 bug. > > Yeah. Actually, the planner bug has been there a long time, but it was > only latent until the parser stopped suppressing duplicate GROUP BY > items: > > 2002-08-18 14:46 tgl > > * src/backend/parser/parse_clause.c: Remove optimization whereby > parser would make only one sort-list entry when two equal() > targetlist items were to be added to an ORDER BY or DISTINCT list. > Although indeed this would make sorting fractionally faster by > sometimes saving a comparison, it confuses the heck out of later > stages of processing, because it makes it look like the user wrote > DISTINCT ON rather than DISTINCT. Bug reported by > joe@piscitella.com. > > 7.3 patch is attached if you need it. > > regards, tom lane > > > *** src/backend/optimizer/plan/planner.c.orig Wed Mar 5 13:38:26 2003 > --- src/backend/optimizer/plan/planner.c Thu Mar 13 11:21:16 2003 > *************** > *** 1498,1510 **** > * are just dummies with no extra execution cost.) > */ > List *sort_tlist = new_unsorted_tlist(subplan->targetlist); > int keyno = 0; > List *gl; > > foreach(gl, groupClause) > { > GroupClause *grpcl = (GroupClause *) lfirst(gl); > ! TargetEntry *te = nth(grpColIdx[keyno] - 1, sort_tlist); > Resdom *resdom = te->resdom; > > /* > --- 1498,1511 ---- > * are just dummies with no extra execution cost.) > */ > List *sort_tlist = new_unsorted_tlist(subplan->targetlist); > + int grpno = 0; > int keyno = 0; > List *gl; > > foreach(gl, groupClause) > { > GroupClause *grpcl = (GroupClause *) lfirst(gl); > ! TargetEntry *te = nth(grpColIdx[grpno] - 1, sort_tlist); > Resdom *resdom = te->resdom; > > /* > *************** > *** 1518,1523 **** > --- 1519,1525 ---- > resdom->reskey = ++keyno; > resdom->reskeyop = grpcl->sortop; > } > + grpno++; > } > > Assert(keyno > 0);