Обсуждение: Top n queries and GROUP BY
All,
This is a straight SQL question, maybe not appropriate for a performance
list, but...
I have a simple stock holdings setup:
=> select * from t1;
nam | co | num
-----+-----------+------
joe | ibm | 600
abe | ibm | 1500
joe | cisco | 1200
abe | cisco | 800
joe | novell | 500
joe | microsoft | 200
What I would like to see is a Top-n-holdings-by-name", e.g, for n=2:
nam | co | num
----------+--------+-----
joe | cisco | 1200
joe | ibm | 600
abe | ibm | 1500
abe | cisco | 800
I can get part of the way by using a LIMIT clause in a subquery, e.g,
=> select 'abe', a.co, a.num from (select co, num from t1 where
nam='abe' order by num desc limit 2) as a;
?column? | co | num
----------+-------+------
abe | ibm | 1500
abe | cisco | 800
but I can't figure out a correlated subquery (or GROUP BY arrangement or
anything else) that will cycle through the names. I vaguely remember
that these kinds or queries are hard to do in standard SQL, but I was
hoping that PG, with its extensions...
Thanks, Rich Cullingford
rculling@sysd.com
Rich Cullingford wrote:
> All,
> This is a straight SQL question, maybe not appropriate for a performance
> list, but...
>
> I have a simple stock holdings setup:
>
> => select * from t1;
> nam | co | num
> -----+-----------+------
> joe | ibm | 600
> abe | ibm | 1500
> joe | cisco | 1200
> abe | cisco | 800
> joe | novell | 500
> joe | microsoft | 200
>
> What I would like to see is a Top-n-holdings-by-name", e.g, for n=2:
>
> nam | co | num
> ----------+--------+-----
> joe | cisco | 1200
> joe | ibm | 600
> abe | ibm | 1500
> abe | cisco | 800
>
> I can get part of the way by using a LIMIT clause in a subquery, e.g,
>
> => select 'abe', a.co, a.num from (select co, num from t1 where
> nam='abe' order by num desc limit 2) as a;
> ?column? | co | num
> ----------+-------+------
> abe | ibm | 1500
> abe | cisco | 800
>
> but I can't figure out a correlated subquery (or GROUP BY arrangement or
> anything else) that will cycle through the names. I vaguely remember
> that these kinds or queries are hard to do in standard SQL, but I was
> hoping that PG, with its extensions...
I forgot about row subqueries; for n=3, for example:
=> SELECT * FROM t1
WHERE (nam,co,num) IN
(SELECT nam,co,num FROM t1 b
where b.nam=t1.nam
order by num desc limit 3)
order by nam, num desc;
nam | co | num
-----+--------+------
abe | ibm | 1500
abe | cisco | 800
joe | cisco | 1200
joe | ibm | 600
joe | novell | 500
(5 rows)
Seems to work...
Thanks all, Rich Cullingford
rculling@sysd.com
In article <3FB8F98D.1010707@sysd.com>, Rich Cullingford <rculling@sysd.com> writes: > All, > This is a straight SQL question, maybe not appropriate for a performance > list, but... > I have a simple stock holdings setup: > => select * from t1; > nam | co | num > -----+-----------+------ > joe | ibm | 600 > abe | ibm | 1500 > joe | cisco | 1200 > abe | cisco | 800 > joe | novell | 500 > joe | microsoft | 200 > What I would like to see is a Top-n-holdings-by-name", e.g, for n=2: > nam | co | num > ----------+--------+----- > joe | cisco | 1200 > joe | ibm | 600 > abe | ibm | 1500 > abe | cisco | 800 > I can get part of the way by using a LIMIT clause in a subquery, e.g, > => select 'abe', a.co, a.num from (select co, num from t1 where > nam='abe' order by num desc limit 2) as a; > ?column? | co | num > ----------+-------+------ > abe | ibm | 1500 > abe | cisco | 800 > but I can't figure out a correlated subquery (or GROUP BY arrangement or > anything else) that will cycle through the names. I vaguely remember > that these kinds or queries are hard to do in standard SQL, but I was > hoping that PG, with its extensions... How about an outer join? SELECT x1.nam, x1.co, x1.num FROM t1 x1 LEFT JOIN t1 x2 ON x2.nam = x1.nam AND x2.num > x1.num GROUP BY x1.nam, x1.co, x1.num HAVING count(*) < 2 ORDER BY x1.nam, x1.num DESC