Обсуждение: counting distinct values

Поиск
Список
Период
Сортировка

counting distinct values

От
Joseph Shraibman
Дата:
Using the example from
http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I
do if I wanted to know the number of different cities where I had a
friend in each state?  select count(city) group by state; would not work
because if you had two friends in the same city it would be counted
twice.


Re: counting distinct values

От
Bruce Momjian
Дата:
> Using the example from
> http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I
> do if I wanted to know the number of different cities where I had a
> friend in each state?  select count(city) group by state; would not work
> because if you had two friends in the same city it would be counted
> twice.
> 

How about?
select city, state, count(*) group by state, city;


--  Bruce Momjian                        |  http://www.op.net/~candle pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: counting distinct values

От
Joseph Shraibman
Дата:
Bruce Momjian wrote:
> 
> > Using the example from
> > http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I
> > do if I wanted to know the number of different cities where I had a
> > friend in each state?  select count(city) group by state; would not work
> > because if you had two friends in the same city it would be counted
> > twice.
> >
> 
> How about?
> 
>         select city, state, count(*) group by state, city;

OK that doesn't work for me because I want to use other agregates on the
state level.
Forgetting the state/city example.

playpen=> create table tablem ( a int, b int, n int);
CREATE
playpen=> insert into tablem (a, b, n) values (1, 2, 1);
INSERT 35197 1
playpen=> insert into tablem (a, b, n) values (1, 2, 2);
INSERT 35198 1
playpen=> insert into tablem (a, b, n) values (1, 3, 1);
INSERT 35199 1
playpen=> insert into tablem (a, b, n) values (1, 4, 3);
INSERT 35296 1
playpen=> insert into tablem (a, b, n) values (1, 1, 4);
INSERT 35297 1
playpen=> insert into tablem (a, b, n) values (2, 3, 3);
INSERT 35298 1
playpen=> insert into tablem (a, b, n) values (2, 5, 7);
INSERT 35299 1
playpen=> insert into tablem (a, b, n) values (2, 3, 3);
INSERT 35300 1
playpen=> insert into tablem (a, b, n) values (2, 3, 1);
INSERT 35301 1
playpen=> select a, count(*), avg(n), sum(n) from tablem group by a;
a|count|avg|sum
-+-----+---+---
1|    5|  2| 11
2|    4|  3| 14
(2 rows)


... now suppose I want to have the number of distictive b's in the
results as well. I try:

playpen=> select a, count(m.oid), avg(m.n), sum(m.n), (select
count(mm.oid) from tablem mm where m.a = mm.a group by mm.b ) from
tablem m group by a;
ERROR:  More than one tuple returned by a subselect used as an
expression.
playpen=> 

... even though the subselect should only return one tuple.


Re: counting distinct values

От
Tom Lane
Дата:
Joseph Shraibman <jks@selectacast.net> writes:
>>>> Using the example from
>>>> http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I
>>>> do if I wanted to know the number of different cities where I had a
>>>> friend in each state?  select count(city) group by state; would not work
>>>> because if you had two friends in the same city it would be counted
>>>> twice.

Er, what's wrong with select count(distinct city) group by state?

> ... now suppose I want to have the number of distictive b's in the
> results as well. I try:

> playpen=> select a, count(m.oid), avg(m.n), sum(m.n), (select
> count(mm.oid) from tablem mm where m.a = mm.a group by mm.b ) from
> tablem m group by a;
> ERROR:  More than one tuple returned by a subselect used as an
> expression.
> playpen=> 

> ... even though the subselect should only return one tuple.

Not unless there's only one b value for any one a value --- otherwise
the sub-select will return one row per b group.  The error message looks
correct to me.
        regards, tom lane


Re: counting distinct values

От
Joseph Shraibman
Дата:
Tom Lane wrote:
> 
> Joseph Shraibman <jks@selectacast.net> writes:
> >>>> Using the example from
> >>>> http://www.postgresql.org/docs/aw_pgsql_book/node59.html, what would I
> >>>> do if I wanted to know the number of different cities where I had a
> >>>> friend in each state?  select count(city) group by state; would not work
> >>>> because if you had two friends in the same city it would be counted
> >>>> twice.
> 
> Er, what's wrong with select count(distinct city) group by state?

I thought I tried that, but it didn't work the first time. <shrug> Guess
I mistyped something. Sorry.
> 
> > ... now suppose I want to have the number of distictive b's in the
> > results as well. I try:
> 
> > playpen=> select a, count(m.oid), avg(m.n), sum(m.n), (select
> > count(mm.oid) from tablem mm where m.a = mm.a group by mm.b ) from
> > tablem m group by a;
> > ERROR:  More than one tuple returned by a subselect used as an
> > expression.
> > playpen=>
> 
> > ... even though the subselect should only return one tuple.
> 
> Not unless there's only one b value for any one a value --- otherwise
> the sub-select will return one row per b group.  The error message looks
> correct to me.
> 
OK I thought that group by was eleminating duplicates which would then
be counted. That evolved from q simple select(*) ... group by that
worked.  Don't know where I screwed it up.