Обсуждение: is a 'pairwise' possible / feasible in SQL?
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi, I have a table of the form
aid cid
- ---- -----
1 123
2 456
3 667
3 879
3 123
4 878
4 456
4 123
5 999
5 667
5 879
My goal is to identify for each pair of cid values, the number of
times they have the same aid
Thus for example I would have
pair count
- ---- -----
123 & 456 1
667 & 879 2
...
I currently do this by using a Python script to do a pairwise lookup, as
select count(aid) where cid = 123 and cid = 456;
but I was wondering whether I could construct a single SQL statement
to do this.
Any pointers would be appreciated,
Thanks,
- -------------------------------------------------------------------
Rajarshi Guha <rguha@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84
- -------------------------------------------------------------------
All great discoveries are made by mistake.
-- Young
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
iEYEARECAAYFAkiXRYUACgkQZqGSLFHnnoTJJQCgtvromGcYfQVGsekGFQJU6vTo
oHgAnjpfKSkZR0MqBjdE6WFGO8SBr2WH
=zZJk
-----END PGP SIGNATURE-----
On Mon, Aug 4, 2008 at 2:08 PM, Rajarshi Guha <rguha@indiana.edu> wrote: > select count(aid) where cid = 123 and cid = 456; > > but I was wondering whether I could construct a single SQL statement to do > this. > > Any pointers would be appreciated, Typed into gmail, so may need some tweaking, but something to the effect of: select count(*) from table a inner join table b on a.aid=b.bid group by a.cid,b.cid; should do the trick, I'd think... -- - David T. Wilson david.t.wilson@gmail.com
On Mon, Aug 4, 2008 at 1:02 PM, David Wilson <david.t.wilson@gmail.com> wrote: > On Mon, Aug 4, 2008 at 2:08 PM, Rajarshi Guha <rguha@indiana.edu> wrote: > >> select count(aid) where cid = 123 and cid = 456; >> >> but I was wondering whether I could construct a single SQL statement to do >> this. >> >> Any pointers would be appreciated, > > Typed into gmail, so may need some tweaking, but something to the effect of: > > select count(*) from table a inner join table b on a.aid=b.bid group > by a.cid,b.cid; > > should do the trick, I'd think... But then you need remove the dups where you got: 667 999 2 999 667 2
On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote: > pair count > - ---- ----- > 123 & 456 1 > 667 & 879 2 create temp table aic_cid ( id smallint, cid smallint ); insert into aic_cid values (1,123); insert into aic_cid values (2,456); insert into aic_cid values (3,667); insert into aic_cid values (3,879); insert into aic_cid values (3,123); insert into aic_cid values (4,878); insert into aic_cid values (4,456); insert into aic_cid values (4,123); insert into aic_cid values (5,999); insert into aic_cid values (5,667); insert into aic_cid values (5,879); select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null group by a.cid, b.cid order by a.cid; ac | bc | count -----+-----+------- 123 | 456 | 1 123 | 667 | 1 123 | 878 | 1 123 | 879 | 1 456 | 123 | 1 456 | 878 | 1 667 | 123 | 1 667 | 879 | 2 667 | 999 | 1 878 | 123 | 1 878 | 456 | 1 879 | 123 | 1 879 | 667 | 2 879 | 999 | 1 999 | 667 | 1 999 | 879 | 1 Is that what you are looking for?
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote: > On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote: >> pair count >> - ---- ----- >> 123 & 456 1 >> 667 & 879 2 > <snip> > select a.cid as ac, b.cid as bc, count(*) from aic_cid a left > outer join > aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null > group by > a.cid, b.cid order by a.cid; > ac | bc | count > -----+-----+------- > 123 | 456 | 1 > 123 | 667 | 1 > 123 | 878 | 1 > 123 | 879 | 1 > 456 | 123 | 1 > 456 | 878 | 1 > 667 | 123 | 1 > 667 | 879 | 2 > 667 | 999 | 1 > 878 | 123 | 1 > 878 | 456 | 1 > 879 | 123 | 1 > 879 | 667 | 2 > 879 | 999 | 1 > 999 | 667 | 1 > 999 | 879 | 1 > > Is that what you are looking for? Thanks a lot - this is very close. Ideally, I'd want unique pairs, so the row 879 | 999 | 1 is the same as 999 | 879 | 1 Can these duplicates be avoided? - ------------------------------------------------------------------- Rajarshi Guha <rguha@indiana.edu> GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 - ------------------------------------------------------------------- How I wish I were what I was when I wished I were what I am. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.8 (Darwin) iEYEARECAAYFAkiXbe8ACgkQZqGSLFHnnoRXPACeMcPqXG4QIf308ufnAHev9hlG EEoAoLzU5tmL1ipiUIp69N9mOvnsfrES =JOg1 -----END PGP SIGNATURE-----
On mán, 2008-08-04 at 17:00 -0400, Rajarshi Guha wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
>
> On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote:
> > On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote:
> >> pair count
> >> - ---- -----
> >> 123 & 456 1
> >> 667 & 879 2
> >
> <snip>
>
> > select a.cid as ac, b.cid as bc, count(*) from aic_cid a left
> > outer join
> > aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null
> > group by
> > a.cid, b.cid order by a.cid;
> > ac | bc | count
> > -----+-----+-------
> > 123 | 456 | 1
> > 123 | 667 | 1
> > ...
> > Is that what you are looking for?
>
> Thanks a lot - this is very close. Ideally, I'd want unique pairs, so
> the row
>
> 879 | 999 | 1
>
> is the same as
>
> 999 | 879 | 1
>
> Can these duplicates be avoided?
just add a ac<bc condition:
select a.cid as ac, b.cid as bc, count(*)
from aic_cid a left outer join aic_cid b
on a.cid <> b.cid and a.id = b.id
where b.cid is not null AND a.cid < b.cid
group by a.cid, b.cid
order by a.cid;
gnari
Rajarshi Guha wrote > > On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote: >> On 2:08 pm 08/04/08 Rajarshi Guha <rguha@indiana.edu> wrote: >>> pair count >>> - ---- ----- >>> 123 & 456 1 >>> 667 & 879 2 > > <snip> > >> select a.cid as ac, b.cid as bc, count(*) from aic_cid a left outer join >> aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null >> group by >> a.cid, b.cid order by a.cid; >> ac | bc | count >> -----+-----+------- >> 123 | 456 | 1 >> 123 | 667 | 1 >> 123 | 878 | 1 >> 123 | 879 | 1 >> 456 | 123 | 1 >> 456 | 878 | 1 >> 667 | 123 | 1 >> 667 | 879 | 2 >> 667 | 999 | 1 >> 878 | 123 | 1 >> 878 | 456 | 1 >> 879 | 123 | 1 >> 879 | 667 | 2 >> 879 | 999 | 1 >> 999 | 667 | 1 >> 999 | 879 | 1 > >> Is that what you are looking for? > > Thanks a lot - this is very close. Ideally, I'd want unique pairs, so > the row > > 879 | 999 | 1 > > is the same as > > 999 | 879 | 1 > > Can these duplicates be avoided? Depends on values and other distinguishing attributes.... For the given example - assuming pairing of a given cid with itself is not to be expected: add a "and a.cid < b.cid" to the query.... Rainer > > ------------------------------------------------------------------- > Rajarshi Guha <rguha@indiana.edu> > GPG Fingerprint: D070 5427 CC5B 7938 929C DD13 66A1 922C 51E7 9E84 > ------------------------------------------------------------------- > How I wish I were what I was when I wished I were what I am. > >
On Mon, Aug 04, 2008 at 05:00:31PM -0400, Rajarshi Guha wrote: > On Aug 4, 2008, at 4:55 PM, Francisco Reyes wrote: > > select a.cid as ac, b.cid as bc, count(*) from aic_cid a left > >outer join > >aic_cid b on a.cid <>b.cid and a.id = b.id where b.cid is not null > >group by > >a.cid, b.cid order by a.cid; > > > >Is that what you are looking for? > > Thanks a lot - this is very close. Ideally, I'd want unique pairs You just need to change the "a.cid <> b.cid" equality to something non-symmetric, i.e. "a.cid < b.cid". I'm also not sure why an outer join is being used. I've rewritten it to: SELECT a.cid AS ac, b.cid AS bc, count(*) FROM aic_cid a, aic_cid b WHERE a.id = b.id AND a.cid < b.cid GROUP BY a.cid, b.cid ORDER BY a.cid, b.cid; and seem to get similar results. Sam