Обсуждение: Matching several rows
Hi,<br /><br />This is possibly absolutely trivial but I am lost...<br /><br />A table URIGHTS which stores an ID and theRIGHTs this ID has. One ID may have many rights and accordingly records in table, sample:<br /><br />ID RIGHT <br/>---------------------<br />20 1<br />20 2<br />20 5<br />20 10<br />30 2<br />30 10<br /><br />Now I need to find out which IDs have, say rights 2 AND 5 AND 10.<br /><br />What would be the simplestquery? <br /><br />Thanks!
On Jan 18, 2006, at 13:43 , Ivan Steganov wrote:
> A table URIGHTS which stores an ID and the RIGHTs this ID has. One  
> ID may have many rights and accordingly records in table, sample:
> Now I need to find out which IDs have, say rights 2 AND 5 AND 10.
select "ID"
from "URIGHTS" as right_2
join "URIGHTS" as right_5 using ("ID")
join "URIGHTS" as right_10 using ("ID")
where right_2."RIGHT" = 2and right_5."RIGHT" = 5and right_10."RIGHT" = 10
 or
select "ID"
from (select "ID"from "URIGHTS"where "RIGHT" = 2) as right_2
join (select "ID"from "URIGHTS"where "RIGHT" = 5) as right_5 using ("ID")
join (select "ID"from "URIGHTS"where "RIGHT" = 10) as right_10 using ("ID")
Simple is in the eye of the beholder. You might want to compare the  
EXPLAIN ANALYZE output to see if there are any significant  
differences between these queries.
Michael Glaesemann
grzm myrealbox com
			
		Hi, On Jan 18 05:43, Ivan Steganov wrote: > ID RIGHT > --------------------- > 20 1 > 20 2 > 20 5 > 20 10 > 30 2 > 30 10 > > Now I need to find out which IDs have, say rights 2 AND 5 AND 10. SELECT t.id FROM (SELECT id, sum(1) AS s FROM id_n_rights WHERE rights = ANY(ARRAY[2,5,10]) GROUP BY id) AS t WHERE t.s = 3; -- 3 = length(ARRAY[2,5,10]) HTH. Regards.
On Jan 18, 2006, at 20:55 , Volkan YAZICI wrote:
> On Jan 18 05:43, Ivan Steganov wrote:
>> ID        RIGHT
>> ---------------------
>> 20        1
>> 20        2
>> 20        5
>> 20        10
>> 30        2
>> 30        10
>>
>> Now I need to find out which IDs have, say rights 2 AND 5 AND 10.
>
> SELECT t.id
> FROM (SELECT id, sum(1) AS s
>       FROM id_n_rights
>       WHERE rights = ANY(ARRAY[2,5,10])
>       GROUP BY id) AS t
> WHERE t.s = 3;        -- 3 = length(ARRAY[2,5,10])
Or how about just
select id
from urights
where uright in (2,5,10)
group by id
having count(id) = 3; -- number of urights in IN clause
Find below the query plans for the variations we've seen (as well as  
DDL statements):
Michael Glaesemann
grzm myrealbox com
create table urights
(    id integer not null    , uright integer not null
);
copy urights (id, uright) from stdin;
20    1
20    2
20    5
20    10
30    2
30    10
\.
select id
from urights as right_2
join urights as right_5 using (id)
join urights as right_10 using (id)
where right_2.uright = 2    and right_5.uright = 5    and right_10.uright = 10;
           QUERY PLAN
 
------------------------------------------------------------------------ 
---------------------------------------------
Nested Loop  (cost=0.00..3.27 rows=1 width=4) (actual  
time=0.053..0.069 rows=1 loops=1)   Join Filter: ("outer".id = "inner".id)   ->  Nested Loop  (cost=0.00..2.17 rows=1
width=8)(actual  
 
time=0.041..0.052 rows=1 loops=1)         Join Filter: ("inner".id = "outer".id)         ->  Seq Scan on urights
right_5 (cost=0.00..1.07 rows=1  
 
width=4) (actual time=0.023..0.027 rows=1 loops=1)               Filter: (uright = 5)         ->  Seq Scan on urights
right_2 (cost=0.00..1.07 rows=2  
 
width=4) (actual time=0.005..0.011 rows=2 loops=1)               Filter: (uright = 2)   ->  Seq Scan on urights
right_10 (cost=0.00..1.07 rows=2  
 
width=4) (actual time=0.007..0.011 rows=2 loops=1)         Filter: (uright = 10)
Total runtime: 0.337 ms
(11 rows)
select id
from (    select id    from urights    where uright = 2    ) as right_2
join (    select id    from urights    where uright = 5    ) as right_5 using (id)
join (    select id    from urights    where uright = 10    ) as right_10 using (id);
             QUERY PLAN
 
------------------------------------------------------------------------ 
-------------------------------------
Nested Loop  (cost=0.00..3.27 rows=1 width=4) (actual  
time=0.062..0.079 rows=1 loops=1)   Join Filter: ("outer".id = "inner".id)   ->  Nested Loop  (cost=0.00..2.17 rows=1
width=8)(actual  
 
time=0.048..0.060 rows=1 loops=1)         Join Filter: ("inner".id = "outer".id)         ->  Seq Scan on urights
(cost=0.00..1.07rows=1 width=4)  
 
(actual time=0.031..0.036 rows=1 loops=1)               Filter: (uright = 5)         ->  Seq Scan on urights
(cost=0.00..1.07rows=2 width=4)  
 
(actual time=0.005..0.011 rows=2 loops=1)               Filter: (uright = 2)   ->  Seq Scan on urights
(cost=0.00..1.07rows=2 width=4) (actual  
 
time=0.006..0.011 rows=2 loops=1)         Filter: (uright = 10)
Total runtime: 0.299 ms
(11 rows)
SELECT t.id
FROM (SELECT id, sum(1) AS s    FROM urights    WHERE uright = ANY(ARRAY[2,5,10])    GROUP BY id) AS t
WHERE t.s = 3;        -- 3 = length(ARRAY[2,5,10])                                                 QUERY PLAN
------------------------------------------------------------------------ 
-------------------------------------
Subquery Scan t  (cost=1.23..1.26 rows=1 width=4) (actual  
time=0.102..0.106 rows=1 loops=1)   ->  HashAggregate  (cost=1.23..1.25 rows=1 width=4) (actual  
time=0.095..0.098 rows=1 loops=1)         Filter: (sum(1) = 3)         ->  Seq Scan on urights  (cost=0.00..1.21 rows=3
width=4) 
 
(actual time=0.038..0.052 rows=5 loops=1)               Filter: (uright = ANY ('{2,5,10}'::integer[]))
Total runtime: 2.521 ms
(6 rows)
select id
from urights
where uright in (2,5,10)
group by id
having count(id) = 3;                                              QUERY PLAN
------------------------------------------------------------------------ 
-------------------------------
HashAggregate  (cost=1.12..1.15 rows=2 width=4) (actual  
time=2.582..2.584 rows=1 loops=1)   Filter: (count(id) = 3)   ->  Seq Scan on urights  (cost=0.00..1.10 rows=4 width=4)
(actual 
 
time=0.042..2.502 rows=5 loops=1)         Filter: ((uright = 2) OR (uright = 5) OR (uright = 10))
Total runtime: 2.944 ms
(5 rows)
			
		On Jan 18 09:33, Michael Glaesemann wrote: > On Jan 18, 2006, at 20:55 , Volkan YAZICI wrote: > >SELECT t.id > >FROM (SELECT id, sum(1) AS s > > FROM id_n_rights > > WHERE rights = ANY(ARRAY[2,5,10]) > > GROUP BY id) AS t > >WHERE t.s = 3; AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage. Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with «rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain. Regards.
On Jan 18, 2006, at 21:48 , Volkan YAZICI wrote: > AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage. > Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with > «rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain. explain analyze SELECT id FROM ( SELECT id, sum(1) AS s FROM urights WHERE uright in (2,5,10) GROUP BY id) AS t WHERE s = 3; QUERY PLAN ------------------------------------------------------------------------ ------------------------------------- Subquery Scan t (cost=1.14..1.19 rows=2 width=4) (actual time=0.106..0.108 rows=1 loops=1) -> HashAggregate (cost=1.14..1.17 rows=2 width=4) (actual time=0.103..0.105 rows=1 loops=1) Filter: (sum(1) = 3) -> Seq Scan on urights (cost=0.00..1.10 rows=4 width=4) (actual time=0.029..0.038 rows=5 loops=1) Filter: ((uright = 2) OR (uright = 5) OR (uright = 10)) Total runtime: 0.386 ms (6 rows) Michael Glaesemann grzm myrealbox com
Thank you to everyone for the great help!<br /><br />I will evaluate all methods in our query (It is actually well complexerthen this sample) and choose the best one.<br /><br />Is there any "scientific" name to this kind of "several rowsmatch for one result" data selection? <br /><br />Ivan
Michael Glaesemann wrote: > > On Jan 18, 2006, at 21:48 , Volkan YAZICI wrote: > >> AFAICS, the bottleneck in above query is ANY(ARRAY[]) clause usage. >> Instead of that, if you replace «rights = ANY(ARRAY[2,5,10])» with >> «rights IN (2,5,10)» it's overhead decreases to 0.200-0.300ms domain. > > > explain analyze > SELECT id > FROM ( > SELECT id, sum(1) AS s > FROM urights > WHERE uright in (2,5,10) > GROUP BY id) AS t > WHERE s = 3; > QUERY PLAN > ------------------------------------------------------------------------ > ------------------------------------- > Subquery Scan t (cost=1.14..1.19 rows=2 width=4) (actual > time=0.106..0.108 rows=1 loops=1) > -> HashAggregate (cost=1.14..1.17 rows=2 width=4) (actual > time=0.103..0.105 rows=1 loops=1) > Filter: (sum(1) = 3) > -> Seq Scan on urights (cost=0.00..1.10 rows=4 width=4) > (actual time=0.029..0.038 rows=5 loops=1) > Filter: ((uright = 2) OR (uright = 5) OR (uright = 10)) > Total runtime: 0.386 ms > (6 rows) > > > Michael Glaesemann > grzm myrealbox com > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > why not use an having clause in the GROUP BY? HTH
It is easier to think of this as SET INTERSECTION which leads to: SELECT id FROM urights WHERE right = 2 INTERSECT SELECT id FROM urights WHERE right = 5 INTERSECT SELECT id FROM urights WHERE right = 10 Ivan Steganov wrote: > Thank you to everyone for the great help! > > I will evaluate all methods in our query (It is actually well complexer then > this sample) and choose the best one. > > Is there any "scientific" name to this kind of "several rows match for one > result" data selection? > > Ivan >