Exclusion List

Поиск
Список
Период
Сортировка
От Michael Richards
Тема Exclusion List
Дата
Msg-id 3B68202C.0001A3.93146@frodo.searchcanada.ca
обсуждение исходный текст
Ответы Re: Exclusion List
Список pgsql-sql
I've got 2 tables, one with a list of users (has only about 5000 
entries) and another with a list of userids that have already been 
sent letters. I'm trying to efficiently join these two so I get every 
user who hasn't been sent a letter. The problem is, coupled with the 
5 other joins on the user table the query can take about 5 minutes to 
run. It's performing the joins and then reducing the list. 

The reduction of the list doesn't seem to be terribly efficient. Here 
are some strategies I've been looking at:

select id from users WHERE  id not in (select userid from sentletters where lettertype=1) AND aclgroup IN (1,2);
              
 

Seq Scan on users  (cost=0.00..217751.39 rows=5369 width=4) SubPlan   ->  Seq Scan on sentletters  (cost=0.00..81.06
rows=4405width=4)
 

select id from users WHERE not exists  (select userid from sentletters where lettertype=1 AND userid=id)  AND aclgroup
IN(1,2);
 

Seq Scan on users  (cost=0.00..10980.07 rows=1 width=4) SubPlan   ->  Index Scan using sentletters_userid_key on
sentletters 
 
(cost=0.00..2.02 rows=1 width=4)

select id from users AS u  LEFT JOIN sentletters AS sl ON (lettertype=1 AND sl.userid=u.id) 
where sl.userid IS NULL AND u.aclgroup IN (1,2);

Hash Join  (cost=81.06..550.18 rows=5322 width=12) ->  Seq Scan on users u  (cost=0.00..152.53 rows=5322 width=4) ->
Hash (cost=70.05..70.05 rows=4405 width=8)       ->  Seq Scan on sentletters sl  (cost=0.00..70.05 rows=4405 
 
width=8)

All but the last which is an odd way to do it have nasty query plan 
and this isn't even talking about the joins.

I then need to join these results with a table that has about 200,000 
rows and is joined on the userid and some other key elements.

Any suggestions on this? The full query causing the problem is:
select u.id,u.firstname,u.surname,f2.strval,f3.strval,f4.strval,      f5.strval,u2.firstname,u2.surname,m.strval 
from  users as u  JOIN dft_formdata as f1 ON    (u.id=f1.userid AND u.aclgroup IN (1,2) AND f1.formid=1 AND
f1.fieldid=1) LEFT JOIN dft_formdata as f2 ON    (u.id=f2.userid AND f2.formid=1 AND f2.fieldid=2)  JOIN dft_formdata
asf3 on    (u.id=f3.userid AND f3.formid=1 AND f3.fieldid=3)  JOIN dft_formdata as f4 on    (u.id=f4.userid AND
f4.formid=1AND f4.fieldid=5)  JOIN relations as r on    (u.id=r.relatedto AND r.type=2)  JOIN users as u2 on
(u2.id=r.useridAND u2.aclgroup=200)  JOIN dft_formdata as f5 on    (u.id=f5.userid AND f5.formid=1 AND f5.fieldid=4)
JOINdft_formmvl as m on    (m.id=f5.intval and m.mvlid=1)  JOIN ft_formdata as f6 on    (u.id=f6.userid AND f6.formid=1
ANDf6.fieldid=155 AND     f6.intval=3)  WHERE not exists    (select userid from sentletters WHERE userid=u.id);


Nested Loop  (cost=0.00..11280.10 rows=1 width=164)->  Nested Loop  (cost=0.00..11276.36 rows=1 width=160) ->  Nested
Loop (cost=0.00..11274.33 rows=1 width=144)  ->  Nested Loop  (cost=0.00..11270.59 rows=1 width=124)   ->  Nested Loop
(cost=0.00..11268.56rows=1 width=96)    ->  Nested Loop  (cost=0.00..10981.55 rows=1 width=88)     ->  Nested Loop
(cost=0.00..10977.82rows=1 width=72)      ->  Nested Loop  (cost=0.00..10974.10 rows=1 width=56)       ->  Nested Loop
(cost=0.00..10970.37rows=1 width=32)        ->  Seq Scan on users u  (cost=0.00..10966.65 rows=1 
 
width=28)         SubPlan          ->  Index Scan using sentletters_userid_key on 
sentletters  (cost=0.00..2.01 rows=1 width=4)        ->  Index Scan using dft_formdata_userid_field on 
dft_formdata f1  (cost=0.00..3.71 rows=1 width=4)       ->  Index Scan using dft_formdata_userid_field on 
dft_formdata f2  (cost=0.00..3.71 rows=1 width=24)      ->  Index Scan using dft_formdata_userid_field on dft_formdata

f3  (cost=0.00..3.71 rows=1 width=16)     ->  Index Scan using dft_formdata_userid_field on dft_formdata 
f4  (cost=0.00..3.71 rows=1 width=16)    ->  Seq Scan on relations r  (cost=0.00..185.43 rows=5079 
width=8)   ->  Index Scan using users_pkey on users u2  (cost=0.00..2.02 
rows=1 width=28)  ->  Index Scan using dft_formdata_userid_field on dft_formdata f5  
(cost=0.00..3.71 rows=1 width=20) ->  Index Scan using dft_formmvl_pkey on dft_formmvl m  
(cost=0.00..2.02 rows=1 width=16)->  Index Scan using dft_formdata_userid_field on dft_formdata f6  
(cost=0.00..3.71 rows=1 width=4)


Yes I know the query itself is really nasty but I think 5 minutes is 
a little excessive.

-Michael
_________________________________________________________________    http://fastmail.ca/ - Fast Free Web Email for
Canadians

В списке pgsql-sql по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Converting epoch to timestamp?
Следующее
От: "Josh Berkus"
Дата:
Сообщение: Re: Converting epoch to timestamp?