Обсуждение: Selecting records not present in related tables

Поиск
Список
Период
Сортировка

Selecting records not present in related tables

От
Hector Rosas
Дата:
Hello, I'm trying to select records in a table not present in a related table, in example, I've a table with message
information(subject, message, date, etc) and another (usermessages) with where user(s) has that message, its state,
etc.Records in this table will be deleted in a certain time (just some extra info). <br />I want to select messages
recordsthat aren't present in the other table (usermessages), I got the next two queries, maybe someone can suggest a
betterone.<br /><br /><span style="font-family: courier new,monospace;">SELECT <a href="http://m.id">m.id</a> FROM
messagesAS m</span><br style="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;">WHERE(SELECT count(um.*) FROM usermessages AS um WHERE um.idmessage=m.id )=0;</span><br /><br /><span
style="font-family:courier new,monospace;">SELECT <a href="http://m.id">m.id</a> FROM messages AS m where id NOT IN
(selectum.idmessage FROM usermessages um);</span><br /><br />Both queries work, but doing a <span style="font-family:
couriernew,monospace;">EXPLAIN ANALYZE</span> I got the next results.<br /><br /><span style="font-family: courier
new,monospace;">bd=#explain analyze SELECT <a href="http://m.id">m.id</a> FROM messages AS m <br />bd-# WHERE (SELECT
count(um.*)FROM usermessages AS um WHERE um.idmessage=m.id)=0;</span><br style="font-family: courier new,monospace;"
/><spanstyle="font-family: courier new,monospace;">                                                                   
QUERY
PLAN                                                                                                                                       
</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;">---------------------------------------------------------------------------------------------------------------------------------------------------
</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> Seq Scan on
messagesm  (cost=0.00..3915.75 rows=3 width=4) (actual time=40.531..40.531 rows=0 loops=1)</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;">   Filter: ((subplan) =
0)</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;">  
SubPlan</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;">    
-> Aggregate  (cost=9.11..9.11 rows=1 width=4) (actual time=0.098..0.104 rows=1 loops=355)</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;">            ->  Index
Scanusing message_selection on usermessages um  (cost=0.00..9.10 rows=3 width=4) (actual time=0.067..0.078 rows=1
loops=355)</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;">
                Index Cond: (idmessage = $0)</span><br style="font-family: courier new,monospace;" /><span
style="font-family:courier new,monospace;"> Total runtime: 40.605 ms</span><br style="font-family: courier
new,monospace;"/><span style="font-family: courier new,monospace;">(7 rows)</span><br style="font-family: courier
new,monospace;"/><span style="font-family: courier new,monospace;"> </span><br style="font-family: courier
new,monospace;"/><span style="font-family: courier new,monospace;"> bd=# explain analyze select <a
href="http://m.id">m.id</a>FROM messages AS m where id NOT IN (select um.idmessage FROM usermessages um);</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;">
                                                     QUERY PLAN</span><br style="font-family: courier new,monospace;"
/><spanstyle="font-family: courier
new,monospace;">----------------------------------------------------------------------------------------------------------------------
</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;"> Seq Scan on
messagesm  (cost=9.68..43.00 rows=213 width=4) (actual time=20.329..20.329 rows=0 loops=1)</span><br
style="font-family:courier new,monospace;" /><span style="font-family: courier new,monospace;">   Filter: (NOT (hashed
subplan))</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;">  
SubPlan</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier new,monospace;">    
-> Seq Scan on usermessages um  (cost=0.00..8.54 rows=454 width=4) (actual time=0.008..13.094 rows=454
loops=1)</span><brstyle="font-family: courier new,monospace;" /><span style="font-family: courier
new,monospace;"> Totalruntime: 20.386 ms</span><br style="font-family: courier new,monospace;" /><span
style="font-family:courier new,monospace;">(5 rows)</span><br style="font-family: courier new,monospace;" /><br />In
firstquery, cost can be between 0 and almost 4 sec, and also I see that loops value, that I don't know what performance
issuescould arise.<br />In second query, I see a seq scan, which I don't like, I think that with too many records this
querycould take ages, or maybe not, but loops value is 1. <br /><br />I hope someone can give some advice with those
queries, or maybe a better query. I've not decided which query I'm going to use, thanks!<br /><br />Jeziel.<br /> 

Re: Selecting records not present in related tables

От
Frank Bax
Дата:
At 03:43 PM 10/6/05, Hector Rosas wrote:

>Hello, I'm trying to select records in a table not present in a related 
>table, in example, I've a table with message information (subject, 
>message, date, etc) and another (usermessages) with where user(s) has that 
>message, its state, etc. Records in this table will be deleted in a 
>certain time (just some extra info).
>I want to select messages records that aren't present in the other table 
>(usermessages), I got the next two queries, maybe someone can suggest a 
>better one.
>
>SELECT <http://m.id>m.id FROM messages AS m
>WHERE (SELECT count(um.*) FROM usermessages AS um WHERE um.idmessage=m.id )=0;
>
>SELECT <http://m.id>m.id FROM messages AS m where id NOT IN (select 
>um.idmessage FROM usermessages um);


select m.id from messages as m left join usermessages as um on 
m.id=um.idmessage where um.idmessage is null;




Re: Selecting records not present in related tables

От
Scott Marlowe
Дата:
On Thu, 2005-10-06 at 14:43, Hector Rosas wrote:
> Hello, I'm trying to select records in a table not present in a
> related table, in example, I've a table with message information
> (subject, message, date, etc) and another (usermessages) with where
> user(s) has that message, its state, etc. Records in this table will
> be deleted in a certain time (just some extra info). 
> I want to select messages records that aren't present in the other
> table (usermessages), I got the next two queries, maybe someone can
> suggest a better one.

A fairly common way to do this is to use a left join and a not null:

select a.id  from tablea a left join tableb b on (a.id=b._aid) where
b._aid IS NULL


Re: Selecting records not present in related tables

От
"Anthony Molinaro"
Дата:

Jeziel,

  there are a couple techniques you can try, two I like are set difference and anti-joins.

 

here’s the set diff:

 

select id

  from messages

except

select id

  from usermessages

 

that will returns all id from messages not in usermessages

 

if ID is indexed on both tables, you may wanna try an anti join:

 

select m.id

  from messages m

       left join

       usermessages um

    on ( m.id = um.id )

 where um.id is null

 

 

both techniques can be visciously efficient.

 

good luck,

  Anthony  

          

 

-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Hector Rosas
Sent: Thursday, October 06, 2005 3:44 PM
To: pgsql-sql@postgresql.org
Subject: [SQL] Selecting records not present in related tables

 

Hello, I'm trying to select records in a table not present in a related table, in example, I've a table with message information (subject, message, date, etc) and another (usermessages) with where user(s) has that message, its state, etc. Records in this table will be deleted in a certain time (just some extra info).
I want to select messages records that aren't present in the other table (usermessages), I got the next two queries, maybe someone can suggest a better one.

SELECT m.id FROM messages AS m
WHERE (SELECT count(um.*) FROM usermessages AS um WHERE um.idmessage=m.id )=0;


SELECT m.id FROM messages AS m where id NOT IN (select um.idmessage FROM usermessages um);

Both queries work, but doing a EXPLAIN ANALYZE I got the next results.

bd=# explain analyze SELECT m.id FROM messages AS m
bd-# WHERE (SELECT count(um.*) FROM usermessages AS um WHERE um.idmessage=m.id)=0;
                                                                    QUERY PLAN                                                                                                                                       
---------------------------------------------------------------------------------------------------------------------------------------------------
 Seq Scan on messages m  (cost=0.00..3915.75 rows=3 width=4) (actual time=40.531..40.531 rows=0 loops=1)
   Filter: ((subplan) = 0)
   SubPlan
     ->  Aggregate  (cost=9.11..9.11 rows=1 width=4) (actual time=0.098..0.104 rows=1 loops=355)
           ->  Index Scan using message_selection on usermessages um  (cost=0.00..9.10 rows=3 width=4) (actual time=0.067..0.078 rows=1 loops=355)
                 Index Cond: (idmessage = $0)
 Total runtime: 40.605 ms
(7 rows)
 
bd=# explain analyze select m.id FROM messages AS m where id NOT IN (select um.idmessage FROM usermessages um);
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Seq Scan on messages m  (cost=9.68..43.00 rows=213 width=4) (actual time=20.329..20.329 rows=0 loops=1)
   Filter: (NOT (hashed subplan))
   SubPlan
     ->  Seq Scan on usermessages um  (cost=0.00..8.54 rows=454 width=4) (actual time=0.008..13.094 rows=454 loops=1)
 Total runtime: 20.386 ms
(5 rows)

In first query, cost can be between 0 and almost 4 sec, and also I see that loops value, that I don't know what performance issues could arise.
In second query, I see a seq scan, which I don't like, I think that with too many records this query could take ages, or maybe not, but loops value is 1.

I hope someone can give some advice with those queries , or maybe a better query. I've not decided which query I'm going to use, thanks!

Jeziel.