Re: Non Matching Records in Two Tables
| От | Ken Hill | 
|---|---|
| Тема | Re: Non Matching Records in Two Tables | 
| Дата | |
| Msg-id | 1139436823.11150.3.camel@localhost.localdomain обсуждение исходный текст | 
| Ответ на | Re: Non Matching Records in Two Tables (Frank Bax <fbax@sympatico.ca>) | 
| Список | pgsql-sql | 
On Wed, 2006-02-08 at 16:27 -0500, Frank Bax wrote: <blockquote type="CITE"><pre>
<font color="#000000">At 04:10 PM 2/8/06, Ken Hill wrote:</font>
<font color="#000000">>I need some help with a bit of SQL. I have two tables. I want to find </font>
<font color="#000000">>records in one table that don't match records in another table based on a </font>
<font color="#000000">>common column in the two tables. Both tables have a column named 'key100'. </font>
<font color="#000000">>I was trying something like:</font>
<font color="#000000">></font>
<font color="#000000">>SELECT count(*)</font>
<font color="#000000">>FROM table1, table2</font>
<font color="#000000">>WHERE (table1.key100 != table2.key100);</font>
<font color="#000000">></font>
<font color="#000000">>But the query is very slow and I finally just cancel it. Any help is very </font>
<font color="#000000">>much appreciated.</font>
<font color="#000000">vacuum analyse table1;</font>
<font color="#000000">vacuum analyse table2;</font>
<font color="#000000">select count(*) from table1 full outer join table2 on </font>
<font color="#000000">table1.key100=table2.key100 where table1.key100 is null or table2.key100 is </font>
<font color="#000000">null;</font>
<font color="#000000">If this is also slow, post output of "EXPLAIN ANALYSE SELECT ...." </font>
<font color="#000000">---------------------------(end of broadcast)---------------------------</font>
<font color="#000000">TIP 1: if posting/reading through Usenet, please send an appropriate</font>
<font color="#000000">       subscribe-nomail command to <a
href="mailto:majordomo@postgresql.org">majordomo@postgresql.org</a>so that your</font>
 
<font color="#000000">       message can get through to the mailing list c</font>leanly
</pre></blockquote><br /> OK. I added indexes on the two columns in the two tables:<br /><br /> CREATE INDEX
key100_idex<br/> ON ncccr9 (key100);<br /><br /> CREATE INDEX key100_ncccr10_idex<br /> ON ncccr10 (key100);<br /><br
/>Here is the analysis of the query:<br /><br /> csalgorithm=# EXPLAIN ANALYSE SELECT count(*)<br /> csalgorithm-#  
FROMncccr9 LEFT JOIN ncccr10 ON ncccr9.key100 = ncccr10.key100<br /> csalgorithm-#   WHERE ncccr10.key100 IS NULL;<br
/>                                                               QUERY PLAN <br />
--------------------------------------------------------------------------------
--------------------------------------------------------<br/> Aggregate  (cost=208337.59..208337.59 rows=1 width=0)
(actualtime=255723.212.. 255723.214 rows=1 loops=1)<br />    ->  Hash Left Join  (cost=99523.55..207101.41
rows=494471width=0) (actual ti me=92326.635..255538.447 rows=38144 loops=1)<br />          Hash Cond: ("outer".key100 =
"inner".key100)<br/>          Filter: ("inner".key100 IS NULL)<br />          ->  Seq Scan on ncccr9 
(cost=0.00..59360.71rows=494471 width=104) (ac tual time=171.778..75099.734 rows=494471 loops=1)<br />          -> 
Hash (cost=88438.64..88438.64 rows=611564 width=104) (actual time= 91962.956..91962.956 rows=0 loops=1)<br />
              ->  Seq Scan on ncccr10  (cost=0.00..88438.64 rows=611564 width=1 04) (actual time=11.704..76519.323
rows=611564loops=1)<br /> Total runtime: 255724.219 ms<br /> (8 rows)<br /><br /> The result of 38,144 non-matching
recordsseems too much:<br /><br /> csalgorithm=# SELECT count(*)<br /> csalgorithm-#   FROM ncccr9 LEFT JOIN ncccr10 ON
ncccr9.key100= ncccr10.key100<br /> csalgorithm-#   WHERE ncccr10.key100 IS NULL;<br /> count<br /> -------<br />
38144<br/> (1 row)<br /><br /> Maybe I need to do a RIGHT JOIN to return the count of records in table 'ncccr10' that
don'tmatch records in 'ncccr9'? <br /><br /> Thanks for your help. JOINS are fairly new to me.<br /><br /> 
		
	В списке pgsql-sql по дате отправления: