Re: which is better: using OR clauses or UNION?

Поиск
Список
Период
Сортировка
От Viktor Bojović
Тема Re: which is better: using OR clauses or UNION?
Дата
Msg-id CAJu1cLacsu9KiL62=KC4dBj9dTpJwUWoaWVk9ni1qrEjq+GEww@mail.gmail.com
обсуждение исходный текст
Ответ на which is better: using OR clauses or UNION?  (adam_pgsql <adam_pgsql@witneyweb.org>)
Ответы Re: which is better: using OR clauses or UNION?  (adam_pgsql <adam_pgsql@witneyweb.org>)
Список pgsql-sql
hi Adam,<br />im not sure which is faster/slower but, possibly you can speed it up by using "in" operator<br /><br
/>...wherelower(identifier) <b>in </b>(lower('BUGS0000001884677')<b> ,</b> lower('BUGS0000001884678')<b>,</b>....);<br
/><br/>if you create function based index:<br /><pre>CREATE INDEX idx_table_lower_text ON
table(<b>lower</b>(text_field));</pre>(takenfrom: <a
href="http://archives.postgresql.org/pgsql-sql/2003-09/msg00395.php">http://archives.postgresql.org/pgsql-sql/2003-09/msg00395.php</a>)<br
/><br/><br /><br /><br /><div class="gmail_quote">On Tue, Aug 16, 2011 at 1:39 PM, adam_pgsql <span dir="ltr"><<a
href="mailto:adam_pgsql@witneyweb.org">adam_pgsql@witneyweb.org</a>></span>wrote:<br /><blockquote
class="gmail_quote"style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><br /> Hi,<br /><br /> I have
aquery hitting a table of 25 million rows. The table has a text field ('identifier') which i need to query for matching
rows.The question is if i have multiple strings to match against this field I can use multiple OR sub-statements or
multiplestatements in a UNION. The UNION seems to run quicker.... is this to be expected? or is there anything else I
cando improve the speed of this query? Some query details:<br /><br /><br /> table "dba_data_base", index:<br />
"in_dba_data_base_identifier"btree (lower(identifier) varchar_pattern_ops)<br /><br /><br /> Query 1<br /> -------<br
/>datadb=#explain analyse SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_value<br />  FROM<br />    
dba_data_basea<br />  WHERE<br />     ( lower(identifier) LIKE lower('BUGS0000001884677') OR<br />      
lower(identifier)LIKE lower('BUGS0000001884678') OR<br />       lower(identifier) LIKE lower('BUGS0000001884679') OR<br
/>      lower(identifier) LIKE lower('SpTIGR4-2210 (6F24)') )<br />  ORDER BY a.identifier;<br />                      
                                                                                      QUERY PLAN<br /><br />
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<br
/>--------------------------------<br /> Sort  (cost=14930.13..14939.77 rows=3857 width=62) (actual
time=3208.466..3208.652rows=318 loops=1)<br />  Sort Key: identifier<br />  ->  Bitmap Heap Scan on dba_data_base a
 (cost=134.43..14700.38rows=3857 width=62) (actual time=81.106..3207.721 rows=318 loops=1)<br />        Recheck Cond:
((lower(identifier)~~ 'bugs0000001884677'::text) OR (lower(identifier) ~~ 'bugs0000001884678'::text) OR
(lower(identifier)~~ 'bugs0000001884679'::text) OR (lower(identifier) ~<br /> ~ 'sptigr4-2210 (6f24)'::text))<br />    
  Filter: ((lower(identifier) ~~ 'bugs0000001884677'::text) OR (lower(identifier) ~~ 'bugs0000001884678'::text) OR
(lower(identifier)~~ 'bugs0000001884679'::text) OR (lower(identifier) ~~ 'spt<br /> igr4-2210 (6f24)'::text))<br />    
  ->  BitmapOr  (cost=134.43..134.43 rows=3857 width=0) (actual time=<a href="tel:71.397..71.397"
value="+17139771397">71.397..71.397</a>rows=0 loops=1)<br />              ->  Bitmap Index Scan on
in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.029..0.029 rows=0 loops=1)<br />        
          Index Cond: (lower(identifier) ~=~ 'bugs0000001884677'::character varying)<br />              ->  Bitmap
IndexScan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual time=0.008..0.008 rows=0
loops=1)<br/>                    Index Cond: (lower(identifier) ~=~ 'bugs0000001884678'::character varying)<br />      
      ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual
time=0.008..0.008rows=0 loops=1)<br />                    Index Cond: (lower(identifier) ~=~
'bugs0000001884679'::charactervarying)<br />              ->  Bitmap Index Scan on in_dba_data_base_identifier
 (cost=0.00..32.64rows=964 width=0) (actual time=<a href="tel:71.347..71.347" value="+17134771347">71.347..71.347</a>
rows=318loops=1)<br />                    Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character
varying)<br/> Total runtime: 3208.904 ms<br /><br /><br /> Query 2<br /> -------<br /> datadb=#explain analyse SELECT
a.bioassay_id,a.identifier, ratio, log_ratio, p_value<br />  FROM<br />     dba_data_base a<br />  WHERE<br />    
lower(identifier)LIKE lower('BUGS0000001884677')<br /> UNION<br /> SELECT a.bioassay_id, a.identifier, ratio,
log_ratio,p_value<br />  FROM<br />     dba_data_base a<br />  WHERE<br />       lower(identifier) LIKE
lower('BUGS0000001884678')<br/> UNION<br /> SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_value<br />
 FROM<br/>     dba_data_base a<br />  WHERE<br />       lower(identifier) LIKE lower('BUGS0000001884679')<br />
UNION<br/> SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_value<br />  FROM<br />     dba_data_base a<br />
 WHERE<br/>       lower(identifier) LIKE lower('SpTIGR4-2210 (6F24)')<br />  ORDER BY identifier;<br /><br /> Sort
 (cost=15702.26..15711.90rows=3856 width=62) (actual time=3.688..3.886 rows=317 loops=1)<br />  Sort Key: identifier<br
/> ->  Unique  (cost=15414.74..15472.58 rows=3856 width=62) (actual time=2.663..3.387 rows=317 loops=1)<br />      
 -> Sort  (cost=15414.74..15424.38 rows=3856 width=62) (actual time=2.660..2.834 rows=318 loops=1)<br />            
 SortKey: bioassay_id, identifier, ratio, log_ratio, p_value<br />              ->  Append  (cost=32.88..15185.06
rows=3856width=62) (actual time=0.320..2.131 rows=318 loops=1)<br />                    ->  Bitmap Heap Scan on
dba_data_basea  (cost=32.88..3786.62 rows=964 width=62) (actual time=0.041..0.041 rows=0 loops=1)<br />                
        Filter: (lower(identifier) ~~ 'bugs0000001884677'::text)<br />                          ->  Bitmap Index
Scanon in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual time=0.036..0.036 rows=0 loops=1)<br />
                              Index Cond: (lower(identifier) ~=~ 'bugs0000001884677'::character varying)<br />        
          ->  Bitmap Heap Scan on dba_data_base a  (cost=32.88..3786.62 rows=964 width=62) (actual time=0.010..0.010
rows=0loops=1)<br />                          Filter: (lower(identifier) ~~ 'bugs0000001884678'::text)<br />          
              ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual
time=0.008..0.008rows=0 loops=1)<br />                                Index Cond: (lower(identifier) ~=~
'bugs0000001884678'::charactervarying)<br />                    ->  Bitmap Heap Scan on dba_data_base a
 (cost=32.88..3786.62rows=964 width=62) (actual time=0.010..0.010 rows=0 loops=1)<br />                        
 Filter:(lower(identifier) ~~ 'bugs0000001884679'::text)<br />                          ->  Bitmap Index Scan on
in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.008..0.008 rows=0 loops=1)<br />        
                      Index Cond: (lower(identifier) ~=~ 'bugs0000001884679'::character varying)<br />                
  ->  Bitmap Heap Scan on dba_data_base a  (cost=32.88..3786.62 rows=964 width=62) (actual time=0.255..1.676
rows=318loops=1)<br />                          Filter: (lower(identifier) ~~ 'sptigr4-2210 (6f24)'::text)<br />      
                  ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual
time=0.178..0.178rows=318 loops=1)<br />                                Index Cond: (lower(identifier) ~=~
'sptigr4-2210(6f24)'::character varying)<br /> Total runtime: 4.174 ms<br /><br /> Also which should scale better if I
addmore strings to match? would there be any better design patterns for this problem?<br /><br /> Thanks for any
help<br/><br /> Adam<br /><br /> select version();<br />                           version<br />
----------------------------------------------------------------<br/> PostgreSQL 8.2.12 on i686-pc-linux-gnu, compiled
byGCC 2.95.4<br /><font color="#888888"><br /><br /><br /> --<br /> Sent via pgsql-sql mailing list (<a
href="mailto:pgsql-sql@postgresql.org">pgsql-sql@postgresql.org</a>)<br/> To make changes to your subscription:<br /><a
href="http://www.postgresql.org/mailpref/pgsql-sql"target="_blank">http://www.postgresql.org/mailpref/pgsql-sql</a><br
/></font></blockquote></div><br/><br clear="all" /><br />-- <br />---------------------------------------<br />Viktor
Bojović<br/>---------------------------------------<br />Wherever I go, Murphy goes with me<br /> 

В списке pgsql-sql по дате отправления:

Предыдущее
От: adam_pgsql
Дата:
Сообщение: which is better: using OR clauses or UNION?
Следующее
От: Adrian Klaver
Дата:
Сообщение: Re: bigint and unix time