Обсуждение: which is better: using OR clauses or UNION?

Поиск
Список
Период
Сортировка
which is better: using OR clauses or UNION?

От
Adam Witney
Дата:
16 августа 2011 г., 12:22:39
Hi,

I have a query hitting a table of 25 million rows. The table has a text field ('identifier') which i need to query for
matchingrows. The question is if i have multiple strings to match against this field I can use multiple OR
sub-statementsor multiple statements in a UNION. The UNION seems to run quicker.... is this to be expected? or is there
anythingelse I can do improve the speed of this query? Some query details: 


table "dba_data_base", index:
"in_dba_data_base_identifier" btree (lower(identifier) varchar_pattern_ops)


Query 1
-------
datadb=#explain analyse SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_value FROM     dba_data_base a WHERE
( lower(identifier) LIKE lower('BUGS0000001884677') OR       lower(identifier) LIKE lower('BUGS0000001884678') OR
lower(identifier)LIKE lower('BUGS0000001884679') OR       lower(identifier) LIKE lower('SpTIGR4-2210 (6F24)') ) ORDER
BYa.identifier;
    QUERY PLAN                                                                                


--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------Sort  (cost=14930.13..14939.77 rows=3857 width=62) (actual time=3208.466..3208.652
rows=318loops=1)  Sort Key: identifier  ->  Bitmap Heap Scan on dba_data_base a  (cost=134.43..14700.38 rows=3857
width=62)(actual time=81.106..3207.721 rows=318 loops=1)        Recheck Cond: ((lower(identifier) ~~
'bugs0000001884677'::text)OR (lower(identifier) ~~ 'bugs0000001884678'::text) OR (lower(identifier) ~~
'bugs0000001884679'::text)OR (lower(identifier) ~ 
~ 'sptigr4-2210 (6f24)'::text))        Filter: ((lower(identifier) ~~ 'bugs0000001884677'::text) OR (lower(identifier)
~~'bugs0000001884678'::text) OR (lower(identifier) ~~ 'bugs0000001884679'::text) OR (lower(identifier) ~~ 'spt 
igr4-2210 (6f24)'::text))        ->  BitmapOr  (cost=134.43..134.43 rows=3857 width=0) (actual time=71.397..71.397
rows=0loops=1)              ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0)
(actualtime=0.029..0.029 rows=0 loops=1)                    Index Cond: (lower(identifier) ~=~
'bugs0000001884677'::charactervarying)              ->  Bitmap Index Scan on in_dba_data_base_identifier
(cost=0.00..32.64rows=964 width=0) (actual time=0.008..0.008 rows=0 loops=1)                    Index Cond:
(lower(identifier)~=~ 'bugs0000001884678'::character varying)              ->  Bitmap Index Scan on
in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.008..0.008 rows=0 loops=1)
    Index Cond: (lower(identifier) ~=~ 'bugs0000001884679'::character varying)              ->  Bitmap Index Scan on
in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=71.347..71.347 rows=318 loops=1)
        Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying)Total runtime: 3208.904 ms 


Query 2
-------
datadb=#explain analyse SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_value FROM     dba_data_base a WHERE
lower(identifier) LIKE lower('BUGS0000001884677') 
UNION
SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_value FROM     dba_data_base a WHERE       lower(identifier)
LIKElower('BUGS0000001884678') 
UNION
SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_value FROM     dba_data_base a WHERE       lower(identifier)
LIKElower('BUGS0000001884679') 
UNION
SELECT a.bioassay_id, a.identifier, ratio, log_ratio, p_value FROM     dba_data_base a WHERE       lower(identifier)
LIKElower('SpTIGR4-2210 (6F24)') ORDER BY identifier; 
Sort  (cost=15702.26..15711.90 rows=3856 width=62) (actual time=3.688..3.886 rows=317 loops=1)  Sort Key: identifier
-> Unique  (cost=15414.74..15472.58 rows=3856 width=62) (actual time=2.663..3.387 rows=317 loops=1)        ->  Sort
(cost=15414.74..15424.38rows=3856 width=62) (actual time=2.660..2.834 rows=318 loops=1)              Sort Key:
bioassay_id,identifier, ratio, log_ratio, p_value              ->  Append  (cost=32.88..15185.06 rows=3856 width=62)
(actualtime=0.320..2.131 rows=318 loops=1)                    ->  Bitmap Heap Scan on dba_data_base a
(cost=32.88..3786.62rows=964 width=62) (actual time=0.041..0.041 rows=0 loops=1)                          Filter:
(lower(identifier)~~ 'bugs0000001884677'::text)                          ->  Bitmap Index Scan on
in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=0.036..0.036 rows=0 loops=1)
                Index Cond: (lower(identifier) ~=~ 'bugs0000001884677'::character varying)                    ->
BitmapHeap Scan on dba_data_base a  (cost=32.88..3786.62 rows=964 width=62) (actual time=0.010..0.010 rows=0 loops=1)
                      Filter: (lower(identifier) ~~ 'bugs0000001884678'::text)                          ->  Bitmap
IndexScan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual time=0.008..0.008 rows=0 loops=1)
                              Index Cond: (lower(identifier) ~=~ 'bugs0000001884678'::character varying)
   ->  Bitmap Heap Scan on dba_data_base a  (cost=32.88..3786.62 rows=964 width=62) (actual time=0.010..0.010 rows=0
loops=1)                         Filter: (lower(identifier) ~~ 'bugs0000001884679'::text)                          ->
BitmapIndex Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual time=0.008..0.008 rows=0
loops=1)                               Index Cond: (lower(identifier) ~=~ 'bugs0000001884679'::character varying)
            ->  Bitmap Heap Scan on dba_data_base a  (cost=32.88..3786.62 rows=964 width=62) (actual time=0.255..1.676
rows=318loops=1)                          Filter: (lower(identifier) ~~ 'sptigr4-2210 (6f24)'::text)
     ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual
time=0.178..0.178rows=318 loops=1)                                Index Cond: (lower(identifier) ~=~ 'sptigr4-2210
(6f24)'::charactervarying)Total runtime: 4.174 ms 

Also which should scale better if I add more strings to match? would there be any better design patterns for this
problem?

Thanks for any help

Adam

select version();                           version
----------------------------------------------------------------PostgreSQL 8.2.12 on i686-pc-linux-gnu, compiled by GCC
2.95.4
Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: which is better: using OR clauses or UNION?

which is better: using OR clauses or UNION?