Re: which is better: using OR clauses or UNION?

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: which is better: using OR clauses or UNION?
Дата
Msg-id 26139.1313503748@sss.pgh.pa.us
обсуждение исходный текст
Ответ на which is better: using OR clauses or UNION?  (adam_pgsql <adam_pgsql@witneyweb.org>)
Ответы Re: which is better: using OR clauses or UNION?  (adam_pgsql <adam_pgsql@witneyweb.org>)
Список pgsql-sql
adam_pgsql <adam_pgsql@witneyweb.org> writes:
> I have a query hitting a table of 25 million rows. The table has a
> text field ('identifier') which i need to query for matching rows. The
> question is if i have multiple strings to match against this field I
> can use multiple OR sub-statements or multiple statements in a
> UNION. The UNION seems to run quicker.... is this to be expected?

Your test cases don't seem exactly comparable; in particular I think the
second one is benefiting from the first one having already read and
cached the relevant disk blocks.  Notice how you've got, eg,

>               ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual
time=71.347..71.347rows=318 loops=1)
 
>                     Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying)

versus

>                           ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0)
(actualtime=0.178..0.178 rows=318 loops=1)
 
>                                 Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying)

Those are the exact same subplan, so any honest comparison should be
finding them to take the same amount of time.  When the actual readings
are different by a factor of several hundred, there's something wrong
with your measurement process.

In the end this comes down to whether duplicates will be eliminated more
efficiently by a BitmapOr step or by sort/uniq on the resulting rows.
I'd have to bet on the BitmapOr myself, but it's likely that this is
down in the noise compared to the actual disk accesses in any
not-fully-cached scenario.  Also, if you don't expect the sub-statements
to yield any duplicates, or don't care about seeing the same row twice
in the output, you should consider UNION ALL instead of UNION.
        regards, tom lane


В списке pgsql-sql по дате отправления:

Предыдущее
От: "David Johnston"
Дата:
Сообщение: Re: which is better: using OR clauses or UNION?
Следующее
От: Jasen Betts
Дата:
Сообщение: Re: which is better: using OR clauses or UNION?