Re: which is better: using OR clauses or UNION?

Поиск
Список
Период
Сортировка
От adam_pgsql
Тема Re: which is better: using OR clauses or UNION?
Дата
Msg-id 75BEB82D-3473-4225-86D8-880AB7A04D56@witneyweb.org
обсуждение исходный текст
Ответ на Re: which is better: using OR clauses or UNION?  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: which is better: using OR clauses or UNION?  ("Igor Neyman" <ineyman@perceptron.com>)
Список pgsql-sql
On 16 Aug 2011, at 15:09, Tom Lane wrote:

> adam_pgsql <adam_pgsql@witneyweb.org> writes:
>> I have a query hitting a table of 25 million rows. The table has a
>> text field ('identifier') which i need to query for matching rows. The
>> question is if i have multiple strings to match against this field I
>> can use multiple OR sub-statements or multiple statements in a
>> UNION. The UNION seems to run quicker.... is this to be expected?
>
> Your test cases don't seem exactly comparable; in particular I think the
> second one is benefiting from the first one having already read and
> cached the relevant disk blocks.  Notice how you've got, eg,
>
>>              ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0) (actual
time=71.347..71.347rows=318 loops=1) 
>>                    Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying)
>
> versus
>
>>                          ->  Bitmap Index Scan on in_dba_data_base_identifier  (cost=0.00..32.64 rows=964 width=0)
(actualtime=0.178..0.178 rows=318 loops=1) 
>>                                Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying)
>
> Those are the exact same subplan, so any honest comparison should be
> finding them to take the same amount of time.  When the actual readings
> are different by a factor of several hundred, there's something wrong
> with your measurement process.
>
> In the end this comes down to whether duplicates will be eliminated more
> efficiently by a BitmapOr step or by sort/uniq on the resulting rows.
> I'd have to bet on the BitmapOr myself, but it's likely that this is
> down in the noise compared to the actual disk accesses in any
> not-fully-cached scenario.  Also, if you don't expect the sub-statements
> to yield any duplicates, or don't care about seeing the same row twice
> in the output, you should consider UNION ALL instead of UNION.


Thanks guys, I'll give some of those options a try and see which ones improve performance

(Tom, yes i ran those queries after each other so there was caching going on. However, I had noticed a difference in
performancewhen spacing the queries before and after a few other big queries to help clear the cache). 

adam





В списке pgsql-sql по дате отправления:

Предыдущее
От: Janiv Ratson
Дата:
Сообщение: Re: bigint and unix time
Следующее
От: "Igor Neyman"
Дата:
Сообщение: Re: which is better: using OR clauses or UNION?