Optimise OR condiditions across multiple joins

Поиск
Список
Период
Сортировка
От Andrew Beverley
Тема Optimise OR condiditions across multiple joins
Дата
Msg-id 20160609071034.93b9227e59ced514cb9669d8@andybev.com
обсуждение исходный текст
Список pgsql-general
Dear all,

Is there a way to efficiently perform OR conditions across multiple
joins?

For example, I have the following statement:

SELECT RECORD.id
FROM   RECORD
       left join string
              ON string.record_id = RECORD.id
                 AND string.layout_id = 6
       left join DATE
              ON DATE.record_id = RECORD.id
                 AND DATE.layout_id = 8
WHERE  ( string.value_index = 'beverley'
          OR DATE.value = '1980-11-16' );

Which when analysed produces the output at the end of this email. It
appears to be doing a slow filter across all values, once it's
completed the joins across all rows.

If I change the OR condition to an AND condition, then indexes are used
to good effect and the query is very fast.

Does anyone have any advice as to how I can efficiently use indexes for
the OR condition and substantially improve the query time? I can
provide full details of all my current indexes if that helps.

PostgreSQL 9.4.8 on Debian 8.5

Thanks,

Andy


 Hash Right Join  (cost=176359.93..266784.76 rows=135 width=8) (actual time=17189.826..17845.939 rows=15 loops=1)
   Hash Cond: (string.record_id = record.id)
   Filter: (((string.value_index)::text = 'beverley'::text) OR (date.value = '1980-11-16'::date))
   Rows Removed by Filter: 2094748
   ->  Bitmap Heap Scan on string  (cost=6683.45..76441.68 rows=305938 width=19) (actual time=45.417..681.269
rows=285099loops=1) 
         Recheck Cond: (layout_id = 6)
         Heap Blocks: exact=19548
         ->  Bitmap Index Scan on string_idx_layout_id  (cost=0.00..6606.97 rows=305938 width=0) (actual
time=39.527..39.527rows=285103 loops=1) 
               Index Cond: (layout_id = 6)
   ->  Hash  (cost=133376.36..133376.36 rows=2088250 width=12) (actual time=15917.540..15917.540 rows=2094763 loops=1)
         Buckets: 16384  Batches: 32  Memory Usage: 2611kB
         ->  Hash Right Join  (cost=71333.62..133376.36 rows=2088250 width=12) (actual time=7041.758..12268.693
rows=2094763loops=1) 
               Hash Cond: (date.record_id = record.id)
               ->  Seq Scan on date  (cost=0.00..46095.40 rows=285676 width=12) (actual time=0.871..769.841 rows=285099
loops=1)
                     Filter: (layout_id = 8)
                     Rows Removed by Filter: 855297
               ->  Hash  (cost=37072.50..37072.50 rows=2088250 width=8) (actual time=7032.415..7032.415 rows=2094763
loops=1)
                     Buckets: 16384  Batches: 32  Memory Usage: 2576kB
                     ->  Seq Scan on record  (cost=0.00..37072.50 rows=2088250 width=8) (actual time=0.037..3398.638
rows=2094763loops=1) 



В списке pgsql-general по дате отправления:

Предыдущее
От: John R Pierce
Дата:
Сообщение: Re: Automate copy - Postgres 9.2
Следующее
От: Greg Navis
Дата:
Сообщение: Re: [pg_trgm] Making similarity(?, ?) < ? use an index