Dear all,
Is there a way to efficiently perform OR conditions across multiple
joins?
For example, I have the following statement:
SELECT RECORD.id
FROM RECORD
left join string
ON string.record_id = RECORD.id
AND string.layout_id = 6
left join DATE
ON DATE.record_id = RECORD.id
AND DATE.layout_id = 8
WHERE ( string.value_index = 'beverley'
OR DATE.value = '1980-11-16' );
Which when analysed produces the output at the end of this email. It
appears to be doing a slow filter across all values, once it's
completed the joins across all rows.
If I change the OR condition to an AND condition, then indexes are used
to good effect and the query is very fast.
Does anyone have any advice as to how I can efficiently use indexes for
the OR condition and substantially improve the query time? I can
provide full details of all my current indexes if that helps.
PostgreSQL 9.4.8 on Debian 8.5
Thanks,
Andy
Hash Right Join (cost=176359.93..266784.76 rows=135 width=8) (actual time=17189.826..17845.939 rows=15 loops=1)
Hash Cond: (string.record_id = record.id)
Filter: (((string.value_index)::text = 'beverley'::text) OR (date.value = '1980-11-16'::date))
Rows Removed by Filter: 2094748
-> Bitmap Heap Scan on string (cost=6683.45..76441.68 rows=305938 width=19) (actual time=45.417..681.269
rows=285099loops=1)
Recheck Cond: (layout_id = 6)
Heap Blocks: exact=19548
-> Bitmap Index Scan on string_idx_layout_id (cost=0.00..6606.97 rows=305938 width=0) (actual
time=39.527..39.527rows=285103 loops=1)
Index Cond: (layout_id = 6)
-> Hash (cost=133376.36..133376.36 rows=2088250 width=12) (actual time=15917.540..15917.540 rows=2094763 loops=1)
Buckets: 16384 Batches: 32 Memory Usage: 2611kB
-> Hash Right Join (cost=71333.62..133376.36 rows=2088250 width=12) (actual time=7041.758..12268.693
rows=2094763loops=1)
Hash Cond: (date.record_id = record.id)
-> Seq Scan on date (cost=0.00..46095.40 rows=285676 width=12) (actual time=0.871..769.841 rows=285099
loops=1)
Filter: (layout_id = 8)
Rows Removed by Filter: 855297
-> Hash (cost=37072.50..37072.50 rows=2088250 width=8) (actual time=7032.415..7032.415 rows=2094763
loops=1)
Buckets: 16384 Batches: 32 Memory Usage: 2576kB
-> Seq Scan on record (cost=0.00..37072.50 rows=2088250 width=8) (actual time=0.037..3398.638
rows=2094763loops=1)