On 12/10/2013 01:45, BladeOfLight16 wrote:
> On Fri, Oct 11, 2013 at 9:32 AM, Jesse Long <jpl@unknown.za.net
> <mailto:jpl@unknown.za.net>> wrote:
>
> explain select * from archive_document_index where node_id = 29
> and value = 'BSH70002152';
> QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------
> Index Scan using archive_document_index_node_id_value_idx on
> archive_document_index (cost=0.57..36.13 rows=14 width=33)
> Index Cond: ((node_id = 29) AND ((value)::text =
> 'BSH70002152'::text))
> (2 rows)
>
>
> I believe that this is what Tom is suggesting:
>
> SELECT *
> FROM ARCHIVE_DOCUMENT AS ad
> WHERE ad.NODE_ID = 29
> AND ad.ARCHIVE_DATE >= '2013-07-08 18:28:00'
> AND EXISTS (SELECT *
> FROM ARCHIVE_DOCUMENT_INDEX AS adi
> WHERE adi.ARCHIVE_ID = ad.ID
> AND adi.NODE_ID = ad.NODE_ID
> AND (adi.VALUE = 'BSH70002152' OR adi.VALUE = 'TC212592')
> )
> ORDER BY ad.ARCHIVE_DATE DESC
> LIMIT 10;
>
> Forgive my changing of the aliases. I found names like r0 difficult to
> interpret easily.
>
> This is a much simpler query since it only requires one subselect, and
> I believe Tom is suggesting that this query may be able to make use of
> the index or at least find some more efficient plan. How does this
> perform?
Indeed it does perform very much better. With the modified query the
plan and the execution time are excellent.
However, my previous question remains - in the original query plan,
there are two hash tables being populated using seqscan + filter. During
each seqscan, over 95 million records were wastefully read. I think that
this could have been dramatically improved by using an index scan. Am I
mistaken? Is it a optimisation that could be implemented but has not
been implemented yet? Is it a bug?
Thanks,
Jesse