Optimisation of INTERSECT expressions

Поиск
Список
Период
Сортировка
От Phil Endecott
Тема Optimisation of INTERSECT expressions
Дата
Msg-id 20040323141200.34D86D1EC02@svr1.postgresql.org
обсуждение исходный текст
Ответы Re: Optimisation of INTERSECT expressions  (Stephan Szabo <sszabo@megazone.bigpanda.com>)
Список pgsql-performance
Dear PostgresQL Experts,

I am trying to get to the bottom of some efficiency problems and hope that
you can help.  The difficulty seems to be with INTERSECT expressions.

I have a query of the form
     select A from T where C1 intersect select A from T where C2;
It runs in about 100 ms.

But it is equivalent to this query
     select A from T where C1 and C2;
which runs in less than 10 ms.

Looking at the output of "explain analyse" on the first query, it seems
that PostgresQL always computes the two sub-expressions and then computes
an explicit intersection on the results.  I had hoped that it would notice
that both subexpressions are scanning the same input table T and convert
the expression to the second form.

Is there a reason why it can't do this transformation?

(Of course, I could just re-write my code to use the second form, but my
application is generating these bits of SQL programmatically, and it is not
trivial as in some cases the two tables are not the same so an intersection
really is needed; if PostgresQL can do it for me, that would be much
better.  I don't want to write an SQL parser!)


While studying the same code I found another case where my INTERSECT
expressions don't seem to be optimised as much as I'd like.  In this case,
one of the subexpressions being intersected is empty much of the time.  But
even when it is empty, PostgresQL computes the other (expensive)
subexpression and does an intersect.  Could PostgresQL do something like this:

- guess which subexpression is likely to produce fewest rows
- compute this subexpression
- if empty, return now with an empty result
- compute other subexpression
- compute intersection
- return intersection

Alternatively, it could be defined that the left subexpression is always
computed first and the second not computed if it is empty, like the
behaviour of logical AND and OR operators in C.

Thanks in advance for any suggestions.

--Phil.


В списке pgsql-performance по дате отправления:

Предыдущее
От: Mark Kirkwood
Дата:
Сообщение: Re: Benchmarking postgres on Solaris/Linux
Следующее
От: Stephan Szabo
Дата:
Сообщение: Re: Optimisation of INTERSECT expressions