Обсуждение: CROSS JOIN performance

Поиск
Список
Период
Сортировка

CROSS JOIN performance

От
Andy Chambers
Дата:
Hi,

In our porting of a big mysql app to postgres, we're finding lots of
queries like

select foo
  from (foo f, bar b)
left join caz c on f.id = f.caz_id
where f.id = b.foo_id

I've seen the message where Tom explains why this is invalid in ANSI
SQL so I converted it to

select foo
  from foo f CROSS JOIN bar b
left join caz c on f.id = f.caz_id
where f.id = b.foo_id

...and it works.  However, sometimes quite slowly.  When we've looked
into the slow ones, we've found that changing it again to

select foo
  from foo f INNER JOIN bar b ON f.id = b.foo_id
left join caz c on f.id = f.caz_id

makes it perform much better.

Furthermore, we're starting to find that performance of the 3rd is
significantly better than the 2nd, *ONLY* when the CROSS JOINs are
followed by more joins (like in this case).  If there are no more
tables being joined, changing to the 3rd version yields no performance
gain.

Are these three queries logically equivalent (well, at least the
latter two since the first isn't valid SQL)?  If so, does it make
sense that the optimizer has difficulty with the second case.

Cheers,
Andy

--
Andy Chambers

Re: CROSS JOIN performance

От
Tom Lane
Дата:
Andy Chambers <achambers@mcna.net> writes:
> In our porting of a big mysql app to postgres, we're finding lots of
> queries like

> select foo
>   from (foo f, bar b)
> left join caz c on f.id = f.caz_id
> where f.id = b.foo_id

> I've seen the message where Tom explains why this is invalid in ANSI
> SQL so I converted it to

> select foo
>   from foo f CROSS JOIN bar b
> left join caz c on f.id = f.caz_id
> where f.id = b.foo_id

> ...and it works.  However, sometimes quite slowly.  When we've looked
> into the slow ones, we've found that changing it again to

> select foo
>   from foo f INNER JOIN bar b ON f.id = b.foo_id
> left join caz c on f.id = f.caz_id

> makes it perform much better.

Those formulations look equivalent to me.  Could you provide a concrete
test case, or at least some specific queries and their EXPLAIN ANALYZE
results?  Which PG version are we talking about?  Oh, and are the LEFT
JOIN conditions really not constraining table c at all?

            regards, tom lane