Обсуждение: inner join is much faster! is that right?

Поиск
Список
Период
Сортировка

inner join is much faster! is that right?

От
"James Im"
Дата:
I really thought that Postgresql would rewrite a query from

select *
from file a, file_tag b, tag c
where a.id_file=b.id_file and b.id_tag=c.id_tag and a.id_file=10000001000000

to something like:

select *
from (file a inner join file_tag b on (a.id_file=b.id_file)) inner join
tag c on (b.id_tag=c.id_tag)
where a.id_file=10000001000000


However this is not the case!

I have a many-to-many relation between 2 tables (tag and file). tag has
40000 rows and file has 5 millions rows. file_tag has 50 millions rows.

In this setup, the first select takes a couple of minutes to complete
while the second query takes only a second to complete.

Is this a normal expected behavior? Shouldn't Postgresql be able to
rewrite the query correctly?

_________________________________________________________________
F� 250 MB gratis lagerplads p� MSN Hotmail:  http://www.hotmail.com



Re: inner join is much faster! is that right?

От
Chris Mair
Дата:
> I really thought that Postgresql would rewrite a query from
> 
> select *
> from file a, file_tag b, tag c
> where a.id_file=b.id_file and b.id_tag=c.id_tag and a.id_file=10000001000000
> 
> to something like:
> 
> select *
> from (file a inner join file_tag b on (a.id_file=b.id_file)) inner join
> tag c on (b.id_tag=c.id_tag)
> where a.id_file=10000001000000

These shouldn't be other than two ways to express the same (inner)
join.

Your timings seem to suggest that in the first case PG computes
the whole join between 3 tables and only then applies the filter
in a.

Can you send the outputs of "explain <query>" for these two and
let us know what version of PG this is?

Bye, Chris.


-- 

Chris Mair
http://www.1006.org



Re: inner join is much faster! is that right?

От
"James Im"
Дата:
I was doing the explain thing when I discovered my mistake!

The think is that I over simplyfied my examples. instead of searching
for one ID, I was searching for several ids and I forgot to put a
parentesis arround the or like this.

I was doing this:
select *
from file a, file_tag b, tag c
where a.id_file=b.id_file and b.id_tag=c.id_tag and
a.id_file=10000001000000 or a.id_file=10000001000200

But I should have done this:

select *
from file a, file_tag b, tag c
where a.id_file=b.id_file and b.id_tag=c.id_tag and
(a.id_file=10000001000000 or a.id_file=10000001000200)

This changes everything. The performance is now acceptable!

_________________________________________________________________
V�lg selv hvordan du vil kommunikere - skrift, tale, video eller billeder 
med MSN Messenger:  http://messenger.msn.dk/  - her kan du det hele