Thread: LEFT JOIN optimization

LEFT JOIN optimization

From: Ksenia Marasanova
Date:
Hi list,

I don't have much experience with Postgres optimization; somehow I was
happily avoiding anything more difficult than a simple select statement,
and it was working all right.

Now a LEFT JOIN must be used, and I am not happy with the performance:
it takes about 5 seconds to run a very simple LEFT JOIN query on a table
"user_" with ~13,000 records, left joined to a table "church" with ~300
records, on a PowerBook PPC 1.67 GHz with 1.5 GB of RAM.
Is that normal?

Some details:

test=# explain select * from user_ left join church on user_.church_id
= church.id;
                             QUERY PLAN
---------------------------------------------------------------------
 Hash Left Join  (cost=6.44..7626.69 rows=12763 width=325)
   Hash Cond: ("outer".church_id = "inner".id)
   ->  Seq Scan on user_  (cost=0.00..7430.63 rows=12763 width=245)
   ->  Hash  (cost=5.75..5.75 rows=275 width=80)
         ->  Seq Scan on church  (cost=0.00..5.75 rows=275 width=80)
(5 rows)


From what I understand, it doesn't use the foreign-key index on the user_
table. So I tried:

mydb=# set enable_seqscan='false';
SET
mydb=# explain select * from user_ left join church on user_.church_id
= church.id;
                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Merge Right Join  (cost=0.00..44675.77 rows=12763 width=325)
   Merge Cond: ("outer".id = "inner".church_id)
   ->  Index Scan using chirch_pkey on church  (cost=0.00..17.02 rows=275 width=80)
   ->  Index Scan using user__church_id on user_  (cost=0.00..44500.34 rows=12763 width=245)
(4 rows)


It's my first time reading query plans, but from what I understand, it
doesn't make the query faster...
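
For what it's worth, undoing the override afterwards is just a matter of
resetting the setting for the session (a minimal sketch):

-- put the planner back to its default behaviour for this session
RESET enable_seqscan;
-- or, equivalently:
SET enable_seqscan = true;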

Any tips are greatly appreciated.
--
Ksenia

Re: LEFT JOIN optimization

From: Stephen Frost
Date:
* Ksenia Marasanova (ksenia.marasanova@gmail.com) wrote:
> Any tips are greatly appreciated.

EXPLAIN ANALYZE of the same queries would be much more useful.
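
For example, the same query just prefixed with EXPLAIN ANALYZE, so the
planner's estimates can be compared against the actual row counts and
timings:

EXPLAIN ANALYZE
SELECT *
FROM user_
LEFT JOIN church ON user_.church_id = church.id;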

    Thanks,

        Stephen


Re: LEFT JOIN optimization

From: Ksenia Marasanova
Date:
2005/9/12, Stephen Frost <sfrost@snowman.net>:
> * Ksenia Marasanova (ksenia.marasanova@gmail.com) wrote:
> > Any tips are greatly appreciated.
>
> EXPLAIN ANALYZE of the same queries would be much more useful.

Thanks, here it is:

test=# explain analyze select * from user_ left join church on
user_.church_id = church.id;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=6.44..7626.69 rows=12763 width=325) (actual time=388.573..2016.929 rows=12763 loops=1)
   Hash Cond: ("outer".church_id = "inner".id)
   ->  Seq Scan on user_  (cost=0.00..7430.63 rows=12763 width=245) (actual time=360.431..1120.012 rows=12763 loops=1)
   ->  Hash  (cost=5.75..5.75 rows=275 width=80) (actual time=27.985..27.985 rows=0 loops=1)
         ->  Seq Scan on church  (cost=0.00..5.75 rows=275 width=80) (actual time=0.124..26.953 rows=275 loops=1)
 Total runtime: 2025.946 ms
(6 rows)

test=# set enable_seqscan='false';
SET
test=# explain analyze select * from user_ left join church on
user_.church_id = church.id;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Merge Right Join  (cost=0.00..44675.77 rows=12763 width=325) (actual time=0.808..2119.099 rows=12763 loops=1)
   Merge Cond: ("outer".id = "inner".church_id)
   ->  Index Scan using chirch_pkey on church  (cost=0.00..17.02 rows=275 width=80) (actual time=0.365..5.471 rows=275 loops=1)
   ->  Index Scan using user__church_id on user_  (cost=0.00..44500.34 rows=12763 width=245) (actual time=0.324..1243.348 rows=12763 loops=1)
 Total runtime: 2131.364 ms
(5 rows)


I followed some tips on the web and vacuumed the database. I think the
query is faster now, almost acceptable, but it would still be interesting
to know whether it is possible to optimize it further...

Thanks again,
--
Ksenia

Re: LEFT JOIN optimization

From: Stephen Frost
Date:
* Ksenia Marasanova (ksenia.marasanova@gmail.com) wrote:
> test=# explain analyze select * from user_ left join church on
> user_.church_id = church.id;
[...]
>  Total runtime: 2025.946 ms
> (6 rows)
>
> test=# set enable_seqscan='false';
> SET
> test=# explain analyze select * from user_ left join church on
> user_.church_id = church.id;
>
[...]
>  Total runtime: 2131.364 ms
> (5 rows)
>
>
> I followed some tips on the web and vacuum-ed database, I think the
> query is  faster now, almost acceptable, but still interesting to know
> if it possible to optimize it...

I have to say that it does seem a bit slow for only 12,000 rows..
What's the datatype of user_.church_id and church.id?  Are you sure you
really want all 12,000 rows every time you run that query?  Perhaps
there's a 'where' clause you could apply with an associated index to
limit the query to just what you actually need?
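
For instance, something roughly like the sketch below; the date_joined
column and the cutoff are made-up placeholders for whatever filter
actually fits your application:

-- hypothetical: index the column you normally filter on...
CREATE INDEX user__date_joined ON user_ (date_joined);

-- ...and fetch only the slice you actually need
SELECT *
FROM user_
LEFT JOIN church ON user_.church_id = church.id
WHERE user_.date_joined >= '2005-01-01';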

You'll notice from above, though, that the non-index scan is faster.
I'd expect that with a left-join query: since you have to go through the
entire table on an open left join like that, a sequential scan is going
to be the best way to do it.  The fact that church.id is hashed makes the
plan the planner came up with almost certainly the best one possible.

Are you sure a left join is what you want?  It sounds like maybe you've
moved (for some reason) from a regular join to a left join, with the
filtering done in the application, which is probably a bad move...  If you
can do at least some of the filtering in the database, I expect that would help.
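
A rough sketch of that idea, with a plain inner join and a database-side
filter (church.name and the literal are placeholders):

SELECT u.*, c.*
FROM user_ u
JOIN church c ON u.church_id = c.id   -- inner join: users without a church drop out here
WHERE c.name = 'Some Church';         -- filter in the database rather than in the app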

    Thanks,

        Stephen


Re: LEFT JOIN optimization

From: Manfred Koizar
Date:
On Mon, 12 Sep 2005 00:47:57 +0300, Ksenia Marasanova
<ksenia.marasanova@gmail.com> wrote:
>   ->  Seq Scan on user_  (cost=0.00..7430.63 rows=12763 width=245) (actual time=360.431..1120.012 rows=12763 loops=1)

If ~12,700 rows of that width (245 bytes) are stored in more than 7,000
pages, then there is a lot of free space in those pages: at roughly 30
rows per 8 kB page they should fit in a few hundred pages.  Try VACUUM FULL ...
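
A quick way to check, and the fix, might look like the sketch below;
relpages and reltuples in pg_class are only the planner's estimates,
refreshed by VACUUM/ANALYZE:

-- how many pages and rows the planner thinks user_ occupies
SELECT relname, relpages, reltuples
FROM pg_class
WHERE relname = 'user_';

-- rewrite the table to reclaim the dead space (takes an exclusive lock while it runs)
VACUUM FULL user_;
-- refresh the statistics afterwards
ANALYZE user_;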

Servus
 Manfred