Outer join query plans and performance

Поиск
Список
Период
Сортировка
От Rich Doughty
Тема Outer join query plans and performance
Дата
Msg-id 435E186A.3060003@opusvl.com
обсуждение исходный текст
Ответы Re: Outer join query plans and performance  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-performance
I tried on pgsql-general but got no reply. re-posting here as it's
probably the best place to ask

I'm having some significant performance problems with left join. Can
anyone give me any pointers as to why the following 2 query plans are so
different?


EXPLAIN SELECT *
FROM
     tokens.ta_tokens      t  LEFT JOIN
     tokens.ta_tokenhist   h1 ON t.token_id = h1.token_id LEFT JOIN
     tokens.ta_tokenhist   h2 ON t.token_id = h2.token_id
WHERE
     h1.histdate = 'now';


  Nested Loop Left Join  (cost=0.00..68778.43 rows=2215 width=1402)
    ->  Nested Loop  (cost=0.00..55505.62 rows=2215 width=714)
          ->  Index Scan using idx_tokenhist__histdate on ta_tokenhist h1  (cost=0.00..22970.70 rows=5752 width=688)
                Index Cond: (histdate = '2005-10-24 13:28:38.411844'::timestamp without time zone)
          ->  Index Scan using ta_tokens_pkey on ta_tokens t  (cost=0.00..5.64 rows=1 width=26)
                Index Cond: ((t.token_id)::integer = ("outer".token_id)::integer)
    ->  Index Scan using fkx_tokenhist__tokens on ta_tokenhist h2  (cost=0.00..5.98 rows=1 width=688)
          Index Cond: (("outer".token_id)::integer = (h2.token_id)::integer)


Performance is fine for this one and the plan is pretty much as i'd
expect.

This is where i hit a problem.


EXPLAIN SELECT *
FROM
     tokens.ta_tokens      t  LEFT JOIN
     tokens.ta_tokenhist   h1 ON t.token_id = h1.token_id LEFT JOIN
     tokens.ta_tokenhist   h2 ON t.token_id = h2.token_id
WHERE
     h2.histdate = 'now';


  Hash Join  (cost=1249148.59..9000709.22 rows=2215 width=1402)
    Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer)
    ->  Hash Left Join  (cost=1225660.51..8181263.40 rows=4045106 width=714)
          Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer)
          ->  Seq Scan on ta_tokens t  (cost=0.00..71828.06 rows=4045106 width=26)
          ->  Hash  (cost=281243.21..281243.21 rows=10504921 width=688)
                ->  Seq Scan on ta_tokenhist h1  (cost=0.00..281243.21 rows=10504921 width=688)
    ->  Hash  (cost=22970.70..22970.70 rows=5752 width=688)
          ->  Index Scan using idx_tokenhist__histdate on ta_tokenhist h2  (cost=0.00..22970.70 rows=5752 width=688)
                Index Cond: (histdate = '2005-10-24 13:34:51.371905'::timestamp without time zone)


I would understand if h2 was joined on h1, but it isn't. It only joins
on t. can anyone give any tips on improving the performance of the second
query (aside from changing the join order manually)?


select version();
                                                    version
--------------------------------------------------------------------------------------------------------------
  PostgreSQL 8.0.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050821 (prerelease) (Debian 4.0.1-6)


Thanks

--

   - Rich Doughty

В списке pgsql-performance по дате отправления:

Предыдущее
От: Richard Huxton
Дата:
Сообщение: Re: Strange planner decision on quite simple select
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Why Index is not working on date columns.