От: Tom Lane
Тема: Re: query optimization
Дата: ,
Msg-id: 13222.1335470935@sss.pgh.pa.us
(см: обсуждение, исходный текст)
Ответ на: Re: query optimization  (Thomas Kellerer)
Ответы: Re: query optimization  (Andrew Dunstan)
Список: pgsql-performance

Скрыть дерево обсуждения

query optimization  (Richard Kojedzinszky, )
 Re: query optimization  ("Kevin Grittner", )
 Re: query optimization  (Tom Lane, )
  Re: query optimization  (Thomas Kellerer, )
   Re: query optimization  (Tom Lane, )
    Re: query optimization  (Andrew Dunstan, )
  Re: query optimization  (Richard Kojedzinszky, )

Thomas Kellerer <> writes:
> Tom Lane wrote on 26.04.2012 21:17:
>> Um ... did you analyze all the tables, or just some of them?  I get
>> sub-millisecond runtimes if all four tables have been analyzed, but it
>> does seem to pick lousy plans if, say, only a and b have been analyzed.

> Here it's similar to Richard's experience:
> Before analyzing the four tables, the first statement yields this plan:
> [ merge joins ]
> This continues to stay the plan for about 10-15 repetitions, then it turns to this plan
> [ hash joins ]

Hmm.  I see it liking the merge-join plan (with minor variations) with
or without analyze data, but if just some of the tables have been
analyzed, it goes for the hash plan which is a good deal slower.  The
cost estimates aren't that far apart though.  In any case, the only
reason the merge join is so fast is that the data is perfectly ordered
in each table; on a less contrived example, it could well be a lot

> And the second one yields this one here (Regardless of analyze or not):

Yeah, the trick there is that it's valid to re-order the joins, since
they're both left joins.

In git HEAD I get something like this:

regression=# explain analyze select * from a left join (b inner join c on b.id = c.id) on a.b = b.id where a.id = 4;
                                                        QUERY PLAN

 Nested Loop Left Join  (cost=0.00..17.18 rows=1 width=16) (actual time=0.024..0.026 rows=1 loops=1)
   ->  Index Scan using a_idx1 on a  (cost=0.00..8.38 rows=1 width=8) (actual time=0.010..0.010 rows=1 loops=1)
         Index Cond: (id = 4)
   ->  Nested Loop  (cost=0.00..8.80 rows=1 width=8) (actual time=0.011..0.012 rows=1 loops=1)
         ->  Index Only Scan using b_idx1 on b  (cost=0.00..8.38 rows=1 width=4) (actual time=0.006..0.006 rows=1
               Index Cond: (id = a.b)
               Heap Fetches: 1
         ->  Index Only Scan using c_idx1 on c  (cost=0.00..0.41 rows=1 width=4) (actual time=0.004..0.005 rows=1
               Index Cond: (id = b.id)
               Heap Fetches: 1
 Total runtime: 0.080 ms
(11 rows)

but 9.1 and older are not smart enough to do it like that when they
can't re-order the joins.

            regards, tom lane

В списке pgsql-performance по дате сообщения:

От: Tom Lane
Сообщение: Re: Weird plan variation with recursive CTEs
От: AI Rumman
Сообщение: NOT EXISTS or LEFT JOIN which one is better?