Re: MIN/MAX optimization for partitioned table

Поиск
Список
Период
Сортировка
От Alan Li
Тема Re: MIN/MAX optimization for partitioned table
Дата
Msg-id 782056770907201309l46d4a1d9ue427f36896d69a67@mail.gmail.com
обсуждение исходный текст
Ответ на Re: MIN/MAX optimization for partitioned table  (Greg Stark <gsstark@mit.edu>)
Список pgsql-hackers


On Mon, Jul 20, 2009 at 12:29 PM, Greg Stark <gsstark@mit.edu> wrote:
On Mon, Jul 20, 2009 at 7:40 PM, Alan Li<ali@truviso.com> wrote:
> Attached is an updated patch that removes the O(n^2) behavior and the
> awkwardness of optimizing the seqscan path as the plan was about to be
> created.  Now, the optimization is considered when appendrel is generating
> the paths.
>
> I also changed the plan as you had suggested.  It now looks like this:

Hm, that's not quite the plan I described either. I had in mind to
mirror the existing min/max optimization which put the whole thing in
an InitPlan and put a Result node in place of the actual plan. Your
original patch did that for each subplan but I thought it would be
better to do it for the whole aggregate.

However the more I think about it the more I don't understand why Tom
arranged to do that for the min/max optimization anyways. For
subqueries where that makes sense that would surely happen anyways
such as in the first example below. And for joins where it's necessary
the planner knows to put a Materialize node which sounds just as good. 

Here's what happens with your current patch in the case I was
concerned about -- note that the planner automatically detects the
case and turns the whole subplan into an initplan anyways:

postgres=# explain select * from y where j = (select min(i) from x) ;
                                         QUERY PLAN
----------------------------------------------------------------------------------------------
 Seq Scan on y  (cost=40.12..80.12 rows=12 width=4)
  Filter: (j = $0)
  InitPlan 1 (returns $0)
    ->  Aggregate  (cost=40.11..40.12 rows=1 width=4)
          ->  Append  (cost=0.00..34.10 rows=2403 width=4)
                ->  Limit  (cost=0.00..0.03 rows=1 width=4)
                      ->  Index Scan using xi on x  (cost=0.00..80.25
rows=2400 width=4)
                ->  Limit  (cost=0.00..0.03 rows=1 width=4)
                      ->  Index Scan using xi2 on x2 x
(cost=0.00..80.25 rows=2400 width=4)
                ->  Limit  (cost=0.00..0.03 rows=1 width=4)
                      ->  Index Scan using xi3 on x3 x
(cost=0.00..80.25 rows=2400 width=4)
                ->  Seq Scan on x1 x  (cost=0.00..34.00 rows=2400 width=4)
(12 rows)


And here's another case where you wouldn't want multiple execution --
but the planner here figures out to materialize the result:

postgres=# explain select * from y left outer join (select min(i) as i
from x) as x on (i=j);
                                           QUERY PLAN
--------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=40.13..128.13 rows=2400 width=8)
  Join Filter: ((min(public.x.i)) = y.j)
  ->  Seq Scan on y  (cost=0.00..34.00 rows=2400 width=4)
  ->  Materialize  (cost=40.13..40.14 rows=1 width=4)
        ->  Aggregate  (cost=40.11..40.12 rows=1 width=4)
              ->  Append  (cost=0.00..34.10 rows=2403 width=4)
                    ->  Limit  (cost=0.00..0.03 rows=1 width=4)
                          ->  Index Scan using xi on x
(cost=0.00..80.25 rows=2400 width=4)
                    ->  Limit  (cost=0.00..0.03 rows=1 width=4)
                          ->  Index Scan using xi2 on x2 x
(cost=0.00..80.25 rows=2400 width=4)
                    ->  Limit  (cost=0.00..0.03 rows=1 width=4)
                          ->  Index Scan using xi3 on x3 x
(cost=0.00..80.25 rows=2400 width=4)
                    ->  Seq Scan on x1 x  (cost=0.00..34.00 rows=2400 width=4)
(13 rows)


So I'm a bit puzzled why Tom's min/max optimization bothers with the
whole Initplan/Result business itself anyways.

--

Yes, this is what I saw when I was testing this, and I think that it would be best to leave the decision of creating an initPlan/materialize to the optimization of the super-query.  That's why I didn't bother to specifically put an InitPlan on top of the Aggregate in the new patch.

Alan

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: WIP: Deferrable unique constraints
Следующее
От: Andrew Dunstan
Дата:
Сообщение: Re: [PATCH] SE-PgSQL/tiny rev.2193