Re: MIN/MAX optimization for partitioned table

Поиск

Список

Период

Сортировка

От	Alan Li
Тема	Re: MIN/MAX optimization for partitioned table
Дата	20 июля 2009 г. 18:41:12
Msg-id	782056770907201140h6c449398n6fb2bf19c3305fa8@mail.gmail.com обсуждение исходный текст
Ответ на	Re: MIN/MAX optimization for partitioned table (Greg Stark <gsstark@mit.edu>)
Ответы	Re: MIN/MAX optimization for partitioned table (Greg Stark <gsstark@mit.edu>)
Список	pgsql-hackers

Дерево обсуждения

On Sat, Jul 18, 2009 at 12:04 PM, Greg Stark <gsstark@mit.edu> wrote:

On Sat, Jul 18, 2009 at 6:35 PM, Alan Li<ali@truviso.com> wrote:
> Yeah, are you running into the same issue as well? I tried to figure out a
> way around the O(n^2) behavior, but there were no existing direct
> relationship between the child subpath and its associated AppendRelInfo. I
> think an AppenRelInfo dynahash would work and just have it hash by the
> child_relid.

I don't see any place in my patch where I had to do anything like
this. I do have to loop through all the appendrel elements and skip
over the ones which don't belong to this appendrel which seems weird
to me but it isn't n^2.

Generating a normal Append node you're generating a brand new path for
each child and attaching them to the append path. It looks like you're
allowing the append rel to generate paths and then digging around
looking at those paths. I wonder if that's the right approach.

>> The other thing is it would be nice if we could avoid making separate
>> subplans for each child and instead make one for the whole structure
>> including the aggregate. It would at the very least make the explain
>> output prettier, but I think it would avoid repeated execution of the
>> aggregate in some cases as well.
>
> How would this plan look? I think the repeated execution of the aggregate
> comes from having to process the output of each child. So it's O(n)
> executions where n is the number of children, assuming each child has the
> appropriate index for this optimization.

No I just mean instead of generating

InitPlan 1
Limit
Index Scan
InitPlan 2
Limit
Index Scan
Aggregate
Append
Result
Result

I think it would be better to generate this:

InitPlan 1
Aggregate
Append
Limit 1
IndexScan
Limit 1
IndexScan
Result

The reason I think this is better is because if the subquery is
executed multiple times it will just return the one precalculated
result. In your plan it will have all the child minimums precalculated
but will still have to re-execute the aggregate and append node.
That's not terribly expensive but we might as well generate the
simpler more efficient plan.

> The other optimization that I thought was that the optimizer might use the
> check constraints (in the range partitioning) on the column that I want to
> do a MIN/MAX on. Then it'll only have to execute the subplan in one child
> partition and the parent partition. But it was more of a wild idea on my
> part, not sure if that's possible.

Yes, I think this is the long-term goal. That's the whole "better
representation of partitions" plotline. To do this efficiently the
planner should know what the partition key is and what the
partitioning structure is.

In an ideal world we would be able to collapse the whole structure and
eliminate the append relation entirely. To do that we need some way to
indicate that the parent relation is provably empty. I had in mind to
mark it as read-only and the statistics as authoritative since that
seems more useful than just being able to mark it empty. I think there
are a lot of other interesting things we could do with statistics if
we knew they were authoritative for a read-only table. (The read-only
property we would need here is very strong. There would have to be a
vacuum freeze and moreover we would have to know that all tuples were
successfully frozen.)

--
greg
http://mit.edu/~gsstark/resume.pdf

Attached is an updated patch that removes the O(n^2) behavior and the awkwardness of optimizing the seqscan path as the plan was about to be created. Now, the optimization is considered when appendrel is generating the paths.

I also changed the plan as you had suggested. It now looks like this:

postgres=# explain select max(b) from foo_archive;
                                                                      QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1.22..1.23 rows=1 width=8)
   -> Append (cost=0.00..1.13 rows=32 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_idx on foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_01_idx on foo_archive_2007_01_01 foo_archive (cost=0.00..2723.24 rows=86399 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_02_idx on foo_archive_2007_01_02 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_03_idx on foo_archive_2007_01_03 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_04_idx on foo_archive_2007_01_04 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_05_idx on foo_archive_2007_01_05 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_06_idx on foo_archive_2007_01_06 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_07_idx on foo_archive_2007_01_07 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_08_idx on foo_archive_2007_01_08 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_09_idx on foo_archive_2007_01_09 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_10_idx on foo_archive_2007_01_10 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_11_idx on foo_archive_2007_01_11 foo_archive (cost=0.00..2719.26 rows=86400 width=8)
         -> Limit (cost=0.00..0.03 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_12_idx on foo_archive_2007_01_12 foo_archive (cost=0.00..1568.27 rows=49601 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_13_idx on foo_archive_2007_01_13 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_14_idx on foo_archive_2007_01_14 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_15_idx on foo_archive_2007_01_15 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_16_idx on foo_archive_2007_01_16 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_17_idx on foo_archive_2007_01_17 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_18_idx on foo_archive_2007_01_18 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_19_idx on foo_archive_2007_01_19 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_20_idx on foo_archive_2007_01_20 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_21_idx on foo_archive_2007_01_21 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_22_idx on foo_archive_2007_01_22 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_23_idx on foo_archive_2007_01_23 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_24_idx on foo_archive_2007_01_24 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_25_idx on foo_archive_2007_01_25 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_26_idx on foo_archive_2007_01_26 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_27_idx on foo_archive_2007_01_27 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_28_idx on foo_archive_2007_01_28 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_29_idx on foo_archive_2007_01_29 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_30_idx on foo_archive_2007_01_30 foo_archive (cost=0.00..73.35 rows=1940 width=8)
         -> Limit (cost=0.00..0.04 rows=1 width=8)
               -> Index Scan Backward using foo_archive_2007_01_31_idx on foo_archive_2007_01_31 foo_archive (cost=0.00..73.35 rows=1940 width=8)
(66 rows)

Вложения

optimize_append_minmax_1.patch

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Alvaro Herrera
Дата: 20 июля 2009 г., 18:36:47
Сообщение: Re: fmgroids.h not installed by "make install" in VPATH

Следующее

От: Peter Eisentraut
Дата: 20 июля 2009 г., 18:53:14
Сообщение: Re: [PATCH] SE-PgSQL/tiny rev.2193

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: MIN/MAX optimization for partitioned table

Вложения

Предыдущее

Следующее