Re: [HACKERS] Partition-wise aggregation/grouping

Поиск
Список
Период
Сортировка
Hi,

In this attached version, I have rebased my changes over new design of partially_grouped_rel. The preparatory changes of adding partially_grouped_rel are in 0001.

Also to minimize finalization code duplication, I have refactored them into two separate functions, finalize_sorted_partial_agg_path() and finalize_hashed_partial_agg_path(). I need to create these two functions as current path creation order in like,
    Sort Agg Path
    Sort Agg Path - Parallel Aware (Finalization needed here)
    Hash Agg Path
    Hash Agg Path - Parallel Aware (Finalization needed here)
And if we club those finalizations together, then path creation order will be changed and it may result in the existing plan changes.
Let me know if that's OK, I will merge them together as they are distinct anyways. These changes are part of 0002.

0003 - 0006 are refactoring patches as before.

0007 is the main patch per new design. I have removed create_partition_agg_paths() altogether as finalization code is reused. Also, renamed preferFullAgg with forcePartialAgg as we forcefully needed a partial path from nested level if the parent is doing a partial aggregation. add_single_path_to_append_rel() is no more exists and also there is no need to pass OtherUpperPathExtraData to add_paths_to_append_rel().

0008 - 0009, testcase and postgres_fdw changes.

Please have a look at new changes and let me know if I missed any.

Thanks


On Fri, Feb 2, 2018 at 7:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Feb 2, 2018 at 8:25 AM, Jeevan Chalke
<jeevan.chalke@enterprisedb.com> wrote:
>> The problem is that create_partition_agg_paths() is doing *exactly*
>> same thing that add_paths_to_grouping_rel() is already doing inside
>> the blocks that say if (grouped_rel->partial_pathlist).  We don't need
>> two copies of that code.  Both of those places except to take a
>> partial path that has been partially aggregated and produce a
>> non-partial path that is fully aggregated.  We do not need or want two
>> copies of that code.
>
> OK. Got it.
>
> Will try to find a common place for them and will also check how it goes
> with your suggested design change.
>
>> Here's another way to look at it.  We have four kinds of things.
>>
>> 1. Partially aggregated partial paths
>> 2. Partially aggregated non-partial paths
>> 3. Fully aggregated partial paths
>> 4. Fully aggregated non-partial paths

So in the new scheme I'm proposing, you've got a partially_grouped_rel
and a grouped_rel.  So all paths of type #1 go into
partially_grouped_rel->partial_pathlist, paths of type #2 go into
partially_grouped_rel->pathlist, type #3 (if we have any) goes into
grouped_rel->partial_pathlist, and type #4 goes into
grouped_rel->pathlist.

> add_paths_to_append_rel() -> generate_mergeappend_paths() does not consider
> partial_pathlist. Thus we will never see MergeAppend over parallel scan
> given by partial_pathlist. And thus plan like:
> -> Gather Merge
>   -> MergeAppend
> is not possible with current HEAD.
>
> Are you suggesting we should implement that here? I think that itself is a
> separate task.

Oh, I didn't realize that wasn't working already.  I agree that it's a
separate task from this patch, but it's really too bad that it doesn't
already work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: SSL test names
Следующее
От: Stephen Frost
Дата:
Сообщение: Re: PDF Builds on borka (Debian/stretch) broken - 9.6