Re: [HACKERS] Partition-wise aggregation/grouping

Поиск

Список

Период

Сортировка

От	Jeevan Chalke
Тема	Re: [HACKERS] Partition-wise aggregation/grouping
Дата	2 января 2018 г. 13:06:36
Msg-id	CAM2+6=Uqc1FSaGcD7WsLmMmpvY8WirwUu0eNXjOzABsG2orXMg@mail.gmail.com обсуждение исходный текст
Ответ на	Re: [HACKERS] Partition-wise aggregation/grouping (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Список	pgsql-hackers

Дерево обсуждения

Attached patchset with all the review comments reported so far.

On Fri, Dec 1, 2017 at 6:11 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

Continuing with review of 0007.

+
+ /* Copy input rels's relids to grouped rel */
+ grouped_rel->relids = input_rel->relids;

Isn't this done in fetch_upper_rel()? Why do we need it here?
There's also a similar hunk in create_grouping_paths() which doesn't look
appropriate. I guess, you need relids in grouped_rel->relids for FDW. There are
two ways to do this: 1. set grouped_rel->relids for parent upper rel as well,
but then we should pass relids to fetch_upper_rel() instead of setting those
later. 2. For a parent upper rel, in create_foreignscan_plan(), set relids to
all_baserels, if upper_rel->relids is NULL and don't set relids for a parent
upper rel. I am fine with either of those.

Done. Opted second option.

+ /* partial phase */
+ get_agg_clause_costs(root, (Node *) partial_target->exprs,
+ AGGSPLIT_INITIAL_SERIAL,
+ &agg_partial_costs);

IIUC, the costs for evaluating aggregates would not change per child. They
won't be different for parent and any of the children. So, we should be able to
calculate those once, save in "extra" and use repeatedly.

Yep. Done.

+ if (can_sort)
+ {
+ Path *path = cheapest_path;
+
+ if (!(pathkeys_contained_in(root->group_pathkeys,
+ path->pathkeys)))
[ .. clipped patch .. ]
+ NIL,
+ dNumGroups));
+ }

We create two kinds of paths partial paths for parallel query and partial
aggregation paths when group keys do not have partition keys. The comments and
code uses partial to mean both the things, which is rather confusing. May be we
should use term "partial aggregation" explicitly wherever it means that in
comments and in variable names.

Agree. Used "partial aggregation" wherever applicable. Let me know if you see any other place need this adjustments.

I still feel that create_grouping_paths() and create_child_grouping_paths()
have a lot of common code. While I can see that there are some pockets in
create_grouping_paths() which are not required in create_child_grouping_paths()
and vice-versa, may be we should create only one function and move those
pockets under "if (child)" or "if (parent)" kind of conditions. It will be a
maintenance burden to keep these two functions in sync in future. If we do not
keep them in sync, that will introduce bugs.

Agree that keeping these two functions in sync in future will be a maintenance burden, but I am not yet sure how to refactor them cleanly.

Will give one more try and update those changes in the next patchset.

+
+/*
+ * create_partition_agg_paths
+ *
+ * Creates append path for all the partitions and adds into the grouped rel.

I think you want to write "Creates an append path containing paths from all the
child grouped rels and adds into the given parent grouped rel".

Reworded as you said.

+ * For partial mode we need to add a finalize agg node over append path before
+ * adding a path to grouped rel.
+ */
+void
+create_partition_agg_paths(PlannerInfo *root,
+ RelOptInfo *grouped_rel,
+ List *subpaths,
+ List *partial_subpaths,

Why do we have these two as separate arguments? I don't see any call to
create_partition_agg_paths() through add_append_path() passing both of them
non-NULL simultaneously. May be you want use a single list subpaths and another
boolean indicating whether it's list of partial paths or regular paths.

After redesigning in the area of putting gather over append, I don't need to pass all Append subpaths to this function at-all. Append is done by add_paths_to_append_rel() itself. This function now just adds fanalization steps as needed.

So, we don't have two lists now. And to know about partial paths, passed a boolean instead. Please have a look and let me know if I missed any.

+
+ /* For non-partial path, just create a append path and we are done. */
This is the kind of confusion, I am talking about above. Here you have
mentioned "non-partial path" which may mean a regular path but what you
actually mean by that term is a path representing partial aggregates.

+ /*
+ * Partial partition-wise grouping paths. Need to create final
+ * aggregation path over append relation.
+ *
+ * If there are partial subpaths, then we need to add gather path before we
+ * append these subpaths.

More confusion here.

Hopefully no more confusion in this new version.

+ */
+ if (partial_subpaths)
+ {
+ ListCell *lc;
+
+ Assert(subpaths == NIL);
+
+ foreach(lc, partial_subpaths)
+ {
+ Path *path = lfirst(lc);
+ double total_groups = path->rows * path->parallel_workers;
+
+ /* Create gather paths for partial subpaths */
+ Path *gpath = (Path *) create_gather_path(root, grouped_rel, path,
+ path->pathtarget, NULL,
+ &total_groups);
+
+ subpaths = lappend(subpaths, gpath);

Using the argument variable is confusing and that's why you need two different
List variables. Instead probably you could have another variable local to this
function to hold the gather subpaths.

Done.

AFAIU, the Gather paths that this code creates has its parent set to
parent grouped
rel. That's not correct. These partial paths come from children of grouped rel
and each gather is producing rows corresponding to one children of grouped rel.
So gather path's parent should be set to corresponding child and not parent
grouped rel.

Yep.

This code creates plans where there are multiple Gather nodes under an Append
node. AFAIU, the workers assigned to one gather node can be reused until that
Gather node finishes. Having multiple Gather nodes under an Append mean that
every worker will be idle from the time that worker finishes the work till the
last worker finishes the work. That doesn't seem to be optimal use of workers.
The plan that we create with Gather on top of Append seems to be better. So, we
should avoid creating one Gather node per child plans. Have we tried to compare
performance of these two plans?

Agree. Having Gather on top of the Append is better. Done that way. It resolves your previous comment too.

+ if (!IsA(apath, MergeAppendPath) && root->group_pathkeys)
+ {
+ spath = (Path *) create_sort_path(root,
+ grouped_rel,
+ apath,
+ root->group_pathkeys,
+ -1.0);
+ }

The code here assumes that a MergeAppend path will always have pathkeys
matching group_pathkeys. I believe that's true but probably we should have an
Assert to make it clear and add comments. If that's not true, we will need to
sort the output of MergeAppend OR discard MergeAppend paths which do not have
pathkeys matching group_pathkeys.

Oops. Thanks for pointing out that. You are correct.

Added relevant check which checks for required pathkeys present or not.

diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index b422050..1941468 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2345,6 +2345,7 @@ UnlistenStmt
UnresolvedTup
UnresolvedTupData
UpdateStmt
+UpperPathExtraData
UpperRelationKind
UpperUniquePath
UserAuth

Do we commit this file as part of the feature?
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

This patchset contains fixes for other review comments too.

Thanks

Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Вложения

partition-wise-agg-v9.tar.gz

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Mithun Cy
Дата: 02 января 2018 г., 12:09:16
Сообщение: Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager

Следующее

От: Jeevan Chalke
Дата: 02 января 2018 г., 13:30:34
Сообщение: Re: [HACKERS] Partition-wise aggregation/grouping

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: [HACKERS] Partition-wise aggregation/grouping

Вложения

Предыдущее

Следующее