Re: partition routing layering in nodeModifyTable.c

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: partition routing layering in nodeModifyTable.c
Дата
Msg-id 73fe6ce6-d2a1-bd4b-17cf-9deba4e91467@iki.fi
обсуждение исходный текст
Ответ на Re: partition routing layering in nodeModifyTable.c  (Amit Langote <amitlangote09@gmail.com>)
Ответы Re: partition routing layering in nodeModifyTable.c  (Amit Langote <amitlangote09@gmail.com>)
Список pgsql-hackers
On 13/07/2020 08:47, Amit Langote wrote:
> It's been over 11 months since there was any significant commentary on
> the contents of the patches themselves, so perhaps I should reiterate
> what the patches are about and why it might still be a good idea to
> consider them.
> 
> The thread started with some very valid criticism of the way
> executor's partition tuple routing logic looks randomly sprinkled over
> in nodeModifyTable.c, execPartition.c.  In the process of making it
> look less random, we decided to get rid of the global variable
> es_result_relation_info to avoid complex maneuvers of
> setting/resetting it correctly when performing partition tuple
> routing, causing some other churn beside the partitioning code.  Same
> with another global variable TransitionCaptureState.tcs_map.  So, the
> patches neither add any new capabilities, nor improve performance, but
> they do make the code in this area a bit easier to follow.
> 
> Actually, there is a problem that some of the changes here conflict
> with patches being discussed on other threads ([1], [2]), so much so
> that I decided to absorb some changes here into another "refactoring"
> patch that I have posted at [2].

Thanks for the summary. It's been a bit hard to follow what depends on 
what across these threads, and how they work together. It seems that 
this patch set is the best place to start.

> Attached rebased patches.
> 
> 0001 contains preparatory FDW API changes to stop relying on
> es_result_relation_info being set correctly.

Makes sense. The only thing I don't like about this is the way the 
ForeignScan->resultRelIndex field is set. make_foreignscan() initializes 
it to -1, and the FDW's PlanDirectModify() function is expected to set 
it, like you did in postgres_fdw:

> @@ -2319,6 +2322,11 @@ postgresPlanDirectModify(PlannerInfo *root,
>              rebuild_fdw_scan_tlist(fscan, returningList);
>      }
>  
> +    /*
> +     * Set the index of the subplan result rel.
> +     */
> +    fscan->resultRelIndex = subplan_index;
> +
>      table_close(rel, NoLock);
>      return true;
>  }

It has to be set to that value (subplan_index is an argument to 
PlanDirectModify()), the FDW doesn't have any choice there, so this is 
just additional boilerplate code that has to be copied to every FDW that 
implements direct modify. Furthermore, if the FDW doesn't set it 
correctly, you could have some very interesting results, like updating 
wrong table. It would be better to set it in make_modifytable(), just 
after calling PlanDirectModify().

I'm also a bit unhappy with the way it's updated in set_plan_refs():

> --- a/src/backend/optimizer/plan/setrefs.c
> +++ b/src/backend/optimizer/plan/setrefs.c
> @@ -904,6 +904,13 @@ set_plan_refs(PlannerInfo *root, Plan *plan, int rtoffset)
>                      rc->rti += rtoffset;
>                      rc->prti += rtoffset;
>                  }
> +                /*
> +                 * Caution: Do not change the relative ordering of this loop
> +                 * and the statement below that adds the result relations to
> +                 * root->glob->resultRelations, because we need to use the
> +                 * current value of list_length(root->glob->resultRelations)
> +                 * in some plans.
> +                 */
>                  foreach(l, splan->plans)
>                  {
>                      lfirst(l) = set_plan_refs(root,
> @@ -1243,6 +1250,14 @@ set_foreignscan_references(PlannerInfo *root,
>      }
>  
>      fscan->fs_relids = offset_relid_set(fscan->fs_relids, rtoffset);
> +
> +    /*
> +     * Adjust resultRelIndex if it's valid (note that we are called before
> +     * adding the RT indexes of ModifyTable result relations to the global
> +     * list)
> +     */
> +    if (fscan->resultRelIndex >= 0)
> +        fscan->resultRelIndex += list_length(root->glob->resultRelations);
>  }
>  
>  /*

That "Caution" comment is well deserved, but could we make this more 
robust to begin with? The most straightforward solution would be to pass 
down the "current resultRelIndex" as an extra parameter to 
set_plan_refs(), similar to rtoffset. If we did that, we wouldn't 
actually need to set it before setrefs.c processing at all.

I'm a bit wary of adding another argument to set_plan_refs() because 
that's a lot of code churn, but it does seem like the most natural 
solution to me. Maybe create a new context struct to hold the 
PlannerInfo, rtoffset, and the new "currentResultRelIndex" value, 
similar to fix_scan_expr_context, to avoid passing through so many 
arguments.


Another idea is to merge "resultRelIndex" and a "range table index" into 
one value. Range table entries that are updated would have a 
ResultRelInfo, others would not. I'm not sure if that would end up being 
cleaner or messier than what we have now, but might be worth trying.

> 0002 removes es_result_relation_info in favor passing the active
> result relation around as a parameter in the various functions that
> need it

Looks good.

> 0003 Moves UPDATE tuple-routing logic into a new function
> 
> 0004 removes the global variable TransitionCaptureState.tcs_map which
> needed to be set/reset whenever the active result relation relation
> changes in favor of a new field in ResultRelInfo to store the same map

I didn't look closely, but these make sense at a quick glance.

- Heikki




В списке pgsql-hackers по дате отправления:

Предыдущее
От: James Coleman
Дата:
Сообщение: Re: enable_incremental_sort changes query behavior
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: [HACKERS] Custom compression methods