Re: Parallel INSERT (INTO ... SELECT ...)

Поиск
Список
Период
Сортировка
От Greg Nancarrow
Тема Re: Parallel INSERT (INTO ... SELECT ...)
Дата
Msg-id CAJcOf-e_tzVzF+kp7HDM+UMNrfM+DUL0Ck3boOdp-DDygt5Jww@mail.gmail.com
обсуждение исходный текст
Ответ на RE: Parallel INSERT (INTO ... SELECT ...)  ("Hou, Zhijie" <houzj.fnst@cn.fujitsu.com>)
Список pgsql-hackers
On Mon, Feb 1, 2021 at 8:19 PM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>
>
> When developing the reloption patch, I noticed some issues in the patch.
>
> 1).
> > - Reduce Insert parallel-safety checks required for some SQL, by noting
> > that the subquery must operate on a relation (check for RTE_RELATION in
> > subquery range-table)
>
> +                       foreach(lcSub, rte->subquery->rtable)
> +                       {
> +                               rteSub = lfirst_node(RangeTblEntry, lcSub);
> +                               if (rteSub->rtekind == RTE_RELATION)
> +                               {
> +                                       hasSubQueryOnRelation = true;
> +                                       break;
> +                               }
> +                       }
> It seems we can not only search RTE_RELATION in rtable,
> because RTE_RELATION may exist in other place like:
>
> ---
> --** explain insert into target select (select * from test);
>         Subplan's subplan
>
> --** with cte as (select * from test) insert into target select * from cte;
>         In query's ctelist.
> ---
>
> May be we should use a walker function [1] to
> search the subquery and ctelist.
>

Yes, the current checks are too simple, as you point out, there seem
to be more complex cases that it doesn't pick up. Unfortunately
expanding the testing for them does detract from the original
intention of this code (which was to avoid extra parallel-safety check
processing on code which can't be run in parallel). I guess the
relation walker function should additionally check for SELECT queries
only (commandType == CMD_SELECT), and exclude SELECT FOR UPDATE/SHARE
(rowMarks != NIL) too. I'll need to look further into it, but will
certainly update the code for the next version of the patch.

> 2).
>
> +--
> +-- Test INSERT into temporary table with underlying query.
> +-- (should not use a parallel plan)
> +--
>
> May be the comment here need some change since
> we currently support parallel plan for temp table.
>

Thanks, it should say something like "should create the plan with
INSERT + parallel SELECT".

> 3)
> Do you think we can add a testcase for foreign-table ?
> To test parallel query with serial insert on foreign table.
>

I have intended to do it, but as a lower-priority task.

>
> [1]
> static bool
> relation_walker(Node *node)
> {
>         if (node == NULL)
>                 return false;
>
>         else if (IsA(node, RangeTblEntry))
>         {
>                 RangeTblEntry *rte = (RangeTblEntry *) node;
>                 if (rte->rtekind == RTE_RELATION)
>                         return true;
>
>                 return false;
>         }
>
>         else if (IsA(node, Query))
>         {
>                 Query      *query = (Query *) node;
>
>                 /* Recurse into subselects */
>                 return query_tree_walker(query, relation_walker,
>                                                                  NULL, QTW_EXAMINE_RTES_BEFORE);
>         }
>
>         /* Recurse to check arguments */
>         return expression_tree_walker(node,
>                                                                   relation_walker,
>                                                                   NULL);
> }
>

Regards,
Greg Nancarrow
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Single transaction in the tablesync worker?
Следующее
От: Yugo NAGATA
Дата:
Сообщение: Re: [PATCH] Add extra statistics to explain for Nested Loop