Re: Parallel INSERT (INTO ... SELECT ...)

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: Parallel INSERT (INTO ... SELECT ...)
Дата
Msg-id CA+HiwqFU13k43Q_gPJHocmjc8wsmQBgyqjo6-cVjKx4gUH+V3A@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Ответы Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Список pgsql-hackers
On Tue, Feb 9, 2021 at 10:30 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
> On Tue, Feb 9, 2021 at 1:04 AM Amit Langote <amitlangote09@gmail.com> wrote:
> > That brings me to to this part of the hunk:
> >
> > +   /*
> > +    * If there is no underlying SELECT, a parallel table-modification
> > +    * operation is not possible (nor desirable).
> > +    */
> > +   hasSubQuery = false;
> > +   foreach(lc, parse->rtable)
> > +   {
> > +       rte = lfirst_node(RangeTblEntry, lc);
> > +       if (rte->rtekind == RTE_SUBQUERY)
> > +       {
> > +           hasSubQuery = true;
> > +           break;
> > +       }
> > +   }
> > +   if (!hasSubQuery)
> > +       return PROPARALLEL_UNSAFE;
> >
> > The justification for this given in:
> >
> > https://www.postgresql.org/message-id/CAJcOf-dF9ohqub_D805k57Y_AuDLeAQfvtaax9SpwjTSEVdiXg%40mail.gmail.com
> >
> > seems to be that the failure of a test case in
> > partition-concurrent-attach isolation suite is prevented if finding no
> > subquery RTEs in the query is flagged as parallel unsafe, which in
> > turn stops max_parallel_hazard_modify() from locking partitions for
> > safety checks in such cases.  But it feels unprincipled to have this
> > code to work around a specific test case that's failing.  I'd rather
> > edit the failing test case to disable parallel execution as
> > Tsunakawa-san suggested.
> >
>
> The code was not changed because of the test case (though it was
> fortunate that the test case worked after the change).

Ah, I misread then, sorry.

> The code check that you have identified above ensures that the INSERT
> has an underlying SELECT, because the planner won't (and shouldn't
> anyway) generate a parallel plan for INSERT...VALUES, so there is no
> point doing any parallel-safety checks in this case.
> It just so happens that the problem test case uses INSERT...VALUES -
> and it shouldn't have triggered the parallel-safety checks for
> parallel INSERT for this case anyway, because INSERT...VALUES can't
> (and shouldn't) be parallelized.

AFAICS, max_parallel_hazard() path never bails from doing further
safety checks based on anything other than finding a query component
whose hazard level crosses context->max_interesting.  You're trying to
add something that bails based on second-guessing that a parallel path
would not be chosen, which I find somewhat objectionable.

If the main goal of bailing out is to avoid doing the potentially
expensive modification safety check on the target relation, maybe we
should try to somehow make the check less expensive.  I remember
reading somewhere in the thread about caching the result of this check
in relcache, but haven't closely studied the feasibility of doing so.

--
Amit Langote
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andy Fan
Дата:
Сообщение: Re: Keep notnullattrs in RelOptInfo (Was part of UniqueKey patch series)
Следующее
От: Greg Nancarrow
Дата:
Сообщение: Re: Parallel INSERT (INTO ... SELECT ...)