Re: Parallel INSERT (INTO ... SELECT ...)

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: Parallel INSERT (INTO ... SELECT ...)
Дата
Msg-id CA+HiwqG01zK2GJBQHKOJ0CFxGXAM7EowmZwSycORtXQisf2PEg@mail.gmail.com
обсуждение исходный текст
Ответ на RE: Parallel INSERT (INTO ... SELECT ...)  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Ответы RE: Parallel INSERT (INTO ... SELECT ...)  ("tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com>)
Список pgsql-hackers
On Wed, Feb 10, 2021 at 5:03 PM tsunakawa.takay@fujitsu.com
<tsunakawa.takay@fujitsu.com> wrote:
> From: Amit Langote <amitlangote09@gmail.com>
> > On Wed, Feb 10, 2021 at 1:35 PM Greg Nancarrow <gregn4422@gmail.com>
> > wrote:
> > > There's no "second-guessing" involved here.
> > > There is no underlying way of dividing up the VALUES data of
> > > "INSERT...VALUES" amongst the parallel workers, even if the planner
> > > was updated to produce a parallel-plan for the "INSERT...VALUES" case
> > > (apart from the fact that spawning off parallel workers to insert that
> > > data would almost always result in worse performance than a
> > > non-parallel plan...)
> > > The division of work for parallel workers is part of the table AM
> > > (scan) implementation, which is not invoked for "INSERT...VALUES".
> >
> > I don't disagree that the planner would not normally assign a parallel
> > path simply to pull values out of a VALUES list mentioned in the
> > INSERT command, but deciding something based on the certainty of it in
> > an earlier planning phase seems odd to me.  Maybe that's just me
> > though.
>
> In terms of competitiveness, Oracle does not run INSERT VALUES in parallel:
>
>
https://docs.oracle.com/en/database/oracle/oracle-database/21/vldbg/types-parallelism.html#GUID-6626C70C-876C-47A4-8C01-9B66574062D8
>
> "The INSERT VALUES statement is never executed in parallel."
>
>
> And SQL Server doesn't either:
>
> https://docs.microsoft.com/en-us/sql/relational-databases/query-processing-architecture-guide?view=sql-server-ver15
>
> "Starting with SQL Server 2016 (13.x) and database compatibility level 130, the INSERT … SELECT statement can be
executedin parallel when inserting into heaps or clustered columnstore indexes (CCI), and using the TABLOCK hint." 

Just to be clear, I'm not suggesting that we should put effort into
making INSERT ... VALUES run in parallel.  I'm just raising my concern
about embedding the assumption in max_parallel_hazard() that it will
never make sense to do so.

Although, maybe there are other more pressing concerns to resolve, so
I will not insist too much on doing anything about this.

--
Amit Langote
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Markus Wanner
Дата:
Сообщение: Re: repeated decoding of prepared transactions
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: Parallel INSERT (INTO ... SELECT ...)