Re: Parallel Inserts in CREATE TABLE AS

From Dilip Kumar
Subject Re: Parallel Inserts in CREATE TABLE AS
Msg-id CAFiTN-uXKA8ZnRvPRDhp-yft-3W-TJ_7R8b9DzoLZzWa7rThtQ@mail.gmail.com
In response to Re: Parallel Inserts in CREATE TABLE AS  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Fri, Dec 25, 2020 at 10:04 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Dec 25, 2020 at 9:54 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> >
> > On Fri, Dec 25, 2020 at 7:12 AM vignesh C <vignesh21@gmail.com> wrote:
> > > On Thu, Dec 24, 2020 at 11:29 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Thu, Dec 24, 2020 at 10:25 AM vignesh C <vignesh21@gmail.com> wrote:
> > > > >
> > > > > On Tue, Dec 22, 2020 at 2:16 PM Bharath Rupireddy
> > > > > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > > > > >
> > > > > > On Tue, Dec 22, 2020 at 12:32 PM Bharath Rupireddy
> > > > > > Attaching v14 patch set that has above changes. Please consider this
> > > > > > for further review.
> > > > > >
> > > > >
> > > > > Few comments:
> > > > > In the below case, should create be above Gather?
> > > > > postgres=# explain  create table t7 as select * from t6;
> > > > >                             QUERY PLAN
> > > > > -------------------------------------------------------------------
> > > > >  Gather  (cost=0.00..9.17 rows=0 width=4)
> > > > >    Workers Planned: 2
> > > > >  ->  Create t7
> > > > >    ->  Parallel Seq Scan on t6  (cost=0.00..9.17 rows=417 width=4)
> > > > > (4 rows)
> > > > >
> > > > > Can we change it to something like:
> > > > > -------------------------------------------------------------------
> > > > > Create t7
> > > > >  -> Gather  (cost=0.00..9.17 rows=0 width=4)
> > > > >   Workers Planned: 2
> > > > >   ->  Parallel Seq Scan on t6  (cost=0.00..9.17 rows=417 width=4)
> > > > > (4 rows)
> > > > >
> > > >
> > > > I think it is better to have it in a way as in the current patch
> > > > because that reflects that we are performing insert/create below
> > > > Gather which is the purpose of this patch. I think this is similar to
> > > > what the Parallel Insert patch [1] has for a similar plan.
> > > >
> > > >
> > > > [1] - https://commitfest.postgresql.org/31/2844/
> > > >
> > >
> > > Also, another thing I felt was that the Gather node will actually do
> > > the insert operation; the Create table will be done earlier itself.
> > > Should we change Create table to Insert table, something like below:
> > >
> > >                              QUERY PLAN
> > > -------------------------------------------------------------------
> > >  Gather  (cost=0.00..9.17 rows=0 width=4)
> > >    Workers Planned: 2
> > >  ->  Insert table2 (instead of Create table2)
> > >    ->  Parallel Seq Scan on table1  (cost=0.00..9.17 rows=417 width=4)
> >
> > IMO, showing Insert under Gather makes sense if the query is INSERT
> > INTO SELECT, as it is in the other patch [1]. Since this is a CTAS
> > query, having Create under Gather looks fine to me. This way we can
> > also distinguish the EXPLAINs of parallel inserts in INSERT INTO
> > SELECT and CTAS.
> >
>
> Right, IIRC, we have done the way it is in the patch for convenience
> and to move forward with it and come back to it later once all other
> parts of the patch are good.
>
> > Also, some might wonder whether Create under Gather means that each
> > parallel worker is creating the table; it's actually not the creation
> > of the table that's parallelized but the insertion. If required, we
> > can clarify it in CTAS docs with a sample EXPLAIN. I have not yet
> > added docs related to allowing parallel inserts in CTAS. Shall I add a
> > para saying when parallel inserts can be picked and how the sample
> > EXPLAIN looks? Thoughts?
> >
>
> Yeah, I don't see any problem with it, and maybe we can move the Explain
> related code to a separate patch. The reason is we don't display the DDL
> part without parallelism and this might need a separate discussion.
>

This makes sense to me.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


