Re: Parallel Inserts in CREATE TABLE AS

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Parallel Inserts in CREATE TABLE AS
Дата
Msg-id CAA4eK1KWSVYn8hxnTKZfcKiHxhqmj842V1tK9f88mYMCGgGxtQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Ответы Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
On Wed, Oct 14, 2020 at 2:46 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, Oct 6, 2020 at 10:58 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > While skimming through the patch, a small thing I noticed:
> > + /*
> > + * SELECT part of the CTAS is parallelizable, so we can make
> > + * each parallel worker insert the tuples that are resulted
> > + * in it's execution into the target table.
> > + */
> > + if (!is_matview &&
> > + IsA(plan->planTree, Gather))
> > + ((DR_intorel *) dest)->is_parallel = true;
> > +
> >
> > I am not sure at this stage if this is the best way to make CTAS as
> > parallel but if so, then probably you can expand the comments a bit to
> > say why you consider only Gather node (and that too when it is the
> > top-most node) and why not another parallel node like GatherMerge?
> >
>
> If somebody expects to preserve the order of the tuples that are
> coming from GatherMerge node of the select part in CTAS or SELECT INTO
> while inserting, now if parallelism is allowed, that may not be the
> case i.e. the order of insertion of tuples may vary. I'm not quite
> sure, if someone wants to use order by in the select parts of CTAS or
> SELECT INTO in a real world use case. Thoughts?
>

I think there is no reason why one can't use ORDER BY in the
statements we are talking about here. But, I think we can't enable
parallelism for GatherMerge is because for that node we always need to
fetch the data in the leader backend to perform the final merge phase.
So, I was expecting a small comment saying something on those lines.

>
> >
> > Need to analyze whether to allow parallelism if CTAS has prepared statements or with no data.
> >
>
> For prepared statements, the parallelism will not be picked and so is
> parallel insertion.
>

Hmm, I am not sure what makes you say this statement. The parallelism
is enabled for prepared statements since commit 57a6a72b6b.

>
> I'm listing the things that are still pending.
>
> 1. How to represent the parallel insert for CTAS in explain plans? The
> explain CTAS shows the plan for only the SELECT part. How about having
> some textual info along with the Gather node? I'm not quite sure on
> this point, any suggestions are welcome.
>

I am also not sure about this point because we don't display anything
for the DDL part in explain. Can you propose by showing some example
of what you have in mind?

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Ajin Cherian
Дата:
Сообщение: Re: [HACKERS] logical decoding of two-phase transactions
Следующее
От: vignesh C
Дата:
Сообщение: Re: Parallel copy