Re: Parallel Inserts in CREATE TABLE AS

Поиск
Список
Период
Сортировка
От Bharath Rupireddy
Тема Re: Parallel Inserts in CREATE TABLE AS
Дата
Msg-id CALj2ACU8CHmE20dJhnr125s1oCugtgUwJSUbsLkJi3=ymnMKyQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel Inserts in CREATE TABLE AS  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
On Thu, Oct 15, 2020 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > > > 1. How to represent the parallel insert for CTAS in explain plans? The
> > > > explain CTAS shows the plan for only the SELECT part. How about having
> > > > some textual info along with the Gather node? I'm not quite sure on
> > > > this point, any suggestions are welcome.
> > >
> > > I am also not sure about this point because we don't display anything
> > > for the DDL part in explain. Can you propose by showing some example
> > > of what you have in mind?
> >
> > I thought we could have something like this.
> >  -----------------------------------------------------------------------------
> >      Gather  (cost=1000.00..108738.90 rows=0 width=8)
> >      Workers Planned: 2 Parallel Insert on t_test1
> >         ->  Parallel Seq Scan on t_test  (cost=0.00..106748.00 rows=4954 width=8)
> >              Filter: (many < 10000)
> >  -----------------------------------------------------------------------------
>
> maybe something like below:
> Gather  (cost=1000.00..108738.90 rows=0 width=8)
>    -> Create t_test1
>        ->  Parallel Seq Scan on t_test
>
> I don't know what is the best thing to do here. I think for the
> temporary purpose you can keep something like above then once the
> patch is matured then we can take a separate opinion for this.
>

Agreed. Here's a snapshot of explain with the change suggested.

postgres=# EXPLAIN (ANALYZE, COSTS OFF) CREATE TABLE t1_test AS SELECT * FROM t1;
                                   QUERY PLAN                            
---------------------------------------------------------------------------------
 Gather (actual time=970.524..972.913 rows=0 loops=1)
   ->  Create t1_test
     Workers Planned: 2
     Workers Launched: 2
     ->  Parallel Seq Scan on t1 (actual time=0.028..86.623 rows=333333 loops=3)
 Planning Time: 0.049 ms
 Execution Time: 973.733 ms

>
> I think there is no reason why one can't use ORDER BY in the
> statements we are talking about here. But, I think we can't enable
> parallelism for GatherMerge is because for that node we always need to
> fetch the data in the leader backend to perform the final merge phase.
> So, I was expecting a small comment saying something on those lines.
>

Added comments.

>
> 2. Addition of new test cases.
>

Added new test cases.

>
> Analysis on the 2 mismatches in write_parallel.sql regression test.
>

Done. It needed a small code change in costsize.c. Now, both make check and make check-world passes.

Apart from above, a couple of other things I have finished with the v3 patch.

1. Both make check and make check-world with force_parallel_mode = regress passes.
2. I enabled parallel inserts in case of materialized views. Hope that's fine.

Attaching v3 patch herewith.

I'm done with all the open points in my list. Please review the v3 patch and provide comments.

With Regards,
Bharath Rupireddy.
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Justin Pryzby
Дата:
Сообщение: Re: Probable documentation errors or improvements
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Probable documentation errors or improvements