Re: Parallel Inserts in CREATE TABLE AS

Поиск

Список

Период

Сортировка

От	Bharath Rupireddy
Тема	Re: Parallel Inserts in CREATE TABLE AS
Дата	19 октября 2020 г. 17:17:01
Msg-id	CALj2ACU8CHmE20dJhnr125s1oCugtgUwJSUbsLkJi3=ymnMKyQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Parallel Inserts in CREATE TABLE AS (Amit Kapila <amit.kapila16@gmail.com>)
Ответы	Re: Parallel Inserts in CREATE TABLE AS
Список	pgsql-hackers

Дерево обсуждения

On Thu, Oct 15, 2020 at 3:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > > > 1. How to represent the parallel insert for CTAS in explain plans? The
> > > > explain CTAS shows the plan for only the SELECT part. How about having
> > > > some textual info along with the Gather node? I'm not quite sure on
> > > > this point, any suggestions are welcome.
> > >
> > > I am also not sure about this point because we don't display anything
> > > for the DDL part in explain. Can you propose by showing some example
> > > of what you have in mind?
> >
> > I thought we could have something like this.
> > -----------------------------------------------------------------------------
> > Gather (cost=1000.00..108738.90 rows=0 width=8)
> > Workers Planned: 2 Parallel Insert on t_test1
> > -> Parallel Seq Scan on t_test (cost=0.00..106748.00 rows=4954 width=8)
> > Filter: (many < 10000)
> > -----------------------------------------------------------------------------
>
> maybe something like below:
> Gather (cost=1000.00..108738.90 rows=0 width=8)
> -> Create t_test1
> -> Parallel Seq Scan on t_test
>
> I don't know what is the best thing to do here. I think for the
> temporary purpose you can keep something like above then once the
> patch is matured then we can take a separate opinion for this.
>

Agreed. Here's a snapshot of explain with the change suggested.

postgres=# EXPLAIN (ANALYZE, COSTS OFF) CREATE TABLE t1_test AS SELECT * FROM t1;
QUERY PLAN
---------------------------------------------------------------------------------
Gather (actual time=970.524..972.913 rows=0 loops=1)
-> Create t1_test
Workers Planned: 2
Workers Launched: 2
-> Parallel Seq Scan on t1 (actual time=0.028..86.623 rows=333333 loops=3)
Planning Time: 0.049 ms
Execution Time: 973.733 ms

>
> I think there is no reason why one can't use ORDER BY in the
> statements we are talking about here. But, I think we can't enable
> parallelism for GatherMerge is because for that node we always need to
> fetch the data in the leader backend to perform the final merge phase.
> So, I was expecting a small comment saying something on those lines.
>

Added comments.

> 2. Addition of new test cases.

Added new test cases.

> Analysis on the 2 mismatches in write_parallel.sql regression test.

Done. It needed a small code change in costsize.c. Now, both make check and make check-world passes.

Apart from above, a couple of other things I have finished with the v3 patch.

1. Both make check and make check-world with force_parallel_mode = regress passes.

2. I enabled parallel inserts in case of materialized views. Hope that's fine.

Attaching v3 patch herewith.

I'm done with all the open points in my list. Please review the v3 patch and provide comments.

With Regards,

Bharath Rupireddy.

EnterpriseDB: http://www.enterprisedb.com

Вложения

v3-0001-Parallel-Inserts-in-CREATE-TABLE-AS.patch

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Parallel Inserts in CREATE TABLE AS

Вложения