Re: Parallel Inserts in CREATE TABLE AS

Поиск
Список
Период
Сортировка
От vignesh C
Тема Re: Parallel Inserts in CREATE TABLE AS
Дата
Msg-id CALDaNm1y7rYZi9+-M738Ch=BHhM+C=V4saeXyxctO+NLdWdALQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Ответы Re: Parallel Inserts in CREATE TABLE AS  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
On Thu, Dec 24, 2020 at 1:07 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Thu, Dec 24, 2020 at 10:25 AM vignesh C <vignesh21@gmail.com> wrote:
> > You could change intoclause_len = strlen(intoclausestr) to
> > strlen(intoclausestr) + 1 and use intoclause_len in the remaining
> > places. We can avoid the +1 in the other places.
> > +       /* Estimate space for into clause for CTAS. */
> > +       if (IS_CTAS(intoclause) && OidIsValid(objectid))
> > +       {
> > +               intoclausestr = nodeToString(intoclause);
> > +               intoclause_len = strlen(intoclausestr);
> > +               shm_toc_estimate_chunk(&pcxt->estimator, intoclause_len + 1);
> > +               shm_toc_estimate_keys(&pcxt->estimator, 1);
> > +       }
>
> Done.
>
> > Can we use  node->nworkers_launched == 0 in place of
> > node->need_to_scan_locally, that way the setting and resetting of
> > node->need_to_scan_locally can be removed. Unless need_to_scan_locally
> > is needed in any of the functions that gets called.
> > +       /* Enable leader to insert in case no parallel workers were launched. */
> > +       if (node->nworkers_launched == 0)
> > +               node->need_to_scan_locally = true;
> > +
> > +       /*
> > +        * By now, for parallel workers (if launched any), would have
> > started their
> > +        * work i.e. insertion to target table. In case the leader is chosen to
> > +        * participate for parallel inserts in CTAS, then finish its
> > share before
> > +        * going to wait for the parallel workers to finish.
> > +        */
> > +       if (node->need_to_scan_locally)
> > +       {
>
> need_to_scan_locally is being set in ExecGather() even if
> nworkers_launched > 0 it can still be true, so I think we can not
> remove need_to_scan_locally in ExecParallelInsertInCTAS.
>
> Attaching v15 patch set for further review. Note that the change is
> only in 0001 patch, other patches remain unchanged from v14.
>

+-- parallel inserts must occur
+select explain_pictas(
+'create table parallel_write as select length(stringu1) from tenk1;');
+select count(*) from parallel_write;
+drop table parallel_write;

We can change comment  "parallel inserts must occur" like "parallel
insert must be selected for CTAS on normal table"

+-- parallel inserts must occur
+select explain_pictas(
+'create unlogged table parallel_write as select length(stringu1) from tenk1;');
+select count(*) from parallel_write;
+drop table parallel_write;

We can change comment "parallel inserts must occur" like "parallel
insert must be selected for CTAS on unlogged table"
Similar comment need to be handled in other places also.

+create function explain_pictas(text) returns setof text
+language plpgsql as
+$$
+declare
+    ln text;
+begin
+    for ln in
+        execute format('explain (analyze, costs off, summary off,
timing off) %s',
+            $1)
+    loop
+        ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers
Launched: N');
+        ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual
rows=N loops=N');
+        ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows
Removed by Filter: N');
+        return next ln;
+    end loop;
+end;
+$$;

The above function is same as function present in partition_prune.sql:
create function explain_parallel_append(text) returns setof text
language plpgsql as
$$
declare
    ln text;
begin
    for ln in
        execute format('explain (analyze, costs off, summary off,
timing off) %s',
            $1)
    loop
        ln := regexp_replace(ln, 'Workers Launched: \d+', 'Workers
Launched: N');
        ln := regexp_replace(ln, 'actual rows=\d+ loops=\d+', 'actual
rows=N loops=N');
        ln := regexp_replace(ln, 'Rows Removed by Filter: \d+', 'Rows
Removed by Filter: N');
        return next ln;
    end loop;
end;
$$;

If possible try to make a common function for both and use.

+       if (intoclausestr && OidIsValid(objectid))
+               fpes->objectid = objectid;
+       else
+               fpes->objectid = InvalidOid;
Here OidIsValid(objectid) check is not required intoclausestr will be
set only if OidIsValid.

Regards,
Vignesh
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: vignesh C
Дата:
Сообщение: Re: Parallel copy
Следующее
От: Zhihong Yu
Дата:
Сообщение: Re: pg_stat_statements and "IN" conditions