Re: Parallel Inserts in CREATE TABLE AS

From: Bharath Rupireddy
Subject: Re: Parallel Inserts in CREATE TABLE AS
Msg-id: CALj2ACW+5RK+rLrcH_V1KQmkkaiKECEodr8o9Fp6NF8z+3282A@mail.gmail.com
In reply to: RE: Parallel Inserts in CREATE TABLE AS ("Hou, Zhijie" <houzj.fnst@cn.fujitsu.com>)
Responses: RE: Parallel Inserts in CREATE TABLE AS ("Hou, Zhijie" <houzj.fnst@cn.fujitsu.com>)
List: pgsql-hackers

On Wed, Jan 6, 2021 at 11:30 AM Hou, Zhijie <houzj.fnst@cn.fujitsu.com> wrote:
>
> > > I think it makes sense.
> > >
> > > And if the check 'ins_cmd == xxx1 || ins_cmd == xxx2' may be used in
> > > several places, how about defining a generic function, with a comment
> > > describing its purpose?
> > >
> > > An example in INSERT INTO SELECT patch:
> > > +/*
> > > + * IsModifySupportedInParallelMode
> > > + *
> > > + * Indicates whether execution of the specified table-modification command
> > > + * (INSERT/UPDATE/DELETE) in parallel-mode is supported, subject to certain
> > > + * parallel-safety conditions.
> > > + */
> > > +static inline bool
> > > +IsModifySupportedInParallelMode(CmdType commandType)
> > > +{
> > > +       /* Currently only INSERT is supported */
> > > +       return (commandType == CMD_INSERT);
> > > +}
> >
> > The intention of the assert is to verify that those functions are called for
> > appropriate commands such as CTAS, Refresh Mat View and so on, with correct
> > parameters. I really don't think we can replace the assert with a function
> > like the one above; in release mode the assertion is a no-op (effectively
> > always true), so it serves debugging purposes only. I also think that since
> > we, as the callers, know when to call those new functions, we can even remove
> > the assertions if they are really a problem here. Thoughts?
> Hi
>
> Thanks for the explanation.
>
> If the command type check is only used in the assert, I think you are right.
> I suggested a new function because I guessed the check could also be used in other places,
> such as:
>
> +               /* Okay to parallelize inserts, so mark it. */
> +               if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> +                       ((DR_intorel *) dest)->is_parallel = true;
>
> +               if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> +                       ((DR_intorel *) dest)->is_parallel = false;

In the place above, we need to know exactly which command is being
executed in order to cast dest to the right type and set is_parallel,
because is_parallel is being added to the respective command-specific
structures, not to the generic _DestReceiver structure. So, in the
future, the above code would become something like this:

+    /* Okay to parallelize inserts, so mark it. */
+    if (ins_cmd == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
+        ((DR_intorel *) dest)->is_parallel = true;
+    else if (ins_cmd == PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW)
+        ((DR_transientrel *) dest)->is_parallel = true;
+    else if (ins_cmd == PARALLEL_INSERT_CMD_COPY_TO)
+        ((DR_copy *) dest)->is_parallel = true;
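
To illustrate why the cast has to be command-specific, here is a minimal standalone sketch. It is not PostgreSQL code: the struct layouts, the enum type name ParallelInsertCmdKind, and the helper mark_parallel() are all hypothetical stand-ins; only the PARALLEL_INSERT_CMD_* constants, the struct names, and the is_parallel field come from the discussion above. The point is that the generic receiver has no is_parallel member, and each concrete receiver may store it at a different offset, so the command kind must select the cast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum type name; the constants mirror the ones in the patch. */
typedef enum ParallelInsertCmdKind
{
    PARALLEL_INSERT_CMD_UNDEF = 0,
    PARALLEL_INSERT_CMD_CREATE_TABLE_AS,
    PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW
} ParallelInsertCmdKind;

/* Stand-in for the generic receiver: it has no is_parallel field. */
typedef struct DestReceiver
{
    int         mydest;
} DestReceiver;

/* Stand-ins for the concrete receivers; layouts are invented for the sketch. */
typedef struct DR_intorel
{
    DestReceiver pub;           /* generic part must come first */
    bool        is_parallel;
} DR_intorel;

typedef struct DR_transientrel
{
    DestReceiver pub;           /* generic part must come first */
    int         other_field;    /* shifts is_parallel to another offset */
    bool        is_parallel;
} DR_transientrel;

/*
 * Set is_parallel on the concrete receiver that matches ins_cmd.
 * Returns false when the command kind has no parallel-insert receiver.
 */
static bool
mark_parallel(DestReceiver *dest, ParallelInsertCmdKind ins_cmd, bool value)
{
    assert(dest != NULL);

    switch (ins_cmd)
    {
        case PARALLEL_INSERT_CMD_CREATE_TABLE_AS:
            ((DR_intorel *) dest)->is_parallel = value;
            return true;
        case PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW:
            ((DR_transientrel *) dest)->is_parallel = value;
            return true;
        case PARALLEL_INSERT_CMD_UNDEF:
            break;
    }
    return false;
}
```

A caller would pass the generic pointer plus the command kind, for example mark_parallel(&ctas.pub, PARALLEL_INSERT_CMD_CREATE_TABLE_AS, true) for a DR_intorel named ctas. A generic "is any parallel insert command" predicate could not replace this dispatch, because it would not say which struct to cast to.
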

In the place below, instead of a new function, I think we can just use
something like if (fpes->ins_cmd_type != PARALLEL_INSERT_CMD_UNDEF):

> Or
>
> +       if (fpes->ins_cmd_type == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
> +               pg_atomic_add_fetch_u64(&fpes->processed, queryDesc->estate->es_processed);
>
> If you think the ins_cmd type check in the above code will be extended in the future, a generic function may make sense.

We can also change the check below to fpes->ins_cmd_type != PARALLEL_INSERT_CMD_UNDEF.

+    if (fpes->ins_cmd_type == PARALLEL_INSERT_CMD_CREATE_TABLE_AS)
+        receiver = ExecParallelGetInsReceiver(toc, fpes);
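
As a rough sketch of the "!= PARALLEL_INSERT_CMD_UNDEF" idea, the standalone example below shows how a single sentinel comparison covers every present and future parallel-insert command without a helper function. The struct FixedStateSketch, the enum type name, and report_processed() are hypothetical; the real patch keeps ins_cmd_type and processed in its shared fixed state and updates the counter with pg_atomic_add_fetch_u64, which is replaced here by a plain addition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum type name; UNDEF is the "no parallel insert" sentinel. */
typedef enum ParallelInsertCmdKind
{
    PARALLEL_INSERT_CMD_UNDEF = 0,
    PARALLEL_INSERT_CMD_CREATE_TABLE_AS,
    PARALLEL_INSERT_CMD_REFRESH_MAT_VIEW
} ParallelInsertCmdKind;

/* Hypothetical, trimmed-down stand-in for the shared fixed parallel state. */
typedef struct FixedStateSketch
{
    ParallelInsertCmdKind ins_cmd_type;
    uint64_t    processed;
} FixedStateSketch;

/*
 * Add a worker's tuple count to the shared total, but only when some
 * parallel-insert command is in effect. New command kinds need no change
 * here, because the test is against the sentinel rather than a list of
 * supported commands.
 */
static void
report_processed(FixedStateSketch *fpes, uint64_t es_processed)
{
    assert(fpes != NULL);

    if (fpes->ins_cmd_type != PARALLEL_INSERT_CMD_UNDEF)
        fpes->processed += es_processed;    /* non-atomic in this sketch */
}
```

The trade-off is that the sentinel test only answers "is a parallel insert happening at all"; places that must touch a command-specific structure still need to compare against the exact command.
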

If that is okay, I will modify it in the next version of the patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


