Re: Parallel INSERT (INTO ... SELECT ...)

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Parallel INSERT (INTO ... SELECT ...)
Дата
Msg-id CAA4eK1Lk37E2nrJb9WGyaAjA3FeM+hLM1kE0Jyfb49HeKqoWtw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel INSERT (INTO ... SELECT ...)  (Greg Nancarrow <gregn4422@gmail.com>)
Ответы Re: Parallel INSERT (INTO ... SELECT ...)  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Thu, Oct 15, 2020 at 9:56 AM Greg Nancarrow <gregn4422@gmail.com> wrote:
>
> On Fri, Oct 9, 2020 at 8:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > It might be a good idea to first just get this patch committed, if
> > possible. So, I have reviewed the latest version of this patch:
> >
> > 0001-InsertParallelSelect
>
> I've attached an updated InsertParallelSelect patch (v2) - allowing
> underlying parallel SELECT for "INSERT INTO ... SELECT ...".
> I think I've addressed most of the issues identified in the previous
> version of the patch.
> I'm still seeing a couple of errors in the tests when
> "force_parallel_mode=regress" is in effect, and those need to be
> addressed (extra checks required to avoid parallel SELECT in certain
> cases).
>

I am getting below error in force_parallel_mode:
@@ -1087,9 +1087,14 @@
 ERROR:  value for domain inotnull violates check constraint "inotnull_check"
 create table dom_table (x inotnull);
 insert into dom_table values ('1');
+ERROR:  cannot start commands during a parallel operation
+CONTEXT:  SQL function "sql_is_distinct_from"

It happened with below test:

create function sql_is_distinct_from(anyelement, anyelement)
returns boolean language sql
as 'select $1 is distinct from $2 limit 1';

create domain inotnull int
  check (sql_is_distinct_from(value, null));

create table dom_table (x inotnull);
insert into dom_table values ('1');

So it is clear that this is happening because we have allowed insert
that is parallel-unsafe. The attribute is of type domain which has a
parallel-unsafe constraint. As per your current code, we need to
detect it in IsRelParallelModeSafeForModify. The idea would be to
check the type of each attribute and if it is domain type then we need
to check if it has a constraint (See function ExecGrant_Type on how to
detect a domain type and then refer to functions
AlterTypeNamespaceInternal and AlterConstraintNamespaces to know how
to determine constraint for domain type). Once you can find a
constraint then you already have code in your patch to find if it
contains parallel-unsafe expression.


> Also, I'm seeing a partition-related error when running
> installcheck-world - I'm investigating that.
>

Okay.

Few more comments:
==================
1.
+ /*
+ * Can't support execution of row-level or transition-table triggers
+ * during parallel-mode, since such triggers may query the table
+ * into which the data is being inserted, and the content returned
+ * would vary unpredictably according to the order of retrieval by
+ * the workers and the rows already inserted.
+ */
+ if (trigdesc != NULL &&
+ (trigdesc->trig_insert_instead_row ||
+   trigdesc->trig_insert_before_row ||
+   trigdesc->trig_insert_after_row ||
+   trigdesc->trig_insert_new_table))
+ {
+ return false;
+ }

I don't think it is a good idea to prohibit all before/after/instead
row triggers because for the case you are referring to should mark
trigger functions as parallel-unsafe. We might want to have to Assert
somewhere to detect if there is illegal usage but I don't see the need
to directly prohibit them.

2.
@@ -56,8 +57,8 @@ GetNewTransactionId(bool isSubXact)
  * Workers synchronize transaction state at the beginning of each parallel
  * operation, so we can't account for new XIDs after that point.
  */
- if (IsInParallelMode())
- elog(ERROR, "cannot assign TransactionIds during a parallel operation");
+ if (IsParallelWorker())
+ elog(ERROR, "cannot assign TransactionIds in a parallel worker");

  /*
  * During bootstrap initialization, we return the special bootstrap
diff --git a/src/backend/access/transam/xact.c
b/src/backend/access/transam/xact.c
index af6afce..ef423fb 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -580,8 +580,8 @@ AssignTransactionId(TransactionState s)
  * Workers synchronize transaction state at the beginning of each parallel
  * operation, so we can't account for new XIDs at this point.
  */
- if (IsInParallelMode() || IsParallelWorker())
- elog(ERROR, "cannot assign XIDs during a parallel operation");
+ if (IsParallelWorker())
+ elog(ERROR, "cannot assign XIDs in a parallel worker");


I think we don't need these changes at least for the purpose of this
patch if you follow the suggestion related to having a new function
like PrepareParallelMode in the email above [1]. One problem I see
with removing these checks is how do we ensure that leader won't
assign a new transactionid once we start executing a parallel node. It
can do via sub-transactions maybe that is already protected at some
previous point but I would like to see if we can retain these checks.

[1] - https://www.postgresql.org/message-id/CAA4eK1JogfXUa%3D3wMPO%2BK%3DUiOLgHgCO7-fj1wCHsSxdaXsfVbw%40mail.gmail.com

-- 
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fujii Masao
Дата:
Сообщение: Re: New statistics for tuning WAL buffer size
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: pgsql: Restore replication protocol's duplicate command tags