Re: Fast COPY FROM based on batch insert

Поиск
Список
Период
Сортировка
От Etsuro Fujita
Тема Re: Fast COPY FROM based on batch insert
Дата
Msg-id CAPmGK14ov-MbQtwHuSK5ESUeL1csgCVC1MAAwZxgV=NWsBE2ZQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Fast COPY FROM based on batch insert  ("Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru>)
Ответы Re: Fast COPY FROM based on batch insert  (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Список pgsql-hackers
On Thu, Mar 24, 2022 at 3:43 PM Andrey V. Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 3/22/22 06:54, Etsuro Fujita wrote:
> > * To allow foreign multi insert, the patch made an invasive change to
> > the existing logic to determine whether to use multi insert for the
> > target relation, adding a new member ri_usesMultiInsert to the
> > ResultRelInfo struct, as well as introducing a new function
> > ExecMultiInsertAllowed().  But I’m not sure we really need such a
> > change.  Isn’t it reasonable to *adjust* the existing logic to allow
> > foreign multi insert when possible?
> Of course, such approach would look much better, if we implemented it.

> I'll ponder how to do it.

I rewrote the decision logic to something much simpler and much less
invasive, which reduces the patch size significantly.  Attached is an
updated patch.  What do you think about that?

While working on the patch, I fixed a few issues as well:

+       if (resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize != NULL)
+           resultRelInfo->ri_BatchSize =
+
resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo);

When determining the batch size, I think we should check if the
ExecForeignBatchInsert callback routine is also defined, like other
places such as execPartition.c.  For consistency I fixed this by
copying-and-pasting the code from that file.

+    * Also, the COPY command requires a non-zero input list of attributes.
+    * Therefore, the length of the attribute list is checked here.
+    */
+   if (!cstate->volatile_defexprs &&
+       list_length(cstate->attnumlist) > 0 &&
+       !contain_volatile_functions(cstate->whereClause))
+       target_resultRelInfo->ri_usesMultiInsert =
+                   ExecMultiInsertAllowed(target_resultRelInfo);

I think “list_length(cstate->attnumlist) > 0” in the if-test would
break COPY FROM; it currently supports multi-inserting into *plain*
tables even in the case where they have no columns, but this would
disable the multi-insertion support in that case.  postgres_fdw would
not be able to batch into zero-column foreign tables due to the INSERT
syntax limitation (i.e., the syntax does not allow inserting multiple
empty rows into a zero-column table in a single INSERT statement).
Which is the reason why this was added to the if-test?  But I think
some other FDWs might be able to, so I think we should let the FDW
decide whether to allow batching even in that case, when called from
GetForeignModifyBatchSize.  So I removed the attnumlist test from the
patch, and modified postgresGetForeignModifyBatchSize as such.  I
might miss something, though.

Best regards,
Etsuro Fujita

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: John Naylor
Дата:
Сообщение: Re: Proposal to introduce a shuffle function to intarray extension
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher