Re: Fast COPY FROM based on batch insert

Поиск
Список
Период
Сортировка
От Andrey Lepikhov
Тема Re: Fast COPY FROM based on batch insert
Дата
Msg-id 57733596-5fd6-07ec-d65c-67411411f10e@postgrespro.ru
обсуждение исходный текст
Ответ на Re: Fast COPY FROM based on batch insert  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Ответы Re: Fast COPY FROM based on batch insert  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Список pgsql-hackers
On 18/7/2022 13:22, Etsuro Fujita wrote:
> On Thu, Mar 24, 2022 at 3:43 PM Andrey V. Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
>> On 3/22/22 06:54, Etsuro Fujita wrote:
>>> * To allow foreign multi insert, the patch made an invasive change to
>>> the existing logic to determine whether to use multi insert for the
>>> target relation, adding a new member ri_usesMultiInsert to the
>>> ResultRelInfo struct, as well as introducing a new function
>>> ExecMultiInsertAllowed().  But I’m not sure we really need such a
>>> change.  Isn’t it reasonable to *adjust* the existing logic to allow
>>> foreign multi insert when possible?
>> Of course, such approach would look much better, if we implemented it.
> 
>> I'll ponder how to do it.
> 
> I rewrote the decision logic to something much simpler and much less
> invasive, which reduces the patch size significantly.  Attached is an
> updated patch.  What do you think about that?
> 
> While working on the patch, I fixed a few issues as well:
> 
> +       if (resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize != NULL)
> +           resultRelInfo->ri_BatchSize =
> +
> resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo);
> 
> When determining the batch size, I think we should check if the
> ExecForeignBatchInsert callback routine is also defined, like other
> places such as execPartition.c.  For consistency I fixed this by
> copying-and-pasting the code from that file.
> 
> +    * Also, the COPY command requires a non-zero input list of attributes.
> +    * Therefore, the length of the attribute list is checked here.
> +    */
> +   if (!cstate->volatile_defexprs &&
> +       list_length(cstate->attnumlist) > 0 &&
> +       !contain_volatile_functions(cstate->whereClause))
> +       target_resultRelInfo->ri_usesMultiInsert =
> +                   ExecMultiInsertAllowed(target_resultRelInfo);
> 
> I think “list_length(cstate->attnumlist) > 0” in the if-test would
> break COPY FROM; it currently supports multi-inserting into *plain*
> tables even in the case where they have no columns, but this would
> disable the multi-insertion support in that case.  postgres_fdw would
> not be able to batch into zero-column foreign tables due to the INSERT
> syntax limitation (i.e., the syntax does not allow inserting multiple
> empty rows into a zero-column table in a single INSERT statement).
> Which is the reason why this was added to the if-test?  But I think
> some other FDWs might be able to, so I think we should let the FDW
> decide whether to allow batching even in that case, when called from
> GetForeignModifyBatchSize.  So I removed the attnumlist test from the
> patch, and modified postgresGetForeignModifyBatchSize as such.  I
> might miss something, though.
Thanks a lot,
maybe you forgot this code:
/*
  * If a partition's root parent isn't allowed to use it, neither is the
  * partition.
*/
if (rootResultRelInfo->ri_usesMultiInsert)
    leaf_part_rri->ri_usesMultiInsert =
                ExecMultiInsertAllowed(leaf_part_rri);

Also, maybe to describe in documentation, if the value of batch_size is 
more than 1, the ExecForeignBatchInsert routine have a chance to be called?

-- 
regards,
Andrey Lepikhov
Postgres Professional



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Use "WAL segment" instead of "log segment" consistently in user-facing messages
Следующее
От: Aleksander Alekseev
Дата:
Сообщение: Re: [PATCH] Introduce array_shuffle() and array_sample()