Re: [POC] Fast COPY FROM command for the table with foreign partitions

Поиск
Список
Период
Сортировка
От Amit Langote
Тема Re: [POC] Fast COPY FROM command for the table with foreign partitions
Дата
Msg-id CA+HiwqHRqG8X9W6-v-E6d0nw1k6aC7QSMk-SOvv4puOjWsQMmA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [POC] Fast COPY FROM command for the table with foreign partitions  ("Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru>)
Ответы Re: [POC] Fast COPY FROM command for the table with foreign partitions  ("Andrey V. Lepikhov" <a.lepikhov@postgrespro.ru>)
Список pgsql-hackers
Hi Andrey,

Thanks for updating the patch.  I will try to take a look later.

On Wed, Jul 22, 2020 at 6:09 PM Andrey V. Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> On 7/16/20 2:14 PM, Amit Langote wrote:
> > * Why the "In" in these API names?
> >
> > +   /* COPY a bulk of tuples into a foreign relation */
> > +   BeginForeignCopyIn_function BeginForeignCopyIn;
> > +   EndForeignCopyIn_function EndForeignCopyIn;
> > +   ExecForeignCopyIn_function ExecForeignCopyIn;
>
> I used an analogy from copy.c.

Hmm, if we were going to also need *ForeignCopyOut APIs, maybe it
makes sense to have "In" here, but maybe we don't, so how about
leaving out the "In" for clarity?

> > * I see that the remote copy is performed from scratch on every call
> > of postgresExecForeignCopyIn(), but wouldn't it be more efficient to
> > send the `COPY remote_table FROM STDIN` in
> > postgresBeginForeignCopyIn() and end it in postgresEndForeignCopyIn()
> > when there are no errors during the copy?
>
> It is not possible. FDW share one connection between all foreign
> relations from a server. If two or more partitions will be placed at one
> foreign server you will have problems with concurrent COPY command.

Ah, you're right.  I didn't consider multiple foreign partitions
pointing to the same server.  Indeed, we would need separate
connections to a given server to COPY to multiple remote relations on
that server in parallel.

> May be we can create new connection for each partition?

Yeah, perhaps, although it sounds like something that might be more
generally useful and so we should work on that separately if at all.

> > I tried implementing these two changes -- pgfdw_copy_data_dest_cb()
> > and sending `COPY remote_table FROM STDIN` only once instead of on
> > every flush -- and I see significant speedup.  Please check the
> > attached patch that applies on top of yours.
>
> I integrated first change and rejected the second by the reason as above.

Thanks.

Will send more comments after reading the v5 patch.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Konstantin Knizhnik
Дата:
Сообщение: Re: [Patch] Optimize dropping of relation buffers using dlist
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?