Re: Parallel copy

From Dilip Kumar
Subject Re: Parallel copy
Date
Msg-id CAFiTN-sSN6ZM+2LKo5imaxhosPu461u9v9ZcTTq1AiLqRvrWTw@mail.gmail.com
In reply to Re: Parallel copy  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, May 14, 2020 at 11:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 14, 2020 at 12:39 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Tue, May 12, 2020 at 1:01 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > I don't understand why we need to do something special for combo CIDs
> > > if they are not generated during this operation?
> >
> > Hmm. Well I guess if they're not being generated then we don't need to
> > do anything about them, but I still think we should try to work around
> > having to disable parallelism for a table which is referenced by
> > foreign keys.
> >
>
> Okay, just to be clear, we want to allow parallelism for a table that
> has foreign keys.  Basically, a parallel copy should work while
> loading data into tables having FK references.
>
> To support that, we need to consider a few things.
> a. Currently, we increment the command counter each time we take a key
> share lock on a tuple during trigger execution.  I am really not sure
> if this is required during Copy command execution or we can just
> increment it once for the copy.   If we need to increment the command
> counter just once for copy command then for the parallel copy we can
> ensure that we do it just once at the end of the parallel copy but if
> not then we might need some special handling.
>
> b.  Another point is that after inserting rows we record CTIDs of the
> tuples in the event queue and then once all tuples are processed we
> call FK trigger for each CTID.  Now, with parallelism, the FK checks
> will be processed once the worker processed one chunk.  I don't see
> any problem with it but still, this will be a bit different from what
> we do in serial case.  Do you see any problem with this?

IMHO, it should not be a problem, because even without parallelism we
trigger the foreign key check when we detect EOF and end of data from
STDIN.  With parallel workers too, a worker can assume that it has
completed all of its work and go for the foreign key check only after
the leader receives EOF and end of data from STDIN.

The only difference is that each worker does not wait for all the
data (from all workers) to get inserted before checking the
constraint.  Moreover, we are not supporting external triggers with
the parallel copy; otherwise, we might have to worry that those
triggers could do something on the primary table before we check the
constraint.  I am not sure if there are any other factors that I am
missing.
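
To make the comparison concrete, here is a rough toy model in Python, not
PostgreSQL code.  Every name in it (fk_violations, serial_copy,
parallel_copy, chunk_size) is hypothetical and made up for illustration.
It assumes the set of keys in the referenced table does not change while
the copy runs, which is the case I am relying on when no external
triggers are allowed.  Under that assumption, checking each worker's
chunk on its own reports the same violating rows as checking everything
once at the end.

```python
# Toy model only: none of these names exist in PostgreSQL.
from typing import Iterable, List, Set, Tuple

Row = Tuple[int, int]  # (row id, value of the foreign key column)


def fk_violations(queued: Iterable[Row], referenced_keys: Set[int]) -> List[int]:
    """Fire the FK check for every queued row; return ids of rows that fail."""
    return [row_id for row_id, fk in queued if fk not in referenced_keys]


def serial_copy(rows: List[Row], referenced_keys: Set[int]) -> List[int]:
    """Serial case: queue every inserted row, check once after end of input."""
    event_queue = list(rows)  # stands in for the CTIDs kept in the event queue
    return fk_violations(event_queue, referenced_keys)


def parallel_copy(rows: List[Row], referenced_keys: Set[int],
                  chunk_size: int) -> List[int]:
    """Parallel case: each chunk is checked without waiting for other chunks."""
    violations: List[int] = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]  # one worker's share of the input
        violations.extend(fk_violations(chunk, referenced_keys))
    return violations


# Assumed fixed for the whole copy: keys present in the referenced table.
referenced = {1, 2, 3}
rows = [(10, 1), (11, 4), (12, 2), (13, 3), (14, 9)]

serial_result = serial_copy(rows, referenced)
parallel_result = parallel_copy(rows, referenced, chunk_size=2)
```

The model runs the chunks one after another, so it says nothing about
command counter handling or about concurrent changes to the referenced
table; it only shows that the per-chunk timing alone does not change
which rows fail the check.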

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


