Re: Perform streaming logical transactions by background workers and parallel apply

From Amit Kapila
Subject Re: Perform streaming logical transactions by background workers and parallel apply
Date
Msg-id CAA4eK1+4qAc8vbma3obtzrzJOTx_w-DJvBzxY9JuUG_uCP9OiQ@mail.gmail.com
In response to Re: Perform streaming logical transactions by background workers and parallel apply  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Perform streaming logical transactions by background workers and parallel apply  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
On Mon, May 2, 2022 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Apr 8, 2022 at 6:14 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Wednesday, April 6, 2022 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > In this email, I would like to discuss allowing streaming logical
> > > transactions (large in-progress transactions) by background workers
> > > and parallel apply in general. The goal of this work is to improve the
> > > performance of the apply work in logical replication.
> > >
> > > Currently, for large transactions, the publisher sends the data in
> > > multiple streams (changes divided into chunks depending upon
> > > logical_decoding_work_mem), and then on the subscriber-side, the apply
> > > worker writes the changes into temporary files and once it receives
> > > the commit, it reads from the file and applies the entire transaction. To
> > > improve the performance of such transactions, we can instead allow
> > > them to be applied via background workers. There could be multiple
> > > ways to achieve this:
> > >
> > > Approach-1: Assign a new bgworker (if available) as soon as the xact's
> > > first stream arrives, and have the main apply worker send changes to
> > > this new worker via shared memory. We keep this worker assigned until
> > > the transaction's commit arrives and also wait for the worker to finish
> > > at commit. This preserves commit ordering and avoids writing to and
> > > reading from a file in most cases. We still need to spill if there is
> > > no worker available. We also need to wait at stream_stop for the
> > > background worker to finish applying the current stream, to avoid
> > > deadlocks, because T-1's current stream of changes can update rows in
> > > conflicting order with T-2's next stream of changes.
> > >
> >
> > Attached is the POC patch for Approach-1 of "Perform streaming logical
> > transactions by background workers". The patch is still a WIP patch as
> > there are several TODO items left, including:
> >
> > * error handling for bgworker
> > * support for SKIP the transaction in bgworker
> > * handle the case when there is no more worker available
> >   (might need to spill the data to a temp file in this case)
> > * some potential bugs
>
> Are you planning to support the "Transaction dependency" that Amit
> mentioned in his first mail in this patch? IIUC, since the background
> apply worker applies the streamed changes as soon as it receives them
> from the main apply worker, a conflict that doesn't happen in the
> current streaming logical replication could happen.
>

This patch seems to be waiting for stream_stop to finish, so I don't
see how the issues related to "Transaction dependency" can arise. What
type of conflicts/issues do you have in mind?
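
To make the stream_stop point concrete, here is a rough toy model in
Python. It is only a sketch and not code from the patch: the class
BackgroundApplyWorker, the function main_apply_loop, and the in-process
queue standing in for the shared-memory queue are all hypothetical. It
shows the main apply worker handing each streamed chunk to the
transaction's background worker and blocking at stream_stop until that
chunk is applied, so chunks of different transactions never run
concurrently.

```python
import queue
import threading


class BackgroundApplyWorker:
    """Hypothetical stand-in for a bgworker assigned to one streamed xact."""

    def __init__(self, xid, applied_log):
        self.xid = xid
        self.applied_log = applied_log
        self.inbox = queue.Queue()  # stands in for the shared-memory queue
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        while True:
            change = self.inbox.get()
            if change is None:  # shutdown marker sent at commit
                self.inbox.task_done()
                return
            self.applied_log.append((self.xid, change))  # "apply" the change
            self.inbox.task_done()

    def send(self, change):
        self.inbox.put(change)

    def wait_for_stream_stop(self):
        # Block until every change sent so far has been applied.
        self.inbox.join()

    def commit(self):
        self.inbox.join()
        self.inbox.put(None)
        self.thread.join()
        self.applied_log.append((self.xid, "COMMIT"))


def main_apply_loop(messages):
    """messages: (xid, kind, changes) tuples, kind is 'stream' or 'commit'."""
    applied_log = []
    workers = {}
    for xid, kind, changes in messages:
        if kind == "stream":
            if xid not in workers:  # first stream of this xact: assign a worker
                workers[xid] = BackgroundApplyWorker(xid, applied_log)
            for change in changes:
                workers[xid].send(change)
            # stream_stop: do not start the next stream until this one is done.
            workers[xid].wait_for_stream_stop()
        elif kind == "commit":
            # Waiting here keeps commits in the order they were received.
            workers.pop(xid).commit()
    return applied_log


EXAMPLE_MESSAGES = [
    (1, "stream", ["a1", "a2"]),
    (2, "stream", ["b1"]),
    (1, "stream", ["a3"]),
    (2, "commit", []),
    (1, "commit", []),
]
```

Calling main_apply_loop(EXAMPLE_MESSAGES) in this model applies the
changes in exactly the order the streams were received, which is why I
don't see where an extra dependency conflict would come from.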


-- 
With Regards,
Amit Kapila.


