Re: Single transaction in the tablesync worker?

From Amit Kapila
Subject Re: Single transaction in the tablesync worker?
Date
Msg-id CAA4eK1JYoxoa=GdQ73G1Ohz+b2jfAECFFAHtvkj-+qPJKUsNgA@mail.gmail.com
In reply to Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Single transaction in the tablesync worker?  (Peter Smith <smithpb2250@gmail.com>)
List pgsql-hackers
On Mon, Dec 7, 2020 at 9:21 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Dec 7, 2020 at 6:20 AM Craig Ringer
> <craig.ringer@enterprisedb.com> wrote:
> >
>
> >>
> >> I am not sure why, but it seems acceptable to the original authors that the
> >> data of transactions is partially visible during the initial
> >> synchronization phase for a subscription.
> >
> >
> > I don't think there's much alternative there.
> >
>
> I am not sure about this. I think it is primarily to allow some more
> parallelism among apply and sync workers. One primitive way to achieve
> parallelism without this problem is to have the apply worker
> wait till all the tablesync workers are in DONE state.
>

As the apply worker's slot is created before those of all the tablesync
workers, it should never miss any LSN that the tablesync workers would
have processed. Also, the tablesync workers should not process any
xact if the apply worker has not processed anything. I think tablesync
currently always processes one transaction (because we call
process_sync_tables at commit of a txn) even if that is not required
to be in sync with the apply worker. This approach should solve both
problems: (a) the visibility of partial transactions, and (b) allowing
prepared transactions, because the tablesync worker no longer needs to
combine data from multiple transactions.
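
To make the idea concrete, here is a toy model of the gating described above. It is only a sketch and not PostgreSQL code: the names TableSyncWorker, copy_initial_data and apply_pending, and the two-state INIT/DONE lifecycle, are assumptions made up for illustration. It shows the apply worker holding back all decoded transactions until every tablesync worker has finished its initial copy, and then applying each transaction as a whole.

```python
from dataclasses import dataclass


@dataclass
class TableSyncWorker:
    table: str
    state: str = "INIT"  # hypothetical states: INIT -> DONE

    def copy_initial_data(self) -> None:
        # Initial table copy only; under the proposal the tablesync
        # worker applies no decoded transactions of its own.
        self.state = "DONE"


def apply_pending(txns, workers, applied):
    """Apply queued transactions only once every tablesync worker is DONE.

    Returns True if the transactions were applied, False if the apply
    worker still has to wait.
    """
    if any(w.state != "DONE" for w in workers):
        return False
    for txn in txns:
        # Each transaction is applied whole, so no partial
        # transaction becomes visible on the subscriber.
        applied.append(txn)
    return True
```

In this model a transaction touching two tables stays queued while either table is still being copied, which is the behaviour that would avoid partially visible transactions. It deliberately ignores how the copy snapshot and the apply worker's start LSN would be reconciled.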

I think the other advantage of this would be that it reduces the
load (both CPU and I/O) on the publisher side by decoding the data
only once, instead of once for each tablesync worker and once more
for the apply worker. I think it will use fewer resources
to finish the work.

Is there any flaw in this idea that I am missing?

-- 
With Regards,
Amit Kapila.


