Re: Single transaction in the tablesync worker?

From: Craig Ringer
Subject: Re: Single transaction in the tablesync worker?
Date:
Msg-id: CAGRY4nxnZhOm_QwUHdUsJKq80QwwncdtGR7EEYs1mUm-L8+MtQ@mail.gmail.com
In reply to: Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
Responses: Re: Single transaction in the tablesync worker?  (Peter Smith <smithpb2250@gmail.com>)
           Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
List: pgsql-hackers


On Sat, 5 Dec 2020, 10:03 Amit Kapila, <amit.kapila16@gmail.com> wrote:
On Fri, Dec 4, 2020 at 7:12 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Thu, Dec 3, 2020 at 7:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Dec 3, 2020 at 7:04 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Thu, Dec 3, 2020 at 2:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > The tablesync worker in logical replication performs the table data
> > > > sync in a single transaction which means it will copy the initial data
> > > > and then catch up with apply worker in the same transaction. There is
> > > > a comment in LogicalRepSyncTableStart ("We want to do the table data
> > > > sync in a single transaction.") saying so but I can't find the
> > > > concrete theory behind the same. Is there any fundamental problem if
> > > > we commit the transaction after initial copy and slot creation in
> > > > LogicalRepSyncTableStart and then allow the apply of transactions as
> > > > it happens in apply worker? I have tried doing so in the attached (a
> > > > quick prototype to test) and didn't find any problems with regression
> > > > tests. I have tried a few manual tests as well to see if it works and
> > > > didn't find any problem. Now, it is quite possible that it is
> > > > mandatory to do the way we are doing currently, or maybe something
> > > > else is required to remove this requirement but I think we can do
> > > > better with respect to comments in this area.
> > >
> > > If we commit the initial copy, the data up to the initial copy's
> > > snapshot will be visible downstream. If we apply the changes by
> > > committing changes per transaction, the data visible to the other
> > > transactions will differ as the apply progresses.
> > >
> >
> > It is not clear what you mean by the above.  The way you have written
> > appears that you are saying that instead of copying the initial data,
> > I am saying to copy it transaction-by-transaction. But that is not the
> > case. I am saying copy the initial data by using REPEATABLE READ
> > isolation level as we are doing now, commit it and then process
> > transaction-by-transaction till we reach sync-point (point till where
> > apply worker has already received the data).
>
> Craig in his mail has clarified this. The changes after the initial
> COPY will be visible before the table sync catches up.
>

I think the problem is not that the changes are visible after COPY
rather it is that we don't have a mechanism to restart if it crashes
after COPY unless we do all the sync up in one transaction. Assume we
commit after COPY and then process transaction-by-transaction and it
errors out (due to connection loss) or crashes, in-between one of the
following transactions after COPY then after the restart we won't know
from where to start for that relation. This is because the catalog
(pg_subscription_rel) will show the state as 'd' (data is being
copied) and the slot would have gone as it was a temporary slot. But
as mentioned in one of my emails above [1] we can solve these problems
which Craig also seems to be advocating for as there are many
advantages of not doing the entire sync (initial copy + stream changes
for that relation) in one single transaction. It will allow us to
support decode of prepared xacts in the subscriber. Also, it seems
pglogical already does processing transaction-by-transaction after the
initial copy. The only thing which is not clear to me is why we
haven't decided to go ahead initially and it would be probably better
if the original authors would also chime-in to at least clarify the
same.
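
As an aside for anyone tracing this: the per-relation state referred to
above is the srsubstate column of pg_subscription_rel, and it can be
inspected with a purely illustrative query like the one below (the
subscription name "sub1" is invented):

    SELECT s.subname, c.relname, r.srsubstate, r.srsublsn
      FROM pg_subscription_rel r
      JOIN pg_subscription s ON s.oid = r.srsubid
      JOIN pg_class c ON c.oid = r.srrelid
     WHERE s.subname = 'sub1';
    -- srsubstate: 'i' = init, 'd' = data being copied,
    --             's' = synchronized, 'r' = ready

After a crash mid-copy the relation's row is still in 'd', and with a
temporary slot there is nothing left to resume from.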

It's partly a resource management issue.

Replication origins are a limited resource. We need to use a replication origin for any sync we want to be durable across restarts.

Then again, so are slots, and we use temp slots for each sync.

If a sync fails, cleanup on the upstream side is simple with a temp slot. With persistent slots we have more risk of creating upstream issues. But then, so long as the subscriber exists, it can deal with that. And if the subscriber no longer exists, its primary slot is an issue too.
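
To make that concrete, a small illustrative snippet (slot name
invented); both slots and origins are capped by max_replication_slots
on their respective sides:

    -- A temporary slot disappears with the session that created it;
    -- a persistent one has to be dropped explicitly.
    SELECT pg_create_logical_replication_slot('sync_tmp', 'pgoutput',
                                               temporary := true);
    SELECT slot_name, temporary, active, restart_lsn
      FROM pg_replication_slots;

    -- Replication origins registered on the subscriber.
    SELECT * FROM pg_replication_origin;
    SHOW max_replication_slots;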

It'd help if we could register pg_shdepend entries between catalog entries and slots, and from a main subscription slot to any extra slots used for resynchronization.

And I should write a patch for a resource retention summarisation view.
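
Roughly along these lines; a sketch only, not a patch, with the view
and column names invented, gathering the main holders of WAL and xmin
into one place:

    CREATE VIEW resource_retention AS
    SELECT 'replication slot' AS kind, slot_name::text AS name,
           restart_lsn AS retained_lsn, catalog_xmin AS retained_xmin
      FROM pg_replication_slots
    UNION ALL
    SELECT 'walsender', application_name, flush_lsn, backend_xmin
      FROM pg_stat_replication
    UNION ALL
    SELECT 'prepared xact', gid, NULL::pg_lsn, transaction
      FROM pg_prepared_xacts;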


I am not sure why but it seems acceptable to original authors that the
data of transactions are visible partially during the initial
synchronization phase for a subscription.

I don't think there's much alternative there.

Pg would need some kind of cross-commit visibility control mechanism that separates durable commit from visibility.
