Re: Single transaction in the tablesync worker?

From Ashutosh Bapat
Subject Re: Single transaction in the tablesync worker?
Msg-id CAExHW5uXKDVH9Y1p35PmOs6y-WK-xU82Enr-96OPxnVUkBOhDA@mail.gmail.com
In response to Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Single transaction in the tablesync worker?  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Dec 3, 2020 at 2:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> The tablesync worker in logical replication performs the table data
> sync in a single transaction which means it will copy the initial data
> and then catch up with apply worker in the same transaction. There is
> a comment in LogicalRepSyncTableStart ("We want to do the table data
> sync in a single transaction.") saying so but I can't find the
> concrete theory behind the same. Is there any fundamental problem if
> we commit the transaction after initial copy and slot creation in
> LogicalRepSyncTableStart and then allow the apply of transactions as
> it happens in apply worker? I have tried doing so in the attached (a
> quick prototype to test) and didn't find any problems with regression
> tests. I have tried a few manual tests as well to see if it works and
> didn't find any problem. Now, it is quite possible that it is
> mandatory to do the way we are doing currently, or maybe something
> else is required to remove this requirement but I think we can do
> better with respect to comments in this area.

If we commit the initial copy, the data up to the initial copy's
snapshot will be visible downstream. If we then apply the remaining
changes by committing them per transaction, the data visible to other
transactions will change as the apply progresses. You haven't
clarified whether we will respect the transaction boundaries in the
apply log or not; I assume we will. Whereas if we apply all the
changes in one go, other transactions see the data either as it was
before the resync or as it is after it, without any intermediate
states. That will not violate consistency, I think.
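
To make the visibility difference concrete, here is a toy model in
Python. It is only a sketch, not PostgreSQL code: the function name
visible_states, the dict-based "table", and the sample data are all
hypothetical. It lists the table states a concurrent reader could
observe under each commit strategy.

```python
def visible_states(initial, copied, changes, single_txn):
    """Return the table states a concurrent reader could observe.

    initial    -- dict: table contents before the resync (hypothetical model)
    copied     -- dict: rows brought in by the initial copy
    changes    -- list of dicts, one per upstream transaction to apply
    single_txn -- True: copy + catch-up commit once; False: commit per step
    """
    states = [dict(initial)]           # what readers see before the resync
    working = dict(initial)

    working.update(copied)             # initial copy
    if not single_txn:
        states.append(dict(working))   # commit right after the copy

    for txn in changes:                # catch-up, respecting txn boundaries
        working.update(txn)
        if not single_txn:
            states.append(dict(working))  # each commit exposes a new state

    if single_txn:
        states.append(dict(working))   # one commit at the very end

    return states


initial = {"a": 1}
copied = {"a": 1, "b": 2}
changes = [{"c": 3}, {"b": 20}]

one_go = visible_states(initial, copied, changes, single_txn=True)
per_txn = visible_states(initial, copied, changes, single_txn=False)

print(len(one_go))    # 2: before and after only
print(len(per_txn))   # 4: before, after copy, after each applied txn
```

Both strategies end in the same final state; they differ only in the
intermediate states that become visible along the way.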

That's all I can think of as the reason behind doing a whole resync as
a single transaction.

-- 
Best Wishes,
Ashutosh Bapat


