Re: Perform streaming logical transactions by background workers and parallel apply

Поиск
Список
Период
Сортировка
От Peter Smith
Тема Re: Perform streaming logical transactions by background workers and parallel apply
Дата
Msg-id CAHut+PvPwTQ_GpVMw3JgZzhWcwTrAiT-_5M-XzEO7GUnHH8hGA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Perform streaming logical transactions by background workers and parallel apply  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Perform streaming logical transactions by background workers and parallel apply  (Peter Smith <smithpb2250@gmail.com>)
Список pgsql-hackers
On Tue, May 3, 2022 at 2:15 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, May 2, 2022 at 5:06 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, May 2, 2022 at 6:09 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Mon, May 2, 2022 at 11:47 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > >
> > > > Are you planning to support "Transaction dependency" Amit mentioned in
> > > > his first mail in this patch? IIUC since the background apply worker
> > > > applies the streamed changes as soon as receiving them from the main
> > > > apply worker, a conflict that doesn't happen in the current streaming
> > > > logical replication could happen.
> > > >
> > >
> > > This patch seems to be waiting for stream_stop to finish, so I don't
> > > see how the issues related to "Transaction dependency" can arise? What
> > > type of conflict/issues you have in mind?
> >
> > Suppose we set both publisher and subscriber:
> >
> > On publisher:
> > create table test (i int);
> > insert into test values (0);
> > create publication test_pub for table test;
> >
> > On subscriber:
> > create table test (i int primary key);
> > create subscription test_sub connection '...' publication test_pub; --
> > value 0 is replicated via initial sync
> >
> > Now, both 'test' tables have value 0.
> >
> > And suppose two concurrent transactions are executed on the publisher
> > in following order:
> >
> > TX-1:
> > begin;
> > insert into test select generate_series(0, 10000); -- changes will be streamed;
> >
> >     TX-2:
> >     begin;
> >     delete from test where c = 0;
> >     commit;
> >
> > TX-1:
> > commit;
> >
> > With the current streaming logical replication, these changes will be
> > applied successfully since the deletion is applied before the
> > (streamed) insertion. Whereas with the apply bgworker, it fails due to
> > an unique constraint violation since the insertion is applied first.
> > I've confirmed that it happens with v5 patch.
> >
>
> Good point but I am not completely sure if doing transaction
> dependency tracking for such cases is really worth it. I feel for such
> concurrent cases users can anyway now also get conflicts, it is just a
> matter of timing. One more thing to check transaction dependency, we
> might need to spill the data for streaming transactions in which case
> we might lose all the benefits of doing it via a background worker. Do
> we see any simple way to avoid this?
>

Avoiding unexpected differences like this is why I suggested the
option should have to be explicitly enabled instead of being on by
default as it is in the current patch. See my review comment #14 [1].
It means the user won't have to change their existing code as a
workaround.

------
[1] https://www.postgresql.org/message-id/CAHut%2BPuqYP5eD5wcSCtk%3Da6KuMjat2UCzqyGoE7sieCaBsVskQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amul Sul
Дата:
Сообщение: Re: Proposal for internal Numeric to Uint64 conversion function.
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: failures in t/031_recovery_conflict.pl on CI