Re: Skipping logical replication transactions on subscriber side

From Masahiko Sawada
Subject Re: Skipping logical replication transactions on subscriber side
Msg-id CAD21AoDHMLiktd=x9eeEN3-kXpP8Dbz_CHOz1PYy8Mmxw8edZQ@mail.gmail.com
In reply to Re: Skipping logical replication transactions on subscriber side  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jan 11, 2022 at 3:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jan 11, 2022 at 8:52 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Jan 10, 2022 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > I was thinking what if we don't advance origin explicitly in this
> > > > > case? Actually, that will be no different than the transactions where
> > > > > the apply worker doesn't apply any change because the initial sync is
> > > > > in progress (see should_apply_changes_for_rel()) or we have received
> > > > > an empty transaction. In those cases also, the origin lsn won't be
> > > > > advanced even though we acknowledge the advanced last_received
> > > > > location because of keep_alive messages. Now, it is possible after the
> > > > > restart we send the old start_lsn location because the replication
> > > > > origin was not updated before restart but we handle that case in the
> > > > > server by starting from the last confirmed location. See below code:
> > > > >
> > > > > CreateDecodingContext()
> > > > > {
> > > > > ..
> > > > > else if (start_lsn < slot->data.confirmed_flush)
> > > > > ..
> > > >
> > > > Good point. Probably one minor thing that is different from the
> > > > transaction where the apply worker applied an empty transaction is a
> > > > case where the server restarts/crashes before sending an
> > > > acknowledgment of the flush location. That is, in the case of the
> > > > empty transaction, the publisher sends an empty transaction again. On
> > > > the other hand in the case of skipping the transaction, a non-empty
> > > > transaction will be sent again but skip_xid is already changed or
> > > > cleared, therefore the user will have to specify skip_xid again. If we
> > > > write replication origin WAL record to advance the origin lsn, it
> > > > reduces the possibility of that. But I think it’s a very minor case so
> > > > we won’t need to deal with that.
> > > >
> > >
> > > Yeah, in the worst case, it will lead to conflict again and the user
> > > needs to set the xid again.
> >
> > On second thought, the same is true for other cases, for example,
> > preparing the transaction and clearing skip_xid while handling a
> > prepare message. That is, currently we don't clear skip_xid while
> > handling a prepare message but do that while handling commit/rollback
> > prepared message, in order to avoid the worst case. If we do both
> > while handling a prepare message and the server crashes between them,
> > it ends up that skip_xid is cleared and the transaction will be
> > resent, which is identical to the worst-case above.
> >
>
> How are you thinking to update the skip xid before prepare? If we do
> it in the same transaction then the changes in the catalog will be
> part of the prepared xact but won't be committed. Now, say if we do it
> after prepare, then the situation won't be the same because after
> restart the same xact won't appear again.

I was thinking of committing the catalog change first in a separate
transaction, without updating the origin LSN, and then preparing an
empty transaction that does update the origin LSN. If the server
crashes between the two, skip_xid is cleared but the transaction will
be resent.
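
To make the crash window concrete, here is a toy Python model of the
two-step sequence. It is only a sketch and not PostgreSQL code: the
names Subscriber, skip_prepared_xact and will_be_resent, and the xid
and LSN values, are made up for illustration. It assumes the publisher
resends any transaction whose end LSN is beyond the recorded origin
LSN.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subscriber:
    """Hypothetical model of the durable subscriber-side state."""
    skip_xid: Optional[int]  # xid the user asked to skip, or None
    origin_lsn: int          # last LSN recorded in the replication origin


def skip_prepared_xact(sub: Subscriber, end_lsn: int,
                       crash_between_steps: bool = False) -> None:
    """Skip a prepared transaction in two separately durable steps."""
    # Step 1: commit the catalog change on its own.
    # The origin LSN is deliberately left untouched here.
    sub.skip_xid = None
    if crash_between_steps:
        return  # the server goes down before step 2 happens
    # Step 2: prepare an empty transaction and advance the origin LSN.
    sub.origin_lsn = end_lsn


def will_be_resent(sub: Subscriber, end_lsn: int) -> bool:
    """Assumption: anything beyond the recorded origin LSN is sent again."""
    return sub.origin_lsn < end_lsn


# Illustrative values only.
ok = Subscriber(skip_xid=740, origin_lsn=100)
skip_prepared_xact(ok, end_lsn=200)

crashed = Subscriber(skip_xid=740, origin_lsn=100)
skip_prepared_xact(crashed, end_lsn=200, crash_between_steps=True)

print(will_be_resent(ok, 200), ok.skip_xid)
print(will_be_resent(crashed, 200), crashed.skip_xid)
```

In the crashed case the transaction comes back while skip_xid is
already gone, so the user would have to set it again, which is the
same worst case as above.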

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


