Re: Skipping logical replication transactions on subscriber side

From Masahiko Sawada
Subject Re: Skipping logical replication transactions on subscriber side
Date
Msg-id CAD21AoBbdbuOcNX=kp-ggRNZVQPbUfCC+CsMaRX5J2msuPhyrg@mail.gmail.com
In response to Re: Skipping logical replication transactions on subscriber side  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Skipping logical replication transactions on subscriber side  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Tue, Jun 1, 2021 at 2:28 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Jun 1, 2021 at 10:07 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jun 1, 2021 at 1:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Jun 1, 2021 at 12:55 AM Peter Eisentraut
> > > <peter.eisentraut@enterprisedb.com> wrote:
> > > >
> > > > On 27.05.21 12:04, Amit Kapila wrote:
> > > > >>> Also, I am thinking that instead of a stat view, do we need
> > > > >>> to consider having a system table (pg_replication_conflicts or
> > > > >>> something like that) for this because what if stats information is
> > > > >>> lost (say either due to crash or due to udp packet loss), can we rely
> > > > >>> on stats view for this?
> > > > >> Yeah, it seems better to use a catalog.
> > > > >>
> > > > > Okay.
> > > >
> > > > Could you store it shared memory?  You don't need it to be crash safe,
> > > > since the subscription will just run into the same error again after
> > > > restart.  You just don't want it to be lost, like with the statistics
> > > > collector.
> > > >
> > >
> > > But, won't that be costly in cases where we have errors in the
> > > processing of very large transactions? Subscription has to process all
> > > the data before it gets an error.
> >
> > I had the same concern. Particularly, the approach we currently
> > discussed is to skip the transaction based on the information written
> > by the worker rather than require the user to specify the XID.
> >
>
> Yeah, but I was imagining that the user still needs to specify
> something to indicate that we need to skip it, otherwise, we might try
> to skip a transaction that the user wants to resolve by itself rather
> than expecting us to skip it.

Yeah, what I'm currently thinking is that the worker records the
conflict that caused the error somewhere. If the user wants to resolve
it manually, they can specify a resolution method for the stopped
subscription. The conflict stays unresolved, and the recorded
information persists, until either the user specifies a method and the
worker resolves it, or some field of the subscription such as
subconninfo is updated.
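
To make that lifecycle concrete, here is a rough sketch in Python. It is
only an illustration of the idea, not PostgreSQL code: the class, the
method names, and the "skip" resolution value are all hypothetical, and
where the information would actually be stored is still undecided.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordedConflict:
    """Hypothetical record written by the apply worker when it errors out."""
    xid: int
    message: str
    resolution: Optional[str] = None  # set later by the user, e.g. "skip"


class Subscription:
    """Toy model of a subscription; not the real pg_subscription layout."""

    def __init__(self, conninfo: str) -> None:
        self.conninfo = conninfo
        self.conflict: Optional[RecordedConflict] = None

    def worker_failed(self, xid: int, message: str) -> None:
        # The worker records the conflict that caused the error.
        self.conflict = RecordedConflict(xid, message)

    def set_resolution(self, method: str) -> None:
        # The user tells the stopped subscription how to resolve it.
        if self.conflict is None:
            raise ValueError("no recorded conflict to resolve")
        self.conflict.resolution = method

    def alter_conninfo(self, conninfo: str) -> None:
        # Updating a field such as subconninfo discards the recorded conflict.
        self.conninfo = conninfo
        self.conflict = None

    def worker_restart(self) -> str:
        # Returns what the restarted worker would do in this toy model.
        if self.conflict is None:
            return "apply"
        if self.conflict.resolution is None:
            return "error"  # same error again; the information persists
        method = self.conflict.resolution
        self.conflict = None  # resolved, so the record is cleared
        return method
```

In this model the worker never skips anything on its own: without an
explicit user action the recorded conflict just stays there.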

>
> > > I think the XID (or say another identifier like commitLSN) which we
> > > want to use for skipping the transaction as specified by the user has
> > > to be stored in the catalog because otherwise, after the restart we
> > > won't remember it and the user won't know that he needs to set it
> > > again. Now, say we have multiple skip identifiers (XIDs, commitLSN,
> > > ..), isn't it better to store all conflict-related information in a
> > > separate catalog like pg_subscription_conflict or something like that.
> > > I think it might be also better to later extend it for auto conflict
> > > resolution where the user can specify auto conflict resolution info
> > > for a subscription. Is it better to store all such information in
> > > pg_subscription or have a separate catalog? It is possible that even
> > > if we have a separate catalog for conflict info, we might not want to
> > > store error info there.
> >
> > Just to be clear, we need to store only the conflict-related
> > information that cannot be resolved without manual intervention,
> > right? That is, conflicts cause an error, exiting the workers. In
> > general, replication conflicts include also conflicts that don’t cause
> > an error. I think that those conflicts don’t necessarily need to be
> > stored in the catalog and don’t require manual intervention.
> >
>
> Yeah, I think we want to record the error cases but which other
> conflicts you are talking about here which doesn't lead to any sort of
> error?

For example, I think one type of replication conflict is when two
updates, arriving via logical replication or from a client, modify
the same record (e.g., one with the same primary key) at the same time.
In that case no error occurs and we always choose the update
that arrived later. But there are other possible resolution methods,
such as choosing the one that arrived earlier, using the one with the
newer commit timestamp, using something like node priority, or
even raising an error so that the user resolves it manually.
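
As a rough illustration of these options, here is a hypothetical sketch
in Python. None of the names below exist in PostgreSQL; the method
strings, the Update fields, and the "larger number means higher
priority" convention are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Update:
    """Hypothetical description of one of two conflicting updates."""
    key: int            # primary key of the record being updated
    value: str
    arrival_seq: int    # order in which the update reached this node
    commit_ts: float    # commit timestamp on the originating node
    node_priority: int  # assumption: larger number means higher priority


class ConflictError(Exception):
    """Raised when the conflict is left for the user to resolve manually."""


def resolve(a: Update, b: Update, method: str) -> Update:
    """Pick the winning update for two updates to the same record."""
    if a.key != b.key:
        raise ValueError("updates do not touch the same record")
    if method == "last_arrival":    # the current behavior described above
        return max(a, b, key=lambda u: u.arrival_seq)
    if method == "first_arrival":
        return min(a, b, key=lambda u: u.arrival_seq)
    if method == "newest_commit":
        return max(a, b, key=lambda u: u.commit_ts)
    if method == "node_priority":
        return max(a, b, key=lambda u: u.node_priority)
    if method == "error":
        raise ConflictError(f"conflicting updates for key {a.key}")
    raise ValueError(f"unknown resolution method: {method}")
```

Only the last method produces the kind of conflict that stops the worker
and needs manual intervention; the others resolve silently.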

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


