Re: Skipping logical replication transactions on subscriber side

From: Amit Kapila
Subject: Re: Skipping logical replication transactions on subscriber side
Date:
Msg-id: CAA4eK1KtWKpu3qEOff=2HLD3OiMMd9LKFMuWJzx9=MESaG4fJQ@mail.gmail.com
In reply to: Re: Skipping logical replication transactions on subscriber side (Masahiko Sawada <sawada.mshk@gmail.com>)
List: pgsql-hackers
On Mon, May 31, 2021 at 12:39 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, May 29, 2021 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > > > > 1. the worker records the XID and commit LSN of the failed transaction
> > > > > to a catalog.
> > > > >
> > > >
> > > > When will you record this info? I am not sure if we can try to update
> > > > this when an error has occurred. We can think of using try..catch in
> > > > the apply worker and then record it in the catch block on error, but
> > > > would that be advisable? One random thought that occurred to me is
> > > > that the apply worker notifies the launcher (or maybe another
> > > > process) of such information, and that process then logs it.
> > >
> > > Yeah, I was concerned about that too and had the same idea. The
> > > information still could not be written if the server crashes before
> > > the launcher writes it. But I think that's acceptable.
> > >
> >
> > True, because even if the launcher restarts, the apply worker will
> > error out again and resend the information. I guess we can have an
> > error queue where apply workers add their information and the
> > launcher then processes those entries. If we do that, then we
> > probably need to define what to do when the queue gets full: either
> > the apply worker nudges the launcher and waits, or it just throws an
> > error and continues. If you have any better ideas for sharing this
> > information, we can consider those as well.
>
> +1 for using an error queue. Maybe we need to avoid queuing the same
> error more than once, to prevent the catalog from being updated too
> frequently?
>

Yes, I think that is important, because even after logging, the
subscription may error again unless the user does something to skip
the transaction or resolve the conflict. I guess you need to check for
the existence of the error in the system table and/or in the queue.
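To make the idea above concrete, here is a minimal standalone sketch of such an error queue: a fixed-size buffer to which apply workers add entries and from which the launcher drains them, with duplicate (subscription, XID) pairs skipped so a repeatedly failing transaction does not flood the catalog. All names here (ErrorQueue, errq_push, etc.) are invented for illustration; this is a model of the proposal, not actual PostgreSQL code, and real shared-memory queues would also need locking.

```c
/* Hypothetical model of the proposed error queue. Names are invented;
 * a real implementation would live in shared memory with proper locking. */
#include <stdbool.h>
#include <stdint.h>

#define ERRQ_SIZE 8

typedef struct ErrorQueueEntry
{
    uint32_t subid;         /* subscription OID */
    uint32_t xid;           /* XID of the failed transaction */
    uint64_t commit_lsn;    /* commit LSN of the failed transaction */
} ErrorQueueEntry;

typedef struct ErrorQueue
{
    ErrorQueueEntry entries[ERRQ_SIZE];
    int head;               /* index of the oldest queued entry */
    int count;              /* number of queued entries */
} ErrorQueue;

/* Enqueue an error report from an apply worker.  A duplicate
 * (subid, xid) pair is treated as success but not re-queued, which
 * avoids repeated catalog updates for the same failing transaction.
 * Returns false when the queue is full; the caller then decides
 * whether to nudge the launcher and wait, or error out and continue,
 * as debated in the thread. */
static bool
errq_push(ErrorQueue *q, ErrorQueueEntry e)
{
    for (int i = 0; i < q->count; i++)
    {
        ErrorQueueEntry *cur = &q->entries[(q->head + i) % ERRQ_SIZE];

        if (cur->subid == e.subid && cur->xid == e.xid)
            return true;    /* already queued; nothing to do */
    }

    if (q->count == ERRQ_SIZE)
        return false;       /* full: caller chooses wait vs. error */

    q->entries[(q->head + q->count) % ERRQ_SIZE] = e;
    q->count++;
    return true;
}

/* Dequeue the oldest entry (what the launcher would do before writing
 * the information to the catalog).  Returns false when empty. */
static bool
errq_pop(ErrorQueue *q, ErrorQueueEntry *out)
{
    if (q->count == 0)
        return false;

    *out = q->entries[q->head];
    q->head = (q->head + 1) % ERRQ_SIZE;
    q->count--;
    return true;
}
```

The dedup check on enqueue complements the catalog-side existence check: the launcher still verifies the tuple before insert, but the queue itself absorbs the common case of the same transaction failing in a tight retry loop.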

> >
> > >
> > > I guess we can update catalog tuples in place when another conflict
> > > happens next time. The catalog tuple should be fixed size. The
> > > already-resolved conflict will have the commit LSN older than its
> > > replication origin's LSN.
> > >
> >
> > Okay, but I have a slight concern that we will keep XIDs in the
> > system that might no longer be valid. So we will keep this info
> > about subscribers around until the subscription is dropped;
> > hopefully, that doesn't lead to too many rows. This is okay under
> > the current design, but say tomorrow we decide to parallelize the
> > apply for a subscription: then there could be multiple errors per
> > subscription, and such a design might appear quite limiting. One
> > possibility is that when the launcher periodically checks for new
> > error messages, it cleans up the conflicts catalog as well; or
> > maybe autovacuum does this periodically, as it does for stats (via
> > pgstat_vacuum_stat).
>
> Yeah, it's better to have a way to clean up no-longer-valid entries
> in the catalog in the case where the worker failed to remove them. I
> prefer the former idea so far,
>

Which idea do you refer to as the former here (cleanup by the launcher)?

> so I'll implement it in a PoC patch.
>

Okay.
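Sawada-san's staleness criterion quoted above (an already-resolved conflict has a commit LSN older than the replication origin's LSN) is the heart of any such cleanup. A minimal sketch of that check, with invented names and treating LSNs as plain 64-bit integers for simplicity:

```c
/* Hypothetical staleness check for conflict-catalog cleanup.  The
 * struct and function names are invented for illustration; real LSNs
 * are XLogRecPtr values compared the same way. */
#include <stdbool.h>
#include <stdint.h>

typedef struct ConflictEntry
{
    uint32_t subid;         /* subscription OID */
    uint32_t xid;           /* XID of the failed remote transaction */
    uint64_t commit_lsn;    /* commit LSN of that transaction */
} ConflictEntry;

/* origin_lsn is the remote LSN the subscription's replication origin
 * has replayed past.  If the origin has advanced to or beyond the
 * failed transaction's commit LSN, that transaction was applied or
 * skipped, so the conflict entry is stale and can be pruned by the
 * launcher (or by autovacuum, as with pgstat_vacuum_stat). */
static bool
conflict_is_resolved(const ConflictEntry *c, uint64_t origin_lsn)
{
    return c->commit_lsn <= origin_lsn;
}
```

Whether the launcher or autovacuum runs this scan, the predicate is the same, so the two cleanup ideas discussed above differ only in which process drives it.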

-- 
With Regards,
Amit Kapila.


