Re: Potential data loss due to race condition during logical replication slot creation

From Masahiko Sawada
Subject Re: Potential data loss due to race condition during logical replication slot creation
Date
Msg-id CAD21AoDR3h78U0hxdzWPuPL11nvJCWYMB8h+QoOjd82ZmXjfgw@mail.gmail.com
In response to Re: Potential data loss due to race condition during logical replication slot creation  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Potential data loss due to race condition during logical replication slot creation
List pgsql-bugs
On Tue, Jun 25, 2024 at 1:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jun 24, 2024 at 10:32 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Jun 24, 2024 at 12:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Jun 21, 2024 at 12:16 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > The approach (a) has a downside, it will lead to tracking more
> > > > > transactions (non-catalog) than required without any benefit for the
> > > > > user. Considering that is true, I wouldn't prefer that approach.
> > > >
> > > > Yes, it will lead to tracking non-catalog-change transactions as well.
> > > > If there are many subtransactions, the overhead could be noticeable.
> > > > But it happens only once when creating a slot.
> > > >
> > >
> > > True, but it doesn't seem advisable to add such an overhead even
> > > during create time without any concrete reason.
> > >
> > > > Another variant of (a) is that we skip snapshot restores if the
> > > > initial_xmin_horizon is a valid transaction id. The
> > > > initial_xmin_horizon is always set to a valid transaction id when
> > > > initializing the decoding context, e.g. during
> > > > CreateInitDecodingContext(). That way, we don't need to track
> > > > non-catalog-change transactions. A downside is that this approach
> > > > assumes that DecodingContextFindStartpoint() is called with the
> > > > decoding context created by CreateInitDecodingContext(), which is true
> > > > in the core code, but might not be true in third-party extensions.
> > > >
> > >
> > > I think it is better to be explicit in this case rather than relying
> > > on initial_xmin_horizon. So, we can store an in_create/create_in_progress
> > > flag in the SnapBuild in HEAD and store it in LogicalDecodingContext
> > > in back branches.
> >
> > I think we cannot access the flag in LogicalDecodingContext from
> > snapbuild.c, at least in back branches. I've discussed adding such a
> > flag in snapbuild.c as a global variable, but I'm slightly hesitant to
> > add a global variable besides InitialRunningXacts.
> >
>
> I agree that adding a global variable is not advisable. Can we pass
> the flag stored in LogicalDecodingContext to snapbuild.c?

Ah, I found a good path: snapbuild->reorder->private_data (storing a
pointer to a LogicalDecodingContext). This assumes private_data always
stores a pointer to a LogicalDecodingContext, but I think that's fine
at least for back branches.
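
To illustrate the pointer path, here is a minimal standalone sketch. It is
not the attached patch: the structs are simplified stand-ins for the real
PostgreSQL ones, the flag name in_create is taken from the suggestion
upthread, and the helper snapbuild_in_slot_creation() is a hypothetical
name used only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified stand-ins for the PostgreSQL structs. Only the fields needed
 * to follow snapbuild->reorder->private_data are modelled here.
 */
typedef struct LogicalDecodingContext
{
    bool        in_create;      /* hypothetical flag: slot creation in progress */
} LogicalDecodingContext;

typedef struct ReorderBuffer
{
    void       *private_data;   /* assumed to point to a LogicalDecodingContext */
} ReorderBuffer;

typedef struct SnapBuild
{
    ReorderBuffer *reorder;
} SnapBuild;

/*
 * Hypothetical helper: reach the decoding context from the snapshot builder
 * and report whether the slot is still being created.
 */
static bool
snapbuild_in_slot_creation(const SnapBuild *builder)
{
    const LogicalDecodingContext *ctx;

    assert(builder != NULL && builder->reorder != NULL);

    /* The cast is only safe under the assumption stated in the mail above. */
    ctx = (const LogicalDecodingContext *) builder->reorder->private_data;

    /* Treat a missing context as "not in creation" in this sketch. */
    return ctx != NULL && ctx->in_create;
}
```

The cast is where the assumption matters: if an extension stored something
else in private_data, the read of in_create would be undefined, which is
why this looks acceptable for back branches only.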

I've attached the patch for this idea for PG16.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
