Re: [Patch] add new parameter to pg_replication_origin_session_setup
| From | Amit Kapila |
|---|---|
| Subject | Re: [Patch] add new parameter to pg_replication_origin_session_setup |
| Date | |
| Msg-id | CAA4eK1KkqQD88Td53v6Z=adGSMLR7wN543WhqccOW17ykt-QDg@mail.gmail.com |
| In reply to | Re: [Patch] add new parameter to pg_replication_origin_session_setup (shveta malik <shveta.malik@gmail.com>) |
| Responses | RE: [Patch] add new parameter to pg_replication_origin_session_setup |
| List | pgsql-hackers |
On Mon, Jan 5, 2026 at 4:00 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Jan 5, 2026 at 3:15 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Dec 23, 2025 at 2:24 PM Zhijie Hou (Fujitsu)
> > <houzj.fnst@fujitsu.com> wrote:
> > >
> > > Hi,
> > >
> > > When testing the new parameter in pg_replication_origin_session_setup(), I
> > > noticed a bug allowing the origin in use to be dropped. The issue arises when
> > > two backends set up the same origin; if the second backend resets the origin
> > > first, it resets the acquired_by flag regardless of whether the first backend is
> > > using it. This allows the origin to be dropped, enabling the slot in shared
> > > memory to be reused, which is unintended.
> > >
> > > About the fix, simply adding a check for acquired_by field does not work,
> > > because if the first backend resets the origin first, it still risks being
> > > dropped while second backend uses it.
> > >
> > > To fully resolve this, I tried to add a reference count (refcount) for the
> > > origin. The count is incremented when a backend sets up the origin and
> > > decremented upon a reset. As a result, the replication origin is only dropped
> > > when the reference count reaches zero.
> > >
> > > Thanks to Kuroda-San for discussing and reviewing this patch off-list.
> > >
> >
> > Thanks Hou-San and Kuroda-San.
> >
> > What should be the expected behavior when Session1 resets the origin
> > (changing acquired_pid from its own PID to 0), while Session2 is
> > already connected to the origin and Session3 also attempts to reuse
> > the same origin?
> >
> > Currently it asserts:
> >
> > Session1:
> > select pg_replication_origin_create('origin');
> > SELECT pg_replication_origin_session_setup('origin');
> >
> > Session2:
> > SELECT pg_replication_origin_session_setup('origin',48028);
> >
> > Session1:
> > SELECT pg_replication_origin_session_reset();
> >
> > Session3:
> > SELECT pg_replication_origin_session_setup('origin');
> > This asserts at:
> > TRAP: failed Assert("session_replication_state->refcount == 0"), File:
> > "origin.c", Line: 1231, PID: 48037
> >
>
> I checked the behavior on HEAD. Session3 is able to set up the origin
> and sets its own PID in acquired_pid. But it is unclear to me which
> PID should be recorded in acquired_pid - Session2’s PID, since it set
> up the origin earlier, or Session3’s PID. Or does this even make any
> difference?
>
> I found one more related issue on HEAD, sharing it here:
>
> When the first backend creates and sets up the origin, followed by a
> second backend setting it up, and then the first backend resets it
> while the second backend attempts to drop it, an assertion is
> triggered:
> TRAP: failed Assert("session_replication_state->roident !=
> InvalidRepOriginId"), File: "origin.c", Line: 1257, PID: 48438
>
Can we address these problems by prohibiting the leader worker from
resetting the origin while pa (parallel apply) workers are still
associated with it? The leader can tell whether pa workers are
associated with the origin by checking the following condition:
acquired_by == MyProcPid AND refcount > 1.
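
To make the idea concrete, here is a rough standalone sketch of that check. It is
not the origin.c code: OriginSlot, origin_setup, origin_reset and
leader_reset_blocked are hypothetical names, the PIDs are passed in explicitly
instead of reading the backend's own PID, and locking is left out entirely. It
only assumes the refcount field from Hou-San's proposed patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the shared origin state; NOT the real struct in origin.c. */
typedef struct OriginSlot
{
    int acquired_by;    /* PID of the backend that first set up the origin, 0 if none */
    int refcount;       /* number of backends currently attached (field from the proposed patch) */
} OriginSlot;

/* Attach a backend: the first caller becomes the owner, later callers only bump refcount. */
void
origin_setup(OriginSlot *slot, int pid)
{
    if (slot->refcount == 0)
        slot->acquired_by = pid;
    slot->refcount++;
}

/* The proposed condition: the owner may not reset while other backends are attached. */
bool
leader_reset_blocked(const OriginSlot *slot, int my_pid)
{
    return slot->acquired_by == my_pid && slot->refcount > 1;
}

/* Detach a backend; returns false (and changes nothing) when the reset is refused. */
bool
origin_reset(OriginSlot *slot, int my_pid)
{
    assert(slot->refcount > 0);

    if (leader_reset_blocked(slot, my_pid))
        return false;

    slot->refcount--;
    if (slot->refcount == 0)
        slot->acquired_by = 0;  /* last user gone: the origin is free again */
    return true;
}
```

In this model, with Session1 as the owner and Session2 attached, Session1's
reset is refused until Session2 has reset. So acquired_by is never cleared while
refcount is still nonzero, which is the state the reported assertions trip over.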
--
With Regards,
Amit Kapila.