Re: Excessive number of replication slots for 12->14 logical replication

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Excessive number of replication slots for 12->14 logical replication
Дата
Msg-id CAA4eK1JK0zs0H27-i3Pko2na-xt2vrO2r6Y3mn+sfr9CWzobBg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Excessive number of replication slots for 12->14 logical replication  (Ajin Cherian <itsajin@gmail.com>)
Ответы Re: Excessive number of replication slots for 12->14 logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-bugs
On Thu, Aug 25, 2022 at 7:38 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Wed, Aug 24, 2022 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Aug 23, 2022 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Aug 18, 2022 at 11:28 AM Ajin Cherian <itsajin@gmail.com> wrote:
> > > >
> > > > On Thu, Aug 18, 2022 at 3:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > Looks good.
> > > > >
> > > > > I have one minor comment:
> > > > >
> > > > > -                    * SUBREL_STATE_FINISHEDCOPY. The apply worker can also
> > > > > -                    * concurrently try to drop the origin and by this time
> > > > > -                    * the origin might be already removed. For these reasons,
> > > > > -                    * passing missing_ok = true.
> > > > > +                    * SUBREL_STATE_FINISHEDCOPY. So passing missing_ok = true.
> > > > >
> > > > > I think we should change "the apply worker" to "the tablesync worker"
> > > > > but should not remove this sentence. The fact that another process
> > > > > could concurrently try to drop the origin is still true.
> > > > >
> > > > > The rest looks good to me.
> > > > >
> > > >
> > > > Updated as described.
> > > >
> > >
> > > The patch looks good to me though I would like to test it a bit more
> > > before pushing.
> > >
> >
> > While testing/reviewing it further, I noticed that the patch has used
> > missing_ok as true when dropping origin via tablesync worker. I don't
> > think that is correct because the concurrent operations that remove
> > origin like a refresh for the subscription take an access exclusive
> > lock on pg_subscription which prevent the previous operation to update
> > the rel state to SUBREL_STATE_SYNCDONE to succeed. So, I think we
> > should pass missing_ok as false which would be consistent with slot
> > handling. I have changed that and comments a few places. What do you
> > think of the attached?
> >
>
> Yes, this makes sense. I tested with a debugger to create simultaneous
> alter subscription refresh publication AND tablesync worker
> deleting the tracking origin. What I see is that the alter subscription command
> gets the lock on SubscriptionRelRelationId  and then actually goes on to
> kill the tablesync worker. Even if the tablesync worker is waiting for
> the same lock,
> it looks like the SIGTERM is being handled and the worker is killed.
>
> From my testing, there doesn't seem to be a case where the tablesync worker
> would try and drop a previously dropped origin. So, I think we can pass
> missing_ok as 'false' while dropping origin. The patch looks good to me.
>

Thanks for the testing. I'll push this sometime early next week (by
Tuesday) unless Sawada-San or someone else has any comments on it.

-- 
With Regards,
Amit Kapila.



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Scott Zelenka
Дата:
Сообщение: Re: BUG #17593: min key size 112 bits from FIPS SP800-131 requirement for HMAC-SHA raises exception in SCRAM-SHA-256
Следующее
От: PG Bug reporting form
Дата:
Сообщение: BUG #17596: "invalid attribute number 11" when updating partitioned table with a MULTIEXPR_SUBLINK