Re: Excessive number of replication slots for 12->14 logical replication

Поиск
Список
Период
Сортировка
От Ajin Cherian
Тема Re: Excessive number of replication slots for 12->14 logical replication
Дата
Msg-id CAFPTHDbY_DOr4zWQzNR5ZaC1eEWz5JtFdH1mp2jKbT83CbaRuw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Excessive number of replication slots for 12->14 logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Excessive number of replication slots for 12->14 logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-bugs
On Wed, Aug 24, 2022 at 3:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Aug 23, 2022 at 7:56 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Aug 18, 2022 at 11:28 AM Ajin Cherian <itsajin@gmail.com> wrote:
> > >
> > > On Thu, Aug 18, 2022 at 3:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > Looks good.
> > > >
> > > > I have one minor comment:
> > > >
> > > > -                    * SUBREL_STATE_FINISHEDCOPY. The apply worker can also
> > > > -                    * concurrently try to drop the origin and by this time
> > > > -                    * the origin might be already removed. For these reasons,
> > > > -                    * passing missing_ok = true.
> > > > +                    * SUBREL_STATE_FINISHEDCOPY. So passing missing_ok = true.
> > > >
> > > > I think we should change "the apply worker" to "the tablesync worker"
> > > > but should not remove this sentence. The fact that another process
> > > > could concurrently try to drop the origin is still true.
> > > >
> > > > The rest looks good to me.
> > > >
> > >
> > > Updated as described.
> > >
> >
> > The patch looks good to me though I would like to test it a bit more
> > before pushing.
> >
>
> While testing/reviewing it further, I noticed that the patch has used
> missing_ok as true when dropping origin via tablesync worker. I don't
> think that is correct because the concurrent operations that remove
> origin like a refresh for the subscription take an access exclusive
> lock on pg_subscription which prevent the previous operation to update
> the rel state to SUBREL_STATE_SYNCDONE to succeed. So, I think we
> should pass missing_ok as false which would be consistent with slot
> handling. I have changed that and comments a few places. What do you
> think of the attached?
>

Yes, this makes sense. I tested with a debugger to create simultaneous
alter subscription refresh publication AND tablesync worker
deleting the tracking origin. What I see is that the alter subscription command
gets the lock on SubscriptionRelRelationId  and then actually goes on to
kill the tablesync worker. Even if the tablesync worker is waiting for
the same lock,
it looks like the SIGTERM is being handled and the worker is killed.

From my testing, there doesn't seem to be a case where the tablesync worker
would try and drop a previously dropped origin. So, I think we can pass
missing_ok as 'false' while dropping origin. The patch looks good to me.

regards,
Ajin Cherian
Fujitsu Australia



В списке pgsql-bugs по дате отправления:

Предыдущее
От: hubert depesz lubaczewski
Дата:
Сообщение: pg_restore deadlocks with itself
Следующее
От: PG Bug reporting form
Дата:
Сообщение: BUG #17593: min key size 112 bits from FIPS SP800-131 requirement for HMAC-SHA raises exception in SCRAM-SHA-256