Re: Excessive number of replication slots for 12->14 logical replication

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: Excessive number of replication slots for 12->14 logical replication
Дата
Msg-id CAD21AoBGXi9nFsQTvgF-NfqcvoGOA9SrgSRwoEdGevEHvTm53g@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Excessive number of replication slots for 12->14 logical replication  (Ajin Cherian <itsajin@gmail.com>)
Ответы Re: Excessive number of replication slots for 12->14 logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Re: Excessive number of replication slots for 12->14 logical replication  (Ajin Cherian <itsajin@gmail.com>)
Список pgsql-bugs
On Mon, Aug 1, 2022 at 3:46 PM Ajin Cherian <itsajin@gmail.com> wrote:
>
> On Mon, Aug 1, 2022 at 11:50 AM Peter Smith <smithpb2250@gmail.com> wrote:
> >
> > Here is my review comment for v4-0001.
> >
> > (Just a nitpick about comments)
> >
> > ======
> >
> > 1. src/backend/replication/logical/tablesync.c - process_syncing_tables_for_sync
> >
> > There are really just 2 main things that are being done for the cleanup here:
> > - Drop the origin tracking
> > - Drop the tablesync slot
> >
> > So, IMO the code will have more clarity by having one bigger comment
> > for each of those drops, instead of commenting every step separately.
> >
> > e.g.
> >
> > BEFORE
> > /*
> > * Cleanup the tablesync origin tracking if exists.
> > */
> > ReplicationOriginNameForTablesync(MyLogicalRepWorker->subid,
> >   MyLogicalRepWorker->relid,
> >   originname,
> >   sizeof(originname));
> >
> > /*
> > * Reset the origin session before dropping.
> > *
> > * This is required to reset the ownership of the slot
> > * and allow the origin to be dropped.
> > */
> > replorigin_session_reset();
> >
> > replorigin_session_origin = InvalidRepOriginId;
> > replorigin_session_origin_lsn = InvalidXLogRecPtr;
> > replorigin_session_origin_timestamp = 0;
> >
> > /*
> > * There is a chance that the user is concurrently performing
> > * refresh for the subscription where we remove the table
> > * state and its origin and by this time the origin might be
> > * already removed. So passing missing_ok = true.
> > */
> > replorigin_drop_by_name(originname, true, false);
> >
> > /* Cleanup the tablesync slot. */
> > ReplicationSlotNameForTablesync(MyLogicalRepWorker->subid,
> > MyLogicalRepWorker->relid,
> > syncslotname,
> > sizeof(syncslotname));
> >
> > /*
> > * It is important to give an error if we are unable to drop the slot,
> > * otherwise, it won't be dropped till the corresponding subscription
> > * is dropped. So passing missing_ok = false.
> > */
> > ReplicationSlotDropAtPubNode(LogRepWorkerWalRcvConn, syncslotname, false);
> >
> >
> > SUGGESTION
> >
> > /*
> > * Cleanup the tablesync origin tracking.
> > *
> > * Resetting the origin session removes the ownership of the slot.
> > * This is needed to allow the origin to be dropped.
> > *
> > * Also, there is a chance that the user is concurrently performing
> > * refresh for the subscription where we remove the table
> > * state and its origin and by this time the origin might be
> > * already removed. So passing missing_ok = true.
> > */
> > ReplicationOriginNameForTablesync(MyLogicalRepWorker->subid,
> >   MyLogicalRepWorker->relid,
> >   originname,
> >   sizeof(originname));
> > replorigin_session_reset();
> > replorigin_session_origin = InvalidRepOriginId;
> > replorigin_session_origin_lsn = InvalidXLogRecPtr;
> > replorigin_session_origin_timestamp = 0;
> >
> > replorigin_drop_by_name(originname, true, false);
> >
> > /*
> > * Cleanup the tablesync slot.
> > *
> > * It is important to give an error if we are unable to drop the slot,
> > * otherwise, it won't be dropped till the corresponding subscription
> > * is dropped. So passing missing_ok = false.
> > */
> > ReplicationSlotNameForTablesync(MyLogicalRepWorker->subid,
> > MyLogicalRepWorker->relid,
> > syncslotname,
> > sizeof(syncslotname));
> > ReplicationSlotDropAtPubNode(LogRepWorkerWalRcvConn, syncslotname, false);
> >
> Updated.

Thank you for working on this. I have a comment and a question:

                  * This has to be done after updating the state
because otherwise if
                  * there is an error while doing the database
operations we won't be
-                 * able to rollback dropped slot.
+                 * able to rollback dropped slot or origin tracking.

I think we can actually roll back dropping the replication origin. So
the above comment is true for only replication slots.

---
+       replorigin_session_reset();
+       replorigin_session_origin = InvalidRepOriginId;
+       replorigin_session_origin_lsn = InvalidXLogRecPtr;
+       replorigin_session_origin_timestamp = 0;
+
+       replorigin_drop_by_name(originname, true, false);

With this change, the committing the ongoing transaction will be done
without replication origin. Is this okay? it's probably okay, but
since tablesync worker commits other changes with replication origin
I'm concerned a bit there might be a corner case.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



В списке pgsql-bugs по дате отправления:

Предыдущее
От: "Ravulapati, Gautham"
Дата:
Сообщение: ERROR: unterminated dollar-quoted string at or near "$$"
Следующее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: [PATCH] BUG FIX: inconsistent page found in BRIN_REGULAR_PAGE