Re: Logical replication fails when adding multiple replicas

From Kyotaro Horiguchi
Subject Re: Logical replication fails when adding multiple replicas
Msg-id 20230323.171742.1357157542021128059.horikyota.ntt@gmail.com
In reply to Re: Logical replication fails when adding multiple replicas  (Will Roper <will.roper@democracyclub.org.uk>)
Responses Re: Logical replication fails when adding multiple replicas  (Will Roper <will.roper@democracyclub.org.uk>)
List pgsql-general
At Wed, 22 Mar 2023 09:25:37 +0000, Will Roper <will.roper@democracyclub.org.uk> wrote in 
> Thanks for the response Hou,
> 
> I've had a look and when the tablesync workers are spinning up there are
> some errors of the form:
> 
> "2023-03-17 18:37:06.900 UTC [4071] LOG:  logical replication table
> synchronization worker for subscription
> ""polling_stations_0561a02f66363d911"", table ""uk_geo_utils_onspd"" has
> started"
> "2023-03-17 18:37:06.976 UTC [4071] ERROR:  could not create replication
> slot ""pg_37986_sync_37922_7210774007126708177"": ERROR:  replication slot
> ""pg_37986_sync_37922_7210774007126708177"" already exists"

The slot name format is "pg_<suboid>_sync_<relid>_<systemid>". It's no
surprise this happens if the subscribers were created from the same
backup, since they would then share all three values.
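
To make the collision concrete, here is a minimal standalone sketch of
that name format. It is not the PostgreSQL source: the function name
sketch_tablesync_slot_name and the plain integer types are assumptions
made for illustration only.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Illustrative stand-in for the tablesync slot naming described above:
 * "pg_<suboid>_sync_<relid>_<systemid>".  Hypothetical helper, not the
 * server's ReplicationSlotNameForTablesync().
 */
void
sketch_tablesync_slot_name(uint32_t suboid, uint32_t relid, uint64_t sysid,
                           char *buf, size_t buflen)
{
    /* All three inputs are copied verbatim into the name. */
    snprintf(buf, buflen, "pg_%" PRIu32 "_sync_%" PRIu32 "_%" PRIu64,
             suboid, relid, sysid);
}
```

Called with 37986, 37922 and 7210774007126708177, this yields the name
from the quoted log. Two subscribers that pass the same three values get
the same string, so the second one to ask the publisher for the slot
fails with "already exists".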

If that's true, the simplest workaround would be to recreate the
subscription multiple times, using a different number of repetitions
for each subscriber so that the subscribers have subscriptions with
different OIDs.



I believe it's not prohibited for subscribers to have the same system
identifier, but the slot name generation logic for tablesync doesn't
account for cases like this.  We might need some server-wide value
that's unique among subscribers and stable while table sync is
running.  I can't think of a better place than pg_subscription, but I
don't like it because the value isn't really needed for most of the
subscription's life.

Do you think using the postmaster's startup time would work for this
purpose?  I'm assuming that the slot name doesn't need to persist
across server restarts, but I'm not sure that's really true.


diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index 07eea504ba..a5b4f7cf7c 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -1214,7 +1214,7 @@ ReplicationSlotNameForTablesync(Oid suboid, Oid relid,
                                 char *syncslotname, Size szslot)
 {
     snprintf(syncslotname, szslot, "pg_%u_sync_%u_" UINT64_FORMAT, suboid,
-             relid, GetSystemIdentifier());
+             relid, PgStartTime);
 }
 
 /*
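
As a rough illustration of what the change above would do to the names,
here is a standalone sketch. It assumes the start time is a signed
64-bit count of microseconds; the function name and the sample values in
the note below are made up for illustration and are not taken from the
server.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical variant of the slot name with the server start time in
 * the last field instead of the system identifier.  start_time_us is an
 * assumed representation (microseconds as int64), printed as signed.
 */
void
sketch_slot_name_with_start_time(uint32_t suboid, uint32_t relid,
                                 int64_t start_time_us,
                                 char *buf, size_t buflen)
{
    snprintf(buf, buflen, "pg_%" PRIu32 "_sync_%" PRIu32 "_%" PRId64,
             suboid, relid, start_time_us);
}
```

With the same suboid and relid, two subscribers started at different
microseconds (say 732908226900000 and 732908226976000) get different
names. The same subscriber also gets a different name after a restart,
which is the persistence question raised above.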


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


