Re: Synchronizing slots from primary to standby
From | Hsu, John |
---|---|
Subject | Re: Synchronizing slots from primary to standby |
Date | |
Msg-id | 2415E2B4-F79E-4C24-A28E-78D40721D08F@amazon.com |
In reply to | Re: Synchronizing slots from primary to standby (Peter Eisentraut <peter.eisentraut@enterprisedb.com>) |
List | pgsql-hackers |
Hello,

I started taking a brief look at the v2 patch, and it does appear to work for the basic case. The logical slot is synchronized across, and I can connect to the promoted standby and stream changes afterwards.

It's not clear to me what the correct behavior is when a logical slot that has been synced to the replica is then deleted on the writer. Would we expect the deletion to be propagated, or do we leave it up to the end user to manage?

> + rawname = pstrdup(standby_slot_names);
> + SplitIdentifierString(rawname, ',', &namelist);
> +
> + while (true)
> + {
> +     int wait_slots_remaining;
> +     XLogRecPtr oldest_flush_pos = InvalidXLogRecPtr;
> +     int rc;
> +
> +     wait_slots_remaining = list_length(namelist);
> +
> +     LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
> +     for (int i = 0; i < max_replication_slots; i++)
> +     {

Even though standby_slot_names is PGC_SIGHUP, we never reload or re-process the value. If there is a wrong entry in there, the backend becomes stuck until we re-establish the logical connection. Adding "postmaster/interrupt.h" with a ConfigReloadPending / ProcessConfigFile check does seem to work.

Another thing I noticed is that once it starts waiting in this block, Ctrl+C doesn't seem to terminate the backend:

pg_recvlogical -d postgres -p 5432 --slot regression_slot --start -f -
..
^Cpg_recvlogical: error: unexpected termination of replication stream:

The logical backend connection is still present:

ps aux | grep 51263
hsuchen 51263 80.7 0.0 320180 14304 ? Rs 01:11 3:04 postgres: walsender hsuchen [local] START_REPLICATION

pstack 51263
#0 0x00007ffee99e79a5 in clock_gettime ()
#1 0x00007f8705e88246 in clock_gettime () from /lib64/libc.so.6
#2 0x000000000075f141 in WaitEventSetWait ()
#3 0x000000000075f565 in WaitLatch ()
#4 0x0000000000720aea in ReorderBufferProcessTXN ()
#5 0x00000000007142a6 in DecodeXactOp ()
#6 0x000000000071460f in LogicalDecodingProcessRecord ()

It can be terminated with pg_terminate_backend, though.
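To make the two wait-loop problems above concrete, here is a minimal standalone sketch of a loop that re-reads the setting after a reload request and checks for a cancel request on every iteration. It is not PostgreSQL source and not the patch's code: `config_reload_pending`, `cancel_pending`, `guc_standby_slot_names`, `all_slots_caught_up`, `wait_loop_iteration` and `run_scenario` are all hypothetical stand-ins, and signals are simulated by a scripted driver rather than delivered for real.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

typedef enum { WAIT_CAUGHT_UP, WAIT_CANCELLED, WAIT_GAVE_UP } WaitResult;

/* Hypothetical stand-ins for backend state (not real PostgreSQL symbols). */
static bool config_reload_pending;          /* plays the role of ConfigReloadPending */
static bool cancel_pending;                 /* plays the role of a pending cancel/terminate */
static const char *guc_standby_slot_names;  /* current value of the setting */

/* Assumption for the demo: only a slot named "standby1" exists and is caught up. */
bool
all_slots_caught_up(const char *names)
{
    return strcmp(names, "standby1") == 0;
}

/* One pass of the wait loop. Returns true when the loop should exit. */
bool
wait_loop_iteration(const char **names, WaitResult *result)
{
    /* Honor cancel/terminate requests instead of sleeping again. */
    if (cancel_pending)
    {
        *result = WAIT_CANCELLED;
        return true;
    }

    /* After a reload request, re-process the setting rather than reuse the stale copy. */
    if (config_reload_pending)
    {
        config_reload_pending = false;
        *names = guc_standby_slot_names;
    }

    if (all_slots_caught_up(*names))
    {
        *result = WAIT_CAUGHT_UP;
        return true;
    }
    return false;               /* a real backend would wait on its latch here */
}

/* Scripted driver: reload_at and cancel_at are iteration numbers, -1 means never. */
WaitResult
run_scenario(const char *initial, const char *reloaded,
             int reload_at, int cancel_at, int max_iterations)
{
    const char *names;
    WaitResult  result = WAIT_GAVE_UP;

    assert(max_iterations > 0);
    guc_standby_slot_names = initial;
    config_reload_pending = false;
    cancel_pending = false;
    names = guc_standby_slot_names;     /* read once before the loop, as in the quoted hunk */

    for (int i = 0; i < max_iterations; i++)
    {
        if (i == reload_at)
        {
            guc_standby_slot_names = reloaded;  /* simulated config edit plus SIGHUP */
            config_reload_pending = true;
        }
        if (i == cancel_at)
            cancel_pending = true;              /* simulated Ctrl+C */

        if (wait_loop_iteration(&names, &result))
            break;
    }
    return result;
}
```

For example, `run_scenario("typo_slot", "standby1", 2, -1, 5)` models a wrong entry that is corrected by a reload on the third pass, and `run_scenario("typo_slot", "standby1", -1, 3, 10)` models a cancel arriving while the loop is still waiting on the wrong entry.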

If we have a physical slot named foo on the standby, and a logical slot with the same slot_name is then created on the writer, it errors out on the replica. This also prevents other slots from being synchronized, which is probably fine.

2021-12-16 02:10:29.709 UTC [73788] LOG: replication slot synchronization worker for database "postgres" has started
2021-12-16 02:10:29.713 UTC [73788] ERROR: cannot use physical replication slot for logical decoding
2021-12-16 02:10:29.714 UTC [73037] DEBUG: unregistering background worker "replication slot synchronization worker"

On 12/14/21, 2:26 PM, "Peter Eisentraut" <peter.eisentraut@enterprisedb.com> wrote:

On 28.11.21 07:52, Bharath Rupireddy wrote:
> 1) Instead of a new LIST_SLOT command, can't we use
> READ_REPLICATION_SLOT (slight modifications needs to be done to make
> it support logical replication slots and to get more information from
> the subscriber).

I looked at that but didn't see an obvious way to consolidate them. This is something we could look at again later.

> 2) How frequently the new bg worker is going to sync the slot info?
> How can it ensure that the latest information exists say when the
> subscriber is down/crashed before it picks up the latest slot
> information?

The interval is currently hardcoded, but could be a configuration setting. In the v2 patch, there is a new setting that orders physical replication before logical so that the logical subscribers cannot get ahead of the physical standby.

> 3) Instead of the subscriber pulling the slot info, why can't the
> publisher (via the walsender or a new bg worker maybe?) push the
> latest slot info? I'm not sure we want to add more functionality to
> the walsender, if yes, isn't it going to be much simpler?

This sounds like the failover slot feature, which was rejected.
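
The ordering setting mentioned in the quoted reply (physical standbys before logical subscribers) can be pictured with a small standalone sketch. It is not the patch's code: it assumes each listed physical slot exposes the position its standby has flushed up to, and the `Slot` struct, `Lsn` type, `demo_slots` table, `oldest_flush_among` and `logical_may_send` are hypothetical names chosen for illustration. The idea shown is only that a logical sender holds back until the oldest flush position among the listed slots has reached the position it wants to send.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t Lsn;               /* stand-in for XLogRecPtr */
#define INVALID_LSN ((Lsn) 0)

typedef struct
{
    const char *name;
    bool        in_use;
    Lsn         flush_lsn;          /* hypothetical field: standby's flushed position */
} Slot;

/* Demo slot table (invented values). */
const Slot demo_slots[] = {
    {"standby1", true, 100},
    {"standby2", true, 80},
    {"old", false, 500},
};

/*
 * Oldest flush position among the comma-separated slot names.
 * Returns INVALID_LSN if any name is unknown or has no valid position.
 */
Lsn
oldest_flush_among(const Slot *slots, int nslots, const char *names)
{
    Lsn         oldest = INVALID_LSN;
    const char *p = names;

    while (*p != '\0')
    {
        size_t  len = strcspn(p, ",");
        bool    found = false;

        for (int i = 0; i < nslots; i++)
        {
            if (slots[i].in_use &&
                strlen(slots[i].name) == len &&
                strncmp(slots[i].name, p, len) == 0)
            {
                if (slots[i].flush_lsn == INVALID_LSN)
                    return INVALID_LSN;
                if (oldest == INVALID_LSN || slots[i].flush_lsn < oldest)
                    oldest = slots[i].flush_lsn;
                found = true;
                break;
            }
        }
        if (!found)
            return INVALID_LSN;     /* unknown entry: caller keeps waiting */

        p += len;
        if (*p == ',')
            p++;
    }
    return oldest;
}

/* True if a logical sender may send up to wait_for without passing any listed standby. */
bool
logical_may_send(const Slot *slots, int nslots, const char *names, Lsn wait_for)
{
    Lsn oldest;

    if (names[0] == '\0')
        return true;                /* empty setting: no ordering requested */

    oldest = oldest_flush_among(slots, nslots, names);
    return oldest != INVALID_LSN && oldest >= wait_for;
}
```

With the demo table, `logical_may_send(demo_slots, 3, "standby1,standby2", 80)` is true while the same call with 81 is false. A misspelled entry such as `"standby1,typo"` never becomes true, which is the stuck-backend case described earlier in this message.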