RE: Newly created replication slot may be invalidated by checkpoint
| От | Vitaly Davydov |
|---|---|
| Тема | RE: Newly created replication slot may be invalidated by checkpoint |
| Дата | |
| Msg-id | 324b22-691ed080-3-51e9c380@77308141 обсуждение исходный текст |
| Ответ на | RE: Newly created replication slot may be invalidated by checkpoint ("Zhijie Hou (Fujitsu)" <houzj.fnst@fujitsu.com>) |
| Ответы |
RE: Newly created replication slot may be invalidated by checkpoint
|
| Список | pgsql-hackers |
Hi Zhijie Hou I'm not sure, my previous email was successfully sent. I do not see it in the pgsql-hackers mailing list. Re-sending it again, sorry for that. Thank you for preparing the patch! The new lock schema (Allocation, Control) seems to be working, without any deadlocks. It is guaranteed by the lock orders - (1) Allocation lock is followed by the Control lock, or (2) the Control lock without the Allocation lock. I've attached the doc [1] where I tried to describe the lock schema to analyze possible deadlocks. I haven't found any evident problems. Concerning reserve_wal_for_local_slot. It seems it is used in synchronization of failover logical slots. For me, it is tricky to change restart_lsn of a synced logical slot to RedoRecPtr, because it may lead to problems with logical replication using such slot after the replica promotion. But it seems it is the architectural problem and it is not related to the problems, solved by the patch. The change of lock mode to EXCLUSIVE in ReplicationSlotsComputeRequiredLSN may affect the performance when a lot of slots are advanced during some small period of time. It may affect the walsender performance. It advances the logical or physical slots when receive a confirmation from the replica. I guess, the slot advancement may be pretty frequent operation. There is an idea to keep the SHARED lock but put XLogSetReplicationSlotMinimumLSN under this lock and implement some guarantees that we do not advance slots to the past, taking into account that the slot can be advanced in past if it doesn't cross wal segment boundaries (it happens). In this case, if a concurrent process advances an existing slot, its old restart_lsn will protect the wal. In case of wal reservation we may use EXCLUSIVE lock in XLogSetReplicationSlotMinimumLSN. Furthremore, I believe, ReplicationSlotsComputeRequiredLSN is required for checkpointer to calculate the oldest wal segment but it is not required to be called every time when a slot is advanced. It may affect the reports - GetWALAvailability function, but I think it is not a big problem to deal with it. Some typos in the patch commit message: 1) A typo: yet updated estart_lsn, while the 2) If a backend advances a slot's restart_lsn reaches XLogSetReplicationSlotMinimumLSN... - may be to put 'and' before reaches? [1] lockschema.md With best regards, Vitaly
Вложения
В списке pgsql-hackers по дате отправления: