On Tue, Jan 23, 2024, at 10:29 PM, Euler Taveira wrote:
I'll post a new one soon.
I'm attaching another patch that fixes some of the issues pointed out by
Hayato, Shlok, and Junwang.
* publication doesn't exist. The analysis [1] was done by Hayato but I didn't
use the proposed patch. Instead I refactored the code a bit [2] and call it
from a new function (setup_publisher) that is called before the promotion.
* fix wrong path name in the initial comment [3]
* change terminology: logical replica -> physical replica [3]
* primary / standby is ready for logical replication? setup_publisher() and
setup_subscriber() check if required GUCs are set accordingly. For primary,
it checks wal_level = logical, max_replication_slots has remain replication
slots for the proposed setup and also max_wal_senders available. For standby,
it checks max_replication_slots for replication origin and also remain number
of background workers to start the subscriber.
* retain option: I extracted this one from Hayato's patch [4]
* target server must be a standby. It seems we agree that this restriction
simplifies the code a bit but can be relaxed in the future (if/when base
backup support is added.)
* recovery timeout option: I decided to include it but I think the use case is
too narrow. It helps in broken setups. However, it can be an issue in some
scenarios like time-delayed replica, large replication lag, slow hardware,
slow network. I didn't use the proposed patch [5]. Instead, I came up with a
simple one that defaults to forever. The proposed one defaults to 60 seconds
but I'm afraid that due to one of the scenarios I said in a previous
sentence, we cancel a legitimate case. Maybe we should add a message during
dry run saying that due to a replication lag, it will take longer to run.
* refactor primary_slot_name code. With the new setup_publisher and
setup_subscriber functions, I splitted the function that detects the
primary_slot_name use into 2 pieces just to avoid extra connections to have
the job done.
* remove fallback_application_name as suggested by Hayato [5] because logical
replication already includes one.
I'm still thinking about replacing --subscriber-conninfo with separate items
(username, port, password?, host = socket dir). Maybe it is an overengineering.
The user can always prepare the environment to avoid unwanted and/or external
connections.
I didn't change the name from pg_subscriber to pg_createsubscriber yet but if I
didn't hear objections about it, I'll do it in the next patch.