Обсуждение: pg_createsubscriber: drop pre-existing subscriptions from the converted node
pg_createsubscriber: drop pre-existing subscriptions from the converted node
От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Hackers, This is a follow-up thread for pg_createsubscriber [1]. I started a new thread since there is no activity around here. ## Problem Assuming that there is a cascading replication like below: node A --(logical replication)--> node B --(streaming replication)--> node C In this case, subscriptions exist even on node C, but it does not try to connect to node A because the logical replication launcher/worker won't be launched. After the conversion, node C becomes a subscriber for node B, and the subscription toward node A remains. Therefore, another worker that tries to connect to node A will be launched, raising an ERROR [2]. This failure may occur even during the conversion. ## Solution The easiest solution is to drop pre-existing subscriptions from the converted node. To avoid establishing connections during the conversion, slot_name is set to NONE on the primary first, then drop on the standby. The setting will be restored on the primary node. Attached patch implements the idea. Test script is also included, but not sure it should be on the HEAD BTW, I found that LogicalRepInfo.oid won't be used. If needed, I can create another patch to remove the attribute. How do you think? [1]: https://www.postgresql.org/message-id/CAA4eK1J22UEfrqx222h5j9DQ7nxGrTbAa_BC%2B%3DmQXdXs-RCsew%40mail.gmail.com [2]: https://www.postgresql.org/message-id/CANhcyEWvimA1-f6hSrA%3D9qkfR5SonFb56b36M%2B%2BvT%3DLiFj%3D76g%40mail.gmail.com Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/
Вложения
On Fri, 21 Jun 2024 at 16:51, Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > Dear Hackers, > > This is a follow-up thread for pg_createsubscriber [1]. I started a new thread > since there is no activity around here. > > ## Problem > > Assuming that there is a cascading replication like below: > > node A --(logical replication)--> node B --(streaming replication)--> node C > > In this case, subscriptions exist even on node C, but it does not try to connect > to node A because the logical replication launcher/worker won't be launched. > After the conversion, node C becomes a subscriber for node B, and the subscription > toward node A remains. Therefore, another worker that tries to connect to node A > will be launched, raising an ERROR [2]. This failure may occur even during the > conversion. > > ## Solution > > The easiest solution is to drop pre-existing subscriptions from the converted node. > To avoid establishing connections during the conversion, slot_name is set to NONE > on the primary first, then drop on the standby. The setting will be restored on the > primary node. > Attached patch implements the idea. Test script is also included, but not sure it should > be on the HEAD Few comments: 1) Should we do this only for the enabled subscription, otherwise the disabled subscriptions will be enabled after running pg_createsubscriber: +obtain_and_disable_pre_existing_subscriptions(struct LogicalRepInfo *dbinfo) +{ + PQExpBuffer query = createPQExpBuffer(); + + for (int i = 0; i < num_dbs; i++) + { + PGconn *conn; + PGresult *res; + int ntups; + + /* Connect to publisher */ + conn = connect_database(dbinfo[i].pubconninfo, true); + + appendPQExpBuffer(query, + "SELECT s.subname, s.subslotname FROM pg_catalog.pg_subscription s " + "INNER JOIN pg_catalog.pg_database d ON (s.subdbid = d.oid) " + "WHERE d.datname = '%s'", + dbinfo[i].dbname); + 2) disconnect_database not called here, should the connection be disconnected: +drop_pre_existing_subscriptions(struct LogicalRepInfo *dbinfo) +{ + PQExpBuffer query = createPQExpBuffer(); + + for (int i = 0; i < num_dbs; i++) + { + PGconn *conn; + struct LogicalRepInfo info = dbinfo[i]; + + /* Connect to subscriber */ + conn = connect_database(info.subconninfo, false); + + for (int j = 0; j < info.num_subscriptions; j++) + { + appendPQExpBuffer(query, + "DROP SUBSCRIPTION %s;", info.pre_subnames[j]); + PQexec(conn, query->data); + resetPQExpBuffer(query); + } + } 3) Similarly here too: +static void +enable_subscirptions_on_publisher(struct LogicalRepInfo *dbinfo) +{ + PQExpBuffer query = createPQExpBuffer(); + + for (int i = 0; i < num_dbs; i++) + { + PGconn *conn; + struct LogicalRepInfo info = dbinfo[i]; + + /* Connect to publisher */ + conn = connect_database(info.pubconninfo, false); 4) them should be then here: + /* ...and them enable the subscription */ + appendPQExpBuffer(query, + "ALTER SUBSCRIPTION %s ENABLE", + info.pre_subnames[j]); + PQclear(PQexec(conn, query->data)); + resetPQExpBuffer(query); > BTW, I found that LogicalRepInfo.oid won't be used. If needed, I can create > another patch to remove the attribute. I was able to compile without this, I think this can be removed. Regards, Vignesh
Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node
От
Amit Kapila
Дата:
On Fri, Jun 21, 2024 at 4:51 PM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > This is a follow-up thread for pg_createsubscriber [1]. I started a new thread > since there is no activity around here. > > ## Problem > > Assuming that there is a cascading replication like below: > > node A --(logical replication)--> node B --(streaming replication)--> node C > > In this case, subscriptions exist even on node C, but it does not try to connect > to node A because the logical replication launcher/worker won't be launched. > After the conversion, node C becomes a subscriber for node B, and the subscription > toward node A remains. Therefore, another worker that tries to connect to node A > will be launched, raising an ERROR [2]. This failure may occur even during the > conversion. > > ## Solution > > The easiest solution is to drop pre-existing subscriptions from the converted node. > To avoid establishing connections during the conversion, slot_name is set to NONE > on the primary first, then drop on the standby. The setting will be restored on the > primary node. > It seems disabling subscriptions on the primary can make the primary stop functioning for some duration of time. I feel we need some solution where after converting to subscriber, we disable and drop pre-existing subscriptions. One idea could be that we use the list of new subscriptions created by the tool such that any subscription not existing in that list will be dropped. Shouldn't this be an open item for PG17? -- With Regards, Amit Kapila.
RE: pg_createsubscriber: drop pre-existing subscriptions from the converted node
От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Amit, Vingesh, Thanks for giving comments! > It seems disabling subscriptions on the primary can make the primary > stop functioning for some duration of time. I feel we need some > solution where after converting to subscriber, we disable and drop > pre-existing subscriptions. One idea could be that we use the list of > new subscriptions created by the tool such that any subscription not > existing in that list will be dropped. Previously I avoided coding like yours, because there is a room that converted node can connect to another publisher. But per off-list discussion, we can skip it by setting max_logical_replication_workers = 0. I refactored with the approach. Note that the GUC is checked at verification phase, so an attribute is added to start_standby_server() to select the workload. Most of comments by Vignesh were invalidated due to the code change, but I hoped I checked your comments were not reproduced. Also, 0001 was created to remove an unused attribute. > Shouldn't this be an open item for PG17? Added this thread to wikipage. Best Regards, Hayato Kuroda FUJITSU LIMITED https://www.fujitsu.com/
Вложения
Re: pg_createsubscriber: drop pre-existing subscriptions from the converted node
От
Amit Kapila
Дата:
On Thu, Jun 27, 2024 at 11:47 AM Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com> wrote: > > > It seems disabling subscriptions on the primary can make the primary > > stop functioning for some duration of time. I feel we need some > > solution where after converting to subscriber, we disable and drop > > pre-existing subscriptions. One idea could be that we use the list of > > new subscriptions created by the tool such that any subscription not > > existing in that list will be dropped. > > Previously I avoided coding like yours, because there is a room that converted > node can connect to another publisher. But per off-list discussion, we can skip > it by setting max_logical_replication_workers = 0. I refactored with the approach. > Note that the GUC is checked at verification phase, so an attribute is added to > start_standby_server() to select the workload. > Thanks, this is a better approach. I have changed a few comments and made some other cosmetic changes. See attached. Euler, Peter E., and others, do you have any comments/suggestions? BTW, why have you created a separate test file for this test? I think we should add a new test to one of the existing tests in 040_pg_createsubscriber. You can create a dummy subscription on node_p and do a test similar to what we are doing in "# Create failover slot to test its removal". -- With Regards, Amit Kapila.