[HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION
Дата
Msg-id 20170201.173623.66249355.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответы Re: [HACKERS] Cannot shutdown subscriber after DROP SUBSCRIPTION  (Fujii Masao <masao.fujii@gmail.com>)
Список pgsql-hackers
Hello, while looking another bug, I found that standby cannot
shutdown after DROP SUBSCRIPTION.

standby=# CREATE SUBSCRPTION sub1 ...
standby=# ....
standby=# DROP SUBSCRIPTION sub1;

Ctrl-C to the standby fails to work. ApplyLauncherMain is waiting
LogicalRepLauncherLock forever.

The culprit is DropSbuscription. It acquires
LogicalRepLauncherLock but never releases.

The attached patch fixes it. Most part of the fucntion is now
enclosed by PG_TRY-CATCH since some functions can throw
exceptions.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/commands/subscriptioncmds.c b/src/backend/commands/subscriptioncmds.c
index 5de9999..f07143e 100644
--- a/src/backend/commands/subscriptioncmds.c
+++ b/src/backend/commands/subscriptioncmds.c
@@ -511,51 +511,67 @@ DropSubscription(DropSubscriptionStmt *stmt)    /* Protect against launcher restarting the
worker.*/    LWLockAcquire(LogicalRepLauncherLock, LW_EXCLUSIVE);
 
-    /* Kill the apply worker so that the slot becomes accessible. */
-    logicalrep_worker_stop(subid);
-
-    /* Remove the origin tracking if exists. */
-    snprintf(originname, sizeof(originname), "pg_%u", subid);
-    originid = replorigin_by_name(originname, true);
-    if (originid != InvalidRepOriginId)
-        replorigin_drop(originid);
-
-    /* If the user asked to not drop the slot, we are done mow.*/
-    if (!stmt->drop_slot)
-    {
-        heap_close(rel, NoLock);
-        return;
-    }
-    /*
-     * Otherwise drop the replication slot at the publisher node using
-     * the replication connection.
+     * replorigin_drop can throw an exception. we must release
+     * LogicalRepLauncherLock for the case.     */
-    load_file("libpqwalreceiver", false);
+    PG_TRY();
+    {
+        /* Kill the apply worker so that the slot becomes accessible. */
+        logicalrep_worker_stop(subid);
-    initStringInfo(&cmd);
-    appendStringInfo(&cmd, "DROP_REPLICATION_SLOT \"%s\"", slotname);
+        /* Remove the origin tracking if exists. */
+        snprintf(originname, sizeof(originname), "pg_%u", subid);
+        originid = replorigin_by_name(originname, true);
+        if (originid != InvalidRepOriginId)
+            replorigin_drop(originid);
-    wrconn = walrcv_connect(conninfo, true, subname, &err);
-    if (wrconn == NULL)
-        ereport(ERROR,
-                (errmsg("could not connect to publisher when attempting to "
-                        "drop the replication slot \"%s\"", slotname),
-                 errdetail("The error was: %s", err)));
+        /* If the user asked to not drop the slot, we are done now.*/
+        if (stmt->drop_slot)
+        {
+            /*
+             * Otherwise drop the replication slot at the publisher node using
+             * the replication connection.
+             */
+            load_file("libpqwalreceiver", false);
+
+            initStringInfo(&cmd);
+            appendStringInfo(&cmd, "DROP_REPLICATION_SLOT \"%s\"", slotname);
+
+            wrconn = walrcv_connect(conninfo, true, subname, &err);
+            if (wrconn == NULL || !walrcv_command(wrconn, cmd.data, &err))
+            {
+                if (wrconn == NULL)
+                    ereport(ERROR,
+                            (errmsg("could not connect to publisher when "
+                                    "attempting to drop the "
+                                    "replication slot \"%s\"", slotname),
+                             errdetail("The error was: %s", err)));
-    if (!walrcv_command(wrconn, cmd.data, &err))
-        ereport(ERROR,
-                (errmsg("could not drop the replication slot \"%s\" on publisher",
-                        slotname),
-                 errdetail("The error was: %s", err)));
-    else
-        ereport(NOTICE,
-                (errmsg("dropped replication slot \"%s\" on publisher",
-                        slotname)));
+                ereport(ERROR,
+                        (errmsg("could not drop the replication slot \"%s\" on publisher",
+                                slotname),
+                         errdetail("The error was: %s", err)));
+            }
-    walrcv_disconnect(wrconn);
+            ereport(NOTICE,
+                    (errmsg("dropped replication slot \"%s\" on publisher",
+                            slotname)));
-    pfree(cmd.data);
+            pfree(cmd.data);
+        }
+    }
+    PG_CATCH();
+    {
+        LWLockRelease(LogicalRepLauncherLock);
+        PG_RE_THROW();
+    }
+    PG_END_TRY();
+
+    LWLockRelease(LogicalRepLauncherLock);
+
+    if (wrconn)
+        walrcv_disconnect(wrconn);    heap_close(rel, NoLock);}

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Nikhil Sontakke
Дата:
Сообщение: Re: [HACKERS] Speedup twophase transactions
Следующее
От: Kyotaro HORIGUCHI
Дата:
Сообщение: Re: [HACKERS] IF (NOT) EXISTS in psql-completion