Re: Making WAL receiver startup rely on GUC context forprimary_conninfo and primary_slot_name

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Making WAL receiver startup rely on GUC context forprimary_conninfo and primary_slot_name
Дата
Msg-id 20190111231007.GA24889@paquier.xyz
обсуждение исходный текст
Ответ на Re: Making WAL receiver startup rely on GUC context forprimary_conninfo and primary_slot_name  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: Making WAL receiver startup rely on GUC context forprimary_conninfo and primary_slot_name  (Michael Paquier <michael@paquier.xyz>)
Список pgsql-hackers
On Fri, Jan 11, 2019 at 12:52:08PM -0500, Robert Haas wrote:
> With the patch, the PrimaryConnInfo and PrimarySlotName arguments are
> removed from RequestXLogStreaming.  That means that the new
> walreceiver could come to a different conclusion about the values of
> those arguments than the startup process.  In particular, it could end
> up thinking that primary_conninfo is empty when, if the startup
> process had thought that, the walreceiver would never have been
> launched in the first place.  But it's not obvious that you've added
> any logic in WALReceiverMain or elsewhere to compensate for this
> possibility -- what would happen in that case?  Would we crash?
> Connect to the wrong server?

If I contemplate the patch this morning there is this bit:
@@ -291,32 +295,40 @@ WalReceiverMain(void)
    /* Unblock signals (they were blocked when the postmaster forked
    us) */
        PG_SETMASK(&UnBlockSig);

+   /*
+    * Fail immediately if primary_conninfo goes missing, better safe than
+    * sorry.
+    */
+   if (PrimaryConnInfo == NULL || strcmp(PrimaryConnInfo, "") == 0)
+       ereport(ERROR,
+               (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+                errmsg("cannot connect to the primary server as \"primary_conninfo\" is not defined")));

So the answer to your question is: the WAL receiver fails to start.

> I might be wrong here, but I'm inclined to think that this scenario
> hasn't really been contemplated carefully by the patch authors.  There
> are no added TAP tests for the scenario where the values differ
> between the two processes, no code comments which explain why it's OK
> if that happens, really no mention of it in the patch at all.  And on
> that basis I'm inclined to think that Andres is really quite correct
> to be worried about this.  The problem he's talking about here is very
> low-probability because the race condition is narrow, but it's real,
> and it surely needs to be handled somehow.

primary_conninfo and primary_slot_name are PGC_POSTMASTER now, so
adding tests now don't really make sense.
--
Michael

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andrew Gierth
Дата:
Сообщение: declaration-after-statement (was Re: Ryu floating point output patch)
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Three animals fail test-decoding-check on REL_10_STABLE