RE: Libpq support to connect to standby server as priority

Поиск
Список
Период
Сортировка
От Smith, Peter
Тема RE: Libpq support to connect to standby server as priority
Дата
Msg-id f6daf26354054924998670cf03770352@G06AUEXCASMHB04.g06.fujitsu.local
обсуждение исходный текст
Ответ на Re: Libpq support to connect to standby server as priority  (Greg Nancarrow <gregn4422@gmail.com>)
Список pgsql-hackers
Hi Greg,

I have spent some time reading this discussion thread, and doing a code review of the latest (v17-0001) patch.

Below are my review comments; some are trivial, others not so much.

====

GENERAL COMMENT 1 ("any")

"any" should be included as valid option for target_server_type.

IIUC target_server_type was added mostly to have better alignment with JDBC options.
Both Vladimir [1] and Dave [2] already said that JDBC does have an "any" option.
[1] - https://www.postgresql.org/message-id/CAB%3DJe-FwOVE%3D8gR1UDDZRnWZR65fRG40e8zW_U_6mnUqbce68g%40mail.gmail.com
[2] - https://www.postgresql.org/message-id/CADK3HHJ9316ji7L-97cJBY%3Dwp4E3ddPMn8XdkNz6j8d9u0OhmQ%40mail.gmail.com

Furthermore, the fe-connect.c function makeEmptyPGConn sets default: 
+    conn->requested_server_type = SERVER_TYPE_ANY;
This means the default type of target_server_type is "any".
Since this is default, it should also be possible to assign the same value to explicitly.

(Parts of the v17 patch affected by this are itemised below)

====

GENERAL COMMENT 2 (Removal of pg_is_in_recovery)

Around 22/3/2019 Hari added a lot of pg_is_in_recovery code in his patch 0006 [1]
[1] - https://www.postgresql.org/message-id/CAJrrPGd4YeA%2BN%3DxC%2B1XPVoGzMCATJZY4irVQEJ6i0aPqorUi7g%40mail.gmail.com

Much later IIUC the latest v17 patch has taken onboard the recommendation from Alvaro and removed all that code:
"I would discard the whole thing about checking "SELECT pg_is_in_recovery()"" [2]
[2] - https://www.postgresql.org/message-id/20191227130828.GA21647%40alvherre.pgsql

However, it seems that not ALL parts of the original code got cleanly removed in v17.
There are a number of references to CONNECTION_CHECK_RECOVERY and pg_is_in_recovery still lurking.

(Parts of the v17 patch affected by this are itemised below)

====

COMMENT libpq.sgml (para blocks)

+       <para>

The v17 patch for target_session_attrs and target_server_type help is currently using <para> blocks for each of the
possibleoption values. 
 
This format is inconsistent document style with other variables in this SGML. 
Other places are using sub-lists for option values. e.g. look at "six modes" of sslmode.

====

COMMENT libpq.sgml (cut/paste parameter description)

I don't think that target_server_type help should be just a cut/paste duplicate of  target_session_attrs. It is
confusingbecause it leaves the reader doubting the purpose of having such a duplication.
 

Suggest to simplify the target_server_type help like as follows:
--
target_server_type
The purpose of this parameter is to reflect the similar PGJDBC targetServerType.
The supported options are same as target_session_attrs.
This parameter overrides any connection type specified by target_session_attrs.
--

====

COMMENT libpq.sgml (pg_is_in_recovery)

(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)

+       <para>
+        If this parameter is set to <literal>standby</literal>, only a connection in which
+        the server is in recovery mode is considered acceptable. If the server is prior to version 14,
+        the query <literal>SELECT pg_is_in_recovery()</literal> will be sent upon any successful
+        connection; if it returns <literal>t</literal>, it means the server is in recovery mode.
+       </para>

Suggest change to:
--
If this parameter is set to <literal>standby</literal>, only a connection in which the server is in recovery mode is
consideredacceptable. The recovery mode state is determined by the value of the in_recovery configuration parameter
thatis reported by the server upon successful connection. Otherwise, if the server is prior to version 14, only a
connectionin which read-write transactions are not accepted by default is considered acceptable. To determine whether
theserver supports read-write transactions, the query SHOW transaction_read_only will be sent upon any successful
connection;if it returns on, it means the server doesn't support read-write transactions.
 
--

====

COMMENT libpq.sgml (Oxford comma)

+       <varname>integer_datetimes</varname>,
+       <varname>standard_conforming_strings</varname> and
+       <varname>in_recovery</varname>.

Previously there was an Oxford comma (e.g. before the "and"). Now there isn't.
The v17 patch should not alter the previous listing style.

====

COMMENT protocol.sgml (Oxford comma)

+    <varname>integer_datetimes</varname>,
+    <varname>standard_conforming_strings</varname> and
+    <varname>in_recovery</varname>.

Previously there was an Oxford comma (e.g. before the "and"). Now there isn't.
The v17 patch should not alter the previous listing style.

====

QUESTION standby.c - SendRecoveryExitSignal

+/*
+ * SendRecoveryExitSignal
+ *        Signal backends that the server has exited recovery mode.
+ */
+void
+SendRecoveryExitSignal(void)
+{
+    SendSignalToAllBackends(PROCSIG_RECOVERY_EXIT);
+}

I wonder if this function is really necessary?
IIUC the SendRecoveryExitSignal is only called from one place (xlog.c).
Why not just call SendSignalToAllBackends directly from there and remove this extra layer?

====

COMMENT postgres.c (signal comment)

+    /* signal that work needs to be done */
+    recoveryExitInterruptPending = true;

Suggest change comment to say: 
/* flag that work needs to be done */

====

COMMENT fe-connect.c (sizeof)

-        "Target-Session-Attrs", "", 11, /* sizeof("read-write") = 11 */
+        "Target-Session-Attrs", "", 15, /* sizeof("prefer-standby") = 15 */

According to the SGML "prefer-secondary" is also an acceptable value for target_session_attrs, so the display field
widthshould be 17 /* sizeof("prefer-secondary") */ not 15.
 

====

COMMENT fe-connect.c (CONNECTION_CHECK_RECOVERY)

@@ -2310,6 +2461,7 @@ PQconnectPoll(PGconn *conn)
         case CONNECTION_CHECK_WRITABLE:
         case CONNECTION_CONSUME:
         case CONNECTION_GSS_STARTUP:
+        case CONNECTION_CHECK_RECOVERY:
             break;

(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)

Probably this CONNECTION_CHECK_RECOVERY case should be removed.

====

COMMENT fe-connect.c - function validateAndRecordTargetServerType

As noted in GENERAL COMMENT 1, I suggest "any" needs to be included in this function as a valid option.

====

COMMENT fe-connect.c (target_session_attrs validation)

@@ -1396,8 +1425,9 @@ connectOptions2(PGconn *conn)
      */
     if (conn->target_session_attrs)
     {
-        if (strcmp(conn->target_session_attrs, "any") != 0
-            && strcmp(conn->target_session_attrs, "read-write") != 0)
+        if (strcmp(conn->target_session_attrs, "any") == 0)
+            conn->requested_server_type = SERVER_TYPE_ANY;
+        else if (!validateAndRecordTargetServerType(conn->target_session_attrs, &conn->requested_server_type))

I suggest introducing a 2nd function for target_session_attrs (e.g. validateAndRecordTargetSessionAttrs). 
Even though these parameters are functionally the same today, in future they may not be [1].
[1] - https://www.postgresql.org/message-id/20200109152539.GA29017%40alvherre.pgsql

Regardless, the special "any" handling can be removed from here because (from GENERAL COMMENT 1) the
validateAndRecordTargetServerTypeshould now accept "any".
 

====

COMMENT fe-connect.c (message typo)

Found an existing typo, unrelated to the v17 patch.

"target_settion_attrs", --> "target_session_attrs",

====

COMMENT fe-connect.c (libpq_gettext)

+            printfPQExpBuffer(&conn->errorMessage,
+                              libpq_gettext("invalid target_server_type value: \"%s\"\n"),
+                              conn->target_server_type);

The parameter name "target_server_type" should be separated from the format string as "%s", the same as is done by the
libpq_gettextof the preceding code.
 

====

COMMENT fe-connect.c (indentation)

+                goto error_return;
+            }
         }
+        else
         conn->whichhost++;

Bad indentation of the else's statement.

====

COMMENT fe-connect.c (if/else complexity)

+                    else if ((conn->in_recovery &&
+                          conn->requested_server_type == SERVER_TYPE_PRIMARY) ||
+                         (!conn->in_recovery &&
+                          (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY ||
+                           conn->requested_server_type == SERVER_TYPE_STANDBY)))
+                    {

TBH I was unable to read this code without first drawing up a matrix of combinations to deduce what was going on. 
It should not be so inscrutable.

Suggestion1:
Consider putting a large comment at the top of this CONNECTION_CHECK_TARGET to give the overview what this code is
tryingto acheive. 
 

e.g. something like this:
---
Mode           |in_recovery |version < 7.4      |version < 14               |version >= 14
---------------+------------+-------------------+---------------------------+-------------
ANY            |NA          |OK                 |OK                         |OK
PRIMARY        |true        |OK                 |SHOW transaction_read_only |keep_going
PRIMARY        |false       |OK                 |SHOW transaction_read_only |OK
PREFER_STANDBY |true        |keep_going (or -2)    |SHOW transaction_read_only    |OK
PREFER_STANDBY |false       |keep_going (or -2) |SHOW transaction_read_only    |keep_going (or -2)
STANDBY        |true        |keep_going         |SHOW transaction_read_only    |OK
STANDBY        |false       |keep_going         |SHOW transaction_read_only    |keep_going
---

Suggestion2:
Consider to separate out the requested_server_type cases instead of trying to hand everything in the same else block.
Thecode may be a bit longer, but by aligning it more closely with the SGML documentation it can be made easier to
understand.

e.g. something like this:
---
if (conn->requested_server_type == SERVER_TYPE_PRIMARY) {
    /* If not-in-recovery, reject, else OK. */
    if (conn->in_recovery) {
        rejectCheckedRecoveryConnection(conn);
        goto keep_going;
    }
    goto consume_checked_target_connection;
}
 
if (conn->requested_server_type == SERVER_TYPE_STANDBY) {
    /* Only a connection in recovery mode is acceptable. */
    if (!conn->in_recovery) {
        rejectCheckedRecoveryConnection(conn);
        goto keep_going;
    }
    goto consume_checked_target_connection;
}
 
if (conn->requested_server_type == SERVER_TYPE_PREFER_STANDBY) {
    /* A connection in recovery mode is preferred. */
    if (conn->in_recovery)
        goto consume_checked_target_connection;
 
    /*
     * The following scenario is possible only for the
     * prefer-standby type for the next pass of the list
     * of connections, as it couldn't find any servers that
     * are in recovery.
     */
    if (conn->which_rw_host == -2)
        goto consume_checked_target_connection;
 
    /* reject function below remembers this r/w host index in case it is needed later */
    rejectCheckedRecoveryConnection(conn);
    goto keep_going;
}
---

====

COMMENT fe-connect.c (case CONNECTION_CHECK_RECOVERY)

(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)

v17 patch has removed the previous call to pg_is_in_recovery.
IIUC this means that there is currently no way for the remaining CONNECTION_CHECK_RECOVERY case to even be executed.

If I am correct, then a significant slab of code (~100 lines) can be deleted.
See case CONNECTION_CHECK_RECOVERY (lines ~ 4007 thru 4110)

====

COMMENT fe-connect.c - function  freePGConn (missing free?)

There is code to free(conn->target_session_attrs), but there is no code to free target_server_type. 
Appears to be accidental omission.

====

COMMENT fe-exec.c (altered comment)

-     * Special hacks: remember client_encoding and
+     * Special hacks: remember client_encoding, and

A comma was added. 
Suggest avoid altering comments not directly related to the v17 patch logic.

====

COMMENT libpq-fe.h (CONNECTION_CHECK_RECOVERY)

+    CONNECTION_CHECK_TARGET,    /* Check if we have a proper target connection */
+    CONNECTION_CHECK_RECOVERY    /* Check whether server is in recovery */

(As noted in GENERAL COMMENT 2 there are still residual references to pg_is_in_recovery)

Probably this CONNECTION_CHECK_RECOVERY case should be removed.

====

Kind Regards,
Peter Smith
---
Fujitsu Australia

В списке pgsql-hackers по дате отправления:

Предыдущее
От: James Coleman
Дата:
Сообщение: Re: remove spurious CREATE INDEX CONCURRENTLY wait
Следующее
От: Robert Haas
Дата:
Сообщение: Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks