Обсуждение: libpq async duplicate error results

Поиск
Список
Период
Сортировка

libpq async duplicate error results

От
Peter Eisentraut
Дата:
This issue was discovered by Fabien in the SHOW_ALL_RESULTS thread.  I'm 
posting it here separately, because I think it ought to be addressed in 
libpq rather than with a workaround in psql, as proposed over there.

When using PQsendQuery() + PQgetResult() and the server crashes during 
the execution of the command, PQgetResult() then returns two result sets 
with partially duplicated error messages, like this from the attached 
test program:

command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server


This is hidden in normal use because PQexec() throws away all but the 
last result set, but the extra one is still generated internally.

Apparently, this has changed between PG13 and PG14.  In PG13 and 
earlier, the output is

command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: no connection to the server

In PG13, PQexec() concatenates all the error messages from multiple 
results, so a user of PQexec() sees the same output before and after. 
But for users of the lower-level APIs, things have become a bit more 
confusing.

Also, why are there multiple results being generated in the first place?


[0]: 
https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2112230703530.2668598@pseudo
Вложения

Re: libpq async duplicate error results

От
Fabien COELHO
Дата:
> command = SELECT pg_terminate_backend(pg_backend_pid());
> result 1 status = PGRES_FATAL_ERROR
> error message = "FATAL:  terminating connection due to administrator command
> "
> result 2 status = PGRES_FATAL_ERROR
> error message = "FATAL:  terminating connection due to administrator command
> server closed the connection unexpectedly
>     This probably means the server terminated abnormally
>     before or while processing the request.
> "

> Also, why are there multiple results being generated in the first place?

My interpretation is that the first message is a close message issued by 
the server before actually severing the connection, and the second message 
is generated by libpq when it notices that the connection has been closed, 
so there is some sense in having to results to report these two 
consecutive errors, and the question might be about when the buffer should 
be reset.

-- 
Fabien.



Re: libpq async duplicate error results

От
Peter Eisentraut
Дата:
On 26.01.22 14:52, Fabien COELHO wrote:
> 
>> command = SELECT pg_terminate_backend(pg_backend_pid());
>> result 1 status = PGRES_FATAL_ERROR
>> error message = "FATAL:  terminating connection due to administrator 
>> command
>> "
>> result 2 status = PGRES_FATAL_ERROR
>> error message = "FATAL:  terminating connection due to administrator 
>> command
>> server closed the connection unexpectedly
>>     This probably means the server terminated abnormally
>>     before or while processing the request.
>> "
> 
>> Also, why are there multiple results being generated in the first place?
> 
> My interpretation is that the first message is a close message issued by 
> the server before actually severing the connection, and the second 
> message is generated by libpq when it notices that the connection has 
> been closed, so there is some sense in having to results to report these 
> two consecutive errors, and the question might be about when the buffer 
> should be reset.

I see.  If we stipulate for a moment that getting two results is what we 
want here, I see a conceptual problem.  Users of PQexec() don't care 
whether the error message they are ultimately getting is a concatenation 
of two messages from two internal results (old behavior) or just an 
appropriately accumulated message from the last internal result (new 
behavior).  But users of PQgetResult() don't have a principled way to 
deal with this anymore.  They can't just ignore one or the other result, 
because they could after all be two separate errors from two separate 
commands.  So the only proper behavior is to show all errors from all 
results.  But that results in this misbehavior because the last error 
message must include the messages from the previous messages in order to 
accommodate PQexec() users.




Re: libpq async duplicate error results

От
Peter Eisentraut
Дата:
Tom, you worked on reorganzing the error message handling in libpq in 
PostgreSQL 14 (commit ffa2e4670123124b92f037d335a1e844c3782d3f).  Any 
thoughts on this?


On 25.01.22 09:32, Peter Eisentraut wrote:
> 
> This issue was discovered by Fabien in the SHOW_ALL_RESULTS thread.  I'm 
> posting it here separately, because I think it ought to be addressed in 
> libpq rather than with a workaround in psql, as proposed over there.
> 
> When using PQsendQuery() + PQgetResult() and the server crashes during 
> the execution of the command, PQgetResult() then returns two result sets 
> with partially duplicated error messages, like this from the attached 
> test program:
> 
> command = SELECT 'before';
> result 1 status = PGRES_TUPLES_OK
> error message = ""
> 
> command = SELECT pg_terminate_backend(pg_backend_pid());
> result 1 status = PGRES_FATAL_ERROR
> error message = "FATAL:  terminating connection due to administrator 
> command
> "
> result 2 status = PGRES_FATAL_ERROR
> error message = "FATAL:  terminating connection due to administrator 
> command
> server closed the connection unexpectedly
>      This probably means the server terminated abnormally
>      before or while processing the request.
> "
> 
> command = SELECT 'after';
> PQsendQuery() error: no connection to the server
> 
> 
> This is hidden in normal use because PQexec() throws away all but the 
> last result set, but the extra one is still generated internally.
> 
> Apparently, this has changed between PG13 and PG14.  In PG13 and 
> earlier, the output is
> 
> command = SELECT 'before';
> result 1 status = PGRES_TUPLES_OK
> error message = ""
> 
> command = SELECT pg_terminate_backend(pg_backend_pid());
> result 1 status = PGRES_FATAL_ERROR
> error message = "FATAL:  terminating connection due to administrator 
> command
> "
> result 2 status = PGRES_FATAL_ERROR
> error message = "server closed the connection unexpectedly
>      This probably means the server terminated abnormally
>      before or while processing the request.
> "
> 
> command = SELECT 'after';
> PQsendQuery() error: no connection to the server
> 
> In PG13, PQexec() concatenates all the error messages from multiple 
> results, so a user of PQexec() sees the same output before and after. 
> But for users of the lower-level APIs, things have become a bit more 
> confusing.
> 
> Also, why are there multiple results being generated in the first place?
> 
> 
> [0]: 
> https://www.postgresql.org/message-id/alpine.DEB.2.22.394.2112230703530.2668598@pseudo




Re: libpq async duplicate error results

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> Tom, you worked on reorganzing the error message handling in libpq in 
> PostgreSQL 14 (commit ffa2e4670123124b92f037d335a1e844c3782d3f).  Any 
> thoughts on this?

Ah, sorry, I'd not noticed this thread.

I concur with Fabien's analysis: we report the FATAL message from
the server during the first PQgetResult, and then the second call
discovers that the connection is gone and reports "server closed
the connection unexpectedly".  Those are two independent events
(in libpq's view anyway) so reporting them separately is correct,
or at least necessary unless you want to engage in seriously
major redesign and behavioral changes.

It is annoying that some of the text is duplicated in the second
report, but in the end that's cosmetic, and I'm not sure we can
improve it without breaking other things.  In particular, we
can't just think about what comes back in the PGgetResult calls,
we also have to consider what PQerrorMessage(conn) will report
afterwards.  So I don't think that resetting conn->errorMessage
during a PQgetResult series would be a good fix.

An idea that I don't have time to pursue right now is to track
how much of conn->errorMessage has been read out by PQgetResult,
and only report newly-added text in later PQgetResult calls of
a series, while PQerrorMessage would continue to report the
whole thing.  Buffer resets would occur where they do now.

It'd probably be necessary to make a user-visible PQgetResult
that becomes a wrapper around PQgetResultInternal, because
internal callers such as PQexecFinish will need the old behavior,
or at least some of them will.

            regards, tom lane



Re: libpq async duplicate error results

От
Fabien COELHO
Дата:
Hello Tom,

> I concur with Fabien's analysis: we report the FATAL message from
> the server during the first PQgetResult, and then the second call
> discovers that the connection is gone and reports "server closed
> the connection unexpectedly".  Those are two independent events
> (in libpq's view anyway) so reporting them separately is correct,
> or at least necessary unless you want to engage in seriously
> major redesign and behavioral changes.
>
> It is annoying that some of the text is duplicated in the second
> report, but in the end that's cosmetic, and I'm not sure we can
> improve it without breaking other things.  In particular, we
> can't just think about what comes back in the PGgetResult calls,
> we also have to consider what PQerrorMessage(conn) will report
> afterwards.  So I don't think that resetting conn->errorMessage
> during a PQgetResult series would be a good fix.
>
> An idea that I don't have time to pursue right now is to track
> how much of conn->errorMessage has been read out by PQgetResult,
> and only report newly-added text in later PQgetResult calls of
> a series, while PQerrorMessage would continue to report the
> whole thing.  Buffer resets would occur where they do now.
>
> It'd probably be necessary to make a user-visible PQgetResult
> that becomes a wrapper around PQgetResultInternal, because
> internal callers such as PQexecFinish will need the old behavior,
> or at least some of them will.

I agree that the message reset is not convincing, but it was the only way 
to get the expected behavior without changing the existing functions.

I see two paths:

(1) keep the duplicate message in this corner case.

(2) do the hocus pocus you suggest around PQerrorMessage and PQgetResult 
to work around the duplication, just for this corner case.

I'd tend to choose (1) to keep it simple, even if (2) is feasible.

-- 
Fabien.



Re: libpq async duplicate error results

От
Tom Lane
Дата:
[ cc'ing Alvaro for pipeline questions ]

Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> It is annoying that some of the text is duplicated in the second
>> report, but in the end that's cosmetic, and I'm not sure we can
>> improve it without breaking other things.  In particular, we
>> can't just think about what comes back in the PGgetResult calls,
>> we also have to consider what PQerrorMessage(conn) will report
>> afterwards.  So I don't think that resetting conn->errorMessage
>> during a PQgetResult series would be a good fix.
>> 
>> An idea that I don't have time to pursue right now is to track
>> how much of conn->errorMessage has been read out by PQgetResult,
>> and only report newly-added text in later PQgetResult calls of
>> a series, while PQerrorMessage would continue to report the
>> whole thing.  Buffer resets would occur where they do now.

> I agree that the message reset is not convincing, but it was the only way 
> to get the expected behavior without changing the existing functions.

I spent a bit more time poking at this issue.  I no longer like the
idea of reporting only the tail part of conn->errorMessage; I'm
afraid that would convert "edge cases sometimes return redundant text"
into "edge cases sometimes miss returning text at all", which is not
an improvement.

In any case, the particular example we're looking at here is connection
loss, which is not something that should greatly concern us as far as
pipeline semantics are concerned, because you're not getting any more
pipelined results anyway if that happens.  In the non-I/O-error case,
I see that PQgetResult does a hacky "resetPQExpBuffer(&conn->errorMessage)"
in between pipelined queries.  It seems like if there are real usability
issues in this area, they probably come from that not being well placed.
In particular, I wonder why that's done with the pqPipelineProcessQueue
call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
result).  It kinda looks to me like maybe pqPipelineProcessQueue
ought to have the responsibility for doing that, rather than having
two out of the three call sites do it.  Also it would seem more natural
to associate it with the reinitialization that happens inside
pqPipelineProcessQueue.

            regards, tom lane



Re: libpq async duplicate error results

От
Alvaro Herrera
Дата:
On 2022-Feb-07, Tom Lane wrote:

> In any case, the particular example we're looking at here is connection
> loss, which is not something that should greatly concern us as far as
> pipeline semantics are concerned, because you're not getting any more
> pipelined results anyway if that happens.  In the non-I/O-error case,
> I see that PQgetResult does a hacky "resetPQExpBuffer(&conn->errorMessage)"
> in between pipelined queries.  It seems like if there are real usability
> issues in this area, they probably come from that not being well placed.
> In particular, I wonder why that's done with the pqPipelineProcessQueue
> call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
> call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
> result).  It kinda looks to me like maybe pqPipelineProcessQueue
> ought to have the responsibility for doing that, rather than having
> two out of the three call sites do it.  Also it would seem more natural
> to associate it with the reinitialization that happens inside
> pqPipelineProcessQueue.

Yeah, I agree that the placement of error message reset in pipelined
libpq is more ad-hoc to the testing I was doing than carefully
principled.  I didn't test changing it, but on a quick look your
proposed change seems reasonable to me.

-- 
Álvaro Herrera           39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Pensar que el espectro que vemos es ilusorio no lo despoja de espanto,
sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)



Re: libpq async duplicate error results

От
Tom Lane
Дата:
I wrote:
>>> An idea that I don't have time to pursue right now is to track
>>> how much of conn->errorMessage has been read out by PQgetResult,
>>> and only report newly-added text in later PQgetResult calls of
>>> a series, while PQerrorMessage would continue to report the
>>> whole thing.  Buffer resets would occur where they do now.

> I spent a bit more time poking at this issue.  I no longer like the
> idea of reporting only the tail part of conn->errorMessage; I'm
> afraid that would convert "edge cases sometimes return redundant text"
> into "edge cases sometimes miss returning text at all", which is not
> an improvement.

I figured out what seems like an adequately bulletproof solution to
this problem.  The thing that was scaring me before is that there are
code paths in which we'll discard a pending conn->result PGresult
(which could be an error) and make a new one; it wasn't entirely
clear that we could track which result represents what text.
The solution to that has two components:

1. Never advance the how-much-text-has-been-reported counter until
we are just about to return an error PGresult to the application.
This reduces the risk of missing or duplicated text to what seems
an acceptable level.  I found that pqPrepareAsyncResult can be used
to handle this logic -- it's only called at outer levels, and it
already has some state-updating behavior, so we'd already have bugs
if it were called improperly.

2. Don't make error PGresults at all until we reach pqPrepareAsyncResult.
This saves some cycles in the paths where we previously replaced one
error with another.  Importantly, it also means that we can now have
a bulletproof solution to the problem of how to return an error when
we get OOM on any attempt to make a PGresult.  The attached patch
includes a statically-declared PGresult that we can return in that
case.

It's not very practical to do #2 exactly, because in the case of
an error received from the server, we need to build a PGresult that
includes the broken-down error fields.  But all internally-generated
errors can be handled in this style.

This seems workable, and you'll notice it fixes the duplicated text
in the test case Andres was worried about.

Note that this patch is on top of the event-proc changes I posted
in [1].  Without that change, it's not very clear how to cope
with PGEVT_RESULTCREATE failure after we've already updated the
error status.

            regards, tom lane

[1] https://www.postgresql.org/message-id/3390587.1645035110%40sss.pgh.pa.us

diff --git a/contrib/test_decoding/expected/slot_creation_error.out
b/contrib/test_decoding/expected/slot_creation_error.out
index 321648c339..043bdae0a2 100644
--- a/contrib/test_decoding/expected/slot_creation_error.out
+++ b/contrib/test_decoding/expected/slot_creation_error.out
@@ -98,7 +98,6 @@ t

 step s2_init: <... completed>
 FATAL:  terminating connection due to administrator command
-FATAL:  terminating connection due to administrator command
 server closed the connection unexpectedly
     This probably means the server terminated abnormally
     before or while processing the request.
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
index f8f4111fef..6fceff561b 100644
--- a/src/interfaces/libpq/fe-auth.c
+++ b/src/interfaces/libpq/fe-auth.c
@@ -1237,7 +1237,7 @@ PQencryptPasswordConn(PGconn *conn, const char *passwd, const char *user,
     if (!conn)
         return NULL;

-    resetPQExpBuffer(&conn->errorMessage);
+    pqClearConnErrorState(conn);

     /* If no algorithm was given, ask the server. */
     if (algorithm == NULL)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9c9416e8ff..2a3d68b4d1 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3685,7 +3685,7 @@ keep_going:                        /* We will come back to here until there is
                  * (and it seems some clients expect it to be empty after a
                  * successful connection).
                  */
-                resetPQExpBuffer(&conn->errorMessage);
+                pqClearConnErrorState(conn);

                 /* We are open for business! */
                 conn->status = CONNECTION_OK;
@@ -4231,7 +4231,7 @@ closePGconn(PGconn *conn)

     /*
      * Close the connection, reset all transient state, flush I/O buffers.
-     * Note that this includes clearing conn->errorMessage; we're no longer
+     * Note that this includes clearing conn's error state; we're no longer
      * interested in any failures associated with the old connection, and we
      * want a clean slate for any new connection attempt.
      */
@@ -4241,7 +4241,7 @@ closePGconn(PGconn *conn)
     conn->xactStatus = PQTRANS_IDLE;
     conn->pipelineStatus = PQ_PIPELINE_OFF;
     pqClearAsyncResult(conn);    /* deallocate result */
-    resetPQExpBuffer(&conn->errorMessage);
+    pqClearConnErrorState(conn);
     release_conn_addrinfo(conn);

     /* Reset all state obtained from server, too */
@@ -5236,7 +5236,7 @@ ldapServiceLookup(const char *purl, PQconninfoOption *options,
  * Returns 0 on success, nonzero on failure.  On failure, if errorMessage
  * isn't null, also store an error message there.  (Note: the only reason
  * this function and related ones don't dump core on errorMessage == NULL
- * is the undocumented fact that printfPQExpBuffer does nothing when passed
+ * is the undocumented fact that appendPQExpBuffer does nothing when passed
  * a null PQExpBuffer pointer.)
  */
 static int
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index c7c48d07dc..ceeb0a4fd9 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -44,6 +44,13 @@ char       *const pgresStatus[] = {
     "PGRES_PIPELINE_ABORTED"
 };

+/* We return this if we're unable to make a PGresult at all */
+static const PGresult OOM_result = {
+    .resultStatus = PGRES_FATAL_ERROR,
+    .client_encoding = PG_SQL_ASCII,
+    .errMsg = "out of memory\n",
+};
+
 /*
  * static state needed by PQescapeString and PQescapeBytea; initialize to
  * values that result in backward-compatible behavior
@@ -141,6 +148,10 @@ static int    pqPipelineFlush(PGconn *conn);
  *     returns a newly allocated, initialized PGresult with given status.
  *     If conn is not NULL and status indicates an error, the conn's
  *     errorMessage is copied.  Also, any PGEvents are copied from the conn.
+ *
+ * Note: the logic to copy the conn's errorMessage is now vestigial;
+ * no internal caller uses it.  However, that behavior is documented for
+ * outside callers, so we'd better keep it.
  */
 PGresult *
 PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
@@ -191,7 +202,8 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
                 /* non-error cases */
                 break;
             default:
-                pqSetResultError(result, &conn->errorMessage);
+                /* we intentionally do not use or modify errorReported here */
+                pqSetResultError(result, &conn->errorMessage, 0);
                 break;
         }

@@ -657,9 +669,12 @@ pqResultStrdup(PGresult *res, const char *str)
 /*
  * pqSetResultError -
  *        assign a new error message to a PGresult
+ *
+ * Copy text from errorMessage buffer beginning at given offset
+ * (it's caller's responsibility that offset is valid)
  */
 void
-pqSetResultError(PGresult *res, PQExpBuffer errorMessage)
+pqSetResultError(PGresult *res, PQExpBuffer errorMessage, int offset)
 {
     char       *msg;

@@ -674,7 +689,7 @@ pqSetResultError(PGresult *res, PQExpBuffer errorMessage)
      * at a constant "out of memory" string.
      */
     if (!PQExpBufferBroken(errorMessage))
-        msg = pqResultStrdup(res, errorMessage->data);
+        msg = pqResultStrdup(res, errorMessage->data + offset);
     else
         msg = NULL;
     if (msg)
@@ -693,9 +708,14 @@ PQclear(PGresult *res)
     PGresult_data *block;
     int            i;

+    /* As a convenience, do nothing for a NULL pointer */
     if (!res)
         return;
+    /* Also, do nothing if the argument is OOM_result */
+    if (res == unconstify(PGresult *, &OOM_result))
+        return;

+    /* Close down any events we may have */
     for (i = 0; i < res->nEvents; i++)
     {
         /* only send DESTROY to successfully-initialized event procs */
@@ -748,24 +768,39 @@ pqClearAsyncResult(PGconn *conn)
     if (conn->result)
         PQclear(conn->result);
     conn->result = NULL;
+    conn->error_result = false;
     if (conn->next_result)
         PQclear(conn->next_result);
     conn->next_result = NULL;
 }

 /*
- * This subroutine deletes any existing async result, sets conn->result
- * to a PGresult with status PGRES_FATAL_ERROR, and stores the current
- * contents of conn->errorMessage into that result.
+ * pqSaveErrorResult -
+ *      remember that we have an error condition
+ *
+ * In much of libpq, reporting an error just requires appending text to
+ * conn->errorMessage and returning a failure code to one's caller.
+ * Where returning a failure code is impractical, instead call this
+ * function to remember that an error needs to be reported.
+ *
+ * (It might seem that appending text to conn->errorMessage should be
+ * sufficient, but we can't rely on that working under out-of-memory
+ * conditions.  The OOM hazard is also why we don't try to make a new
+ * PGresult right here.)
  */
 void
 pqSaveErrorResult(PGconn *conn)
 {
+    /* Drop any pending result ... */
     pqClearAsyncResult(conn);
-    conn->result = PQmakeEmptyPGresult(conn, PGRES_FATAL_ERROR);
+    /* ... and set flag to remember to make an error result later */
+    conn->error_result = true;
 }

 /*
+ * pqSaveWriteError -
+ *      report a write failure
+ *
  * As above, after appending conn->write_err_msg to whatever other error we
  * have.  This is used when we've detected a write failure and have exhausted
  * our chances of reporting something else instead.
@@ -792,24 +827,79 @@ pqSaveWriteError(PGconn *conn)
 }

 /*
- * This subroutine prepares an async result object for return to the caller.
+ * pqPrepareAsyncResult -
+ *      prepare the current async result object for return to the caller
+ *
  * If there is not already an async result object, build an error object
  * using whatever is in conn->errorMessage.  In any case, clear the async
- * result storage.
+ * result storage, and update our notion of how much error text has been
+ * returned to the application.
  */
 PGresult *
 pqPrepareAsyncResult(PGconn *conn)
 {
     PGresult   *res;

-    /*
-     * conn->result is the PGresult to return.  If it is NULL (which probably
-     * shouldn't happen) we assume there is an appropriate error message in
-     * conn->errorMessage.
-     */
     res = conn->result;
-    if (!res)
-        res = PQmakeEmptyPGresult(conn, PGRES_FATAL_ERROR);
+    if (res)
+    {
+        /*
+         * If the pre-existing result is an ERROR (presumably something
+         * received from the server), assume that it represents whatever is in
+         * conn->errorMessage, and advance errorReported.
+         */
+        if (res->resultStatus == PGRES_FATAL_ERROR)
+            conn->errorReported = conn->errorMessage.len;
+    }
+    else
+    {
+        /*
+         * We get here after internal-to-libpq errors.  We should probably
+         * always have error_result = true, but if we don't, gin up some error
+         * text.
+         */
+        if (!conn->error_result)
+            appendPQExpBufferStr(&conn->errorMessage,
+                                 libpq_gettext("no error text available\n"));
+
+        /* Paranoia: be sure errorReported offset is sane */
+        if (conn->errorReported < 0 ||
+            conn->errorReported >= conn->errorMessage.len)
+            conn->errorReported = 0;
+
+        /*
+         * Make a PGresult struct for the error.  We temporarily lie about the
+         * result status, so that PQmakeEmptyPGresult doesn't uselessly copy
+         * all of conn->errorMessage.
+         */
+        res = PQmakeEmptyPGresult(conn, PGRES_EMPTY_QUERY);
+        if (res)
+        {
+            /*
+             * Report whatever new error text we have, and advance
+             * errorReported.
+             */
+            res->resultStatus = PGRES_FATAL_ERROR;
+            pqSetResultError(res, &conn->errorMessage, conn->errorReported);
+            conn->errorReported = conn->errorMessage.len;
+        }
+        else
+        {
+            /*
+             * Ouch, not enough memory for a PGresult.  Fortunately, we have a
+             * card up our sleeve: we can use the static OOM_result.  Casting
+             * away const here is a bit ugly, but it seems best to declare
+             * OOM_result as const, in hopes it will be allocated in read-only
+             * storage.
+             */
+            res = unconstify(PGresult *, &OOM_result);
+
+            /*
+             * Don't advance errorReported.  Perhaps we'll be able to report
+             * the text later.
+             */
+        }
+    }

     /*
      * Replace conn->result with next_result, if any.  In the normal case
@@ -819,6 +909,7 @@ pqPrepareAsyncResult(PGconn *conn)
      */
     conn->result = conn->next_result;
     conn->next_result = NULL;
+    conn->error_result = false; /* next_result is never an error */

     return res;
 }
@@ -1278,7 +1369,7 @@ pqAppendCmdQueueEntry(PGconn *conn, PGcmdQueueEntry *entry)
              */
             if (conn->asyncStatus == PGASYNC_IDLE)
             {
-                resetPQExpBuffer(&conn->errorMessage);
+                pqClearConnErrorState(conn);
                 pqPipelineProcessQueue(conn);
             }
             break;
@@ -1626,10 +1717,10 @@ PQsendQueryStart(PGconn *conn, bool newQuery)
         return false;

     /*
-     * If this is the beginning of a query cycle, reset the error buffer.
+     * If this is the beginning of a query cycle, reset the error state.
      */
     if (newQuery)
-        resetPQExpBuffer(&conn->errorMessage);
+        pqClearConnErrorState(conn);

     /* Don't try to send if we know there's no live connection. */
     if (conn->status != CONNECTION_OK)
@@ -1687,8 +1778,8 @@ PQsendQueryStart(PGconn *conn, bool newQuery)

         /* reset single-row processing mode */
         conn->singleRowMode = false;
-
     }
+
     /* ready to send command message */
     return true;
 }
@@ -1884,7 +1975,7 @@ PQsetSingleRowMode(PGconn *conn)
         (conn->cmd_queue_head->queryclass != PGQUERY_SIMPLE &&
          conn->cmd_queue_head->queryclass != PGQUERY_EXTENDED))
         return 0;
-    if (conn->result)
+    if (conn->result || conn->error_result)
         return 0;

     /* OK, set flag */
@@ -2015,10 +2106,7 @@ PQgetResult(PGconn *conn)
             pqWait(true, false, conn) ||
             pqReadData(conn) < 0)
         {
-            /*
-             * conn->errorMessage has been set by pqWait or pqReadData. We
-             * want to append it to any already-received error message.
-             */
+            /* Report the error saved by pqWait or pqReadData */
             pqSaveErrorResult(conn);
             conn->asyncStatus = PGASYNC_IDLE;
             return pqPrepareAsyncResult(conn);
@@ -2053,7 +2141,7 @@ PQgetResult(PGconn *conn)
                  * is the start of the results of the next query, clear any
                  * prior error message.
                  */
-                resetPQExpBuffer(&conn->errorMessage);
+                pqClearConnErrorState(conn);
                 pqPipelineProcessQueue(conn);
             }
             break;
@@ -2117,7 +2205,9 @@ PQgetResult(PGconn *conn)
             appendPQExpBuffer(&conn->errorMessage,
                               libpq_gettext("unexpected asyncStatus: %d\n"),
                               (int) conn->asyncStatus);
-            res = PQmakeEmptyPGresult(conn, PGRES_FATAL_ERROR);
+            pqSaveErrorResult(conn);
+            conn->asyncStatus = PGASYNC_IDLE;    /* try to restore valid state */
+            res = pqPrepareAsyncResult(conn);
             break;
     }

@@ -2268,9 +2358,9 @@ PQexecStart(PGconn *conn)
     }

     /*
-     * Since this is the beginning of a query cycle, reset the error buffer.
+     * Since this is the beginning of a query cycle, reset the error state.
      */
-    resetPQExpBuffer(&conn->errorMessage);
+    pqClearConnErrorState(conn);

     /*
      * Silently discard any prior query result that application didn't eat.
@@ -2825,9 +2915,9 @@ PQfn(PGconn *conn,
         return NULL;

     /*
-     * Since this is the beginning of a query cycle, reset the error buffer.
+     * Since this is the beginning of a query cycle, reset the error state.
      */
-    resetPQExpBuffer(&conn->errorMessage);
+    pqClearConnErrorState(conn);

     if (conn->pipelineStatus != PQ_PIPELINE_OFF)
     {
@@ -2837,7 +2927,7 @@ PQfn(PGconn *conn,
     }

     if (conn->sock == PGINVALID_SOCKET || conn->asyncStatus != PGASYNC_IDLE ||
-        conn->result != NULL)
+        conn->result || conn->error_result)
     {
         appendPQExpBufferStr(&conn->errorMessage,
                              libpq_gettext("connection in wrong state\n"));
@@ -3707,9 +3797,9 @@ PQsetnonblocking(PGconn *conn, int arg)
      * behavior. this is ok because either they are making a transition _from_
      * or _to_ blocking mode, either way we can block them.
      *
-     * Clear errorMessage in case pqFlush adds to it.
+     * Clear error state in case pqFlush adds to it.
      */
-    resetPQExpBuffer(&conn->errorMessage);
+    pqClearConnErrorState(conn);

     /* if we are going from blocking to non-blocking flush here */
     if (pqFlush(conn))
@@ -3901,7 +3991,7 @@ PQescapeStringConn(PGconn *conn,
         return 0;
     }

-    resetPQExpBuffer(&conn->errorMessage);
+    pqClearConnErrorState(conn);

     return PQescapeStringInternal(conn, to, from, length, error,
                                   conn->client_encoding,
@@ -3939,7 +4029,7 @@ PQescapeInternal(PGconn *conn, const char *str, size_t len, bool as_ident)
     if (!conn)
         return NULL;

-    resetPQExpBuffer(&conn->errorMessage);
+    pqClearConnErrorState(conn);

     /* Scan the string for characters that must be escaped. */
     for (s = str; (s - str) < len && *s != '\0'; ++s)
@@ -4204,7 +4294,7 @@ PQescapeByteaConn(PGconn *conn,
     if (!conn)
         return NULL;

-    resetPQExpBuffer(&conn->errorMessage);
+    pqClearConnErrorState(conn);

     return PQescapeByteaInternal(conn, from, from_length, to_length,
                                  conn->std_strings,
diff --git a/src/interfaces/libpq/fe-lobj.c b/src/interfaces/libpq/fe-lobj.c
index 48399a90cb..075a5ed85b 100644
--- a/src/interfaces/libpq/fe-lobj.c
+++ b/src/interfaces/libpq/fe-lobj.c
@@ -665,8 +665,8 @@ lo_import_internal(PGconn *conn, const char *filename, Oid oid)
     if (conn == NULL)
         return InvalidOid;

-    /* Since this is the beginning of a query cycle, reset the error buffer */
-    resetPQExpBuffer(&conn->errorMessage);
+    /* Since this is the beginning of a query cycle, reset the error state */
+    pqClearConnErrorState(conn);

     /*
      * open the file to be read in
@@ -730,7 +730,8 @@ lo_import_internal(PGconn *conn, const char *filename, Oid oid)
         (void) lo_close(conn, lobj);
         (void) close(fd);
         /* deliberately overwrite any error from lo_close */
-        printfPQExpBuffer(&conn->errorMessage,
+        pqClearConnErrorState(conn);
+        appendPQExpBuffer(&conn->errorMessage,
                           libpq_gettext("could not read from file \"%s\": %s\n"),
                           filename,
                           strerror_r(save_errno, sebuf, sizeof(sebuf)));
@@ -785,7 +786,8 @@ lo_export(PGconn *conn, Oid lobjId, const char *filename)

         (void) lo_close(conn, lobj);
         /* deliberately overwrite any error from lo_close */
-        printfPQExpBuffer(&conn->errorMessage,
+        pqClearConnErrorState(conn);
+        appendPQExpBuffer(&conn->errorMessage,
                           libpq_gettext("could not open file \"%s\": %s\n"),
                           filename,
                           strerror_r(save_errno, sebuf, sizeof(sebuf)));
@@ -806,7 +808,8 @@ lo_export(PGconn *conn, Oid lobjId, const char *filename)
             (void) lo_close(conn, lobj);
             (void) close(fd);
             /* deliberately overwrite any error from lo_close */
-            printfPQExpBuffer(&conn->errorMessage,
+            pqClearConnErrorState(conn);
+            appendPQExpBuffer(&conn->errorMessage,
                               libpq_gettext("could not write to file \"%s\": %s\n"),
                               filename,
                               strerror_r(save_errno, sebuf, sizeof(sebuf)));
@@ -863,8 +866,8 @@ lo_initialize(PGconn *conn)
     if (conn == NULL)
         return -1;

-    /* Since this is the beginning of a query cycle, reset the error buffer */
-    resetPQExpBuffer(&conn->errorMessage);
+    /* Since this is the beginning of a query cycle, reset the error state */
+    pqClearConnErrorState(conn);

     /* Nothing else to do if we already collected info */
     if (conn->lobjfuncs != NULL)
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 26dbeaed97..dc12a8b1f5 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -316,8 +316,9 @@ pqParseInput3(PGconn *conn)
                         return;
                     break;
                 case 'T':        /* Row Description */
-                    if (conn->result != NULL &&
-                        conn->result->resultStatus == PGRES_FATAL_ERROR)
+                    if (conn->error_result ||
+                        (conn->result != NULL &&
+                         conn->result->resultStatus == PGRES_FATAL_ERROR))
                     {
                         /*
                          * We've already choked for some reason.  Just discard
@@ -387,8 +388,9 @@ pqParseInput3(PGconn *conn)
                         if (getAnotherTuple(conn, msgLength))
                             return;
                     }
-                    else if (conn->result != NULL &&
-                             conn->result->resultStatus == PGRES_FATAL_ERROR)
+                    else if (conn->error_result ||
+                             (conn->result != NULL &&
+                              conn->result->resultStatus == PGRES_FATAL_ERROR))
                     {
                         /*
                          * We've already choked for some reason.  Just discard
@@ -966,10 +968,18 @@ pqGetErrorNotice3(PGconn *conn, bool isError)
      */
     if (isError)
     {
-        if (res)
-            pqSetResultError(res, &workBuf);
         pqClearAsyncResult(conn);    /* redundant, but be safe */
-        conn->result = res;
+        if (res)
+        {
+            pqSetResultError(res, &workBuf, 0);
+            conn->result = res;
+        }
+        else
+        {
+            /* Fall back to using the internal-error processing paths */
+            conn->error_result = true;
+        }
+
         if (PQExpBufferDataBroken(workBuf))
             appendPQExpBufferStr(&conn->errorMessage,
                                  libpq_gettext("out of memory\n"));
@@ -2116,10 +2126,33 @@ pqFunctionCall3(PGconn *conn, Oid fnid,
                     continue;
                 /* consume the message and exit */
                 conn->inStart += 5 + msgLength;
-                /* if we saved a result object (probably an error), use it */
-                if (conn->result)
-                    return pqPrepareAsyncResult(conn);
-                return PQmakeEmptyPGresult(conn, status);
+
+                /*
+                 * If we already have a result object (probably an error), use
+                 * that.  Otherwise, if we saw a function result message,
+                 * report COMMAND_OK.  Otherwise, the backend violated the
+                 * protocol, so complain.
+                 */
+                if (!(conn->result || conn->error_result))
+                {
+                    if (status == PGRES_COMMAND_OK)
+                    {
+                        conn->result = PQmakeEmptyPGresult(conn, status);
+                        if (!conn->result)
+                        {
+                            appendPQExpBufferStr(&conn->errorMessage,
+                                                 libpq_gettext("out of memory\n"));
+                            pqSaveErrorResult(conn);
+                        }
+                    }
+                    else
+                    {
+                        appendPQExpBuffer(&conn->errorMessage,
+                                          libpq_gettext("protocol error: no function result\n"));
+                        pqSaveErrorResult(conn);
+                    }
+                }
+                return pqPrepareAsyncResult(conn);
             case 'S':            /* parameter status */
                 if (getParameterStatus(conn))
                     continue;
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 4290553482..e0cee4b142 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -496,8 +496,17 @@ struct pg_conn
     PGdataValue *rowBuf;        /* array for passing values to rowProcessor */
     int            rowBufLen;        /* number of entries allocated in rowBuf */

-    /* Status for asynchronous result construction */
+    /*
+     * Status for asynchronous result construction.  If result isn't NULL, it
+     * is a result being constructed or ready to return.  If result is NULL
+     * and error_result is true, then we need to return a PGRES_FATAL_ERROR
+     * result, but haven't yet constructed it; text for the error has been
+     * appended to conn->errorMessage.  (Delaying construction simplifies
+     * dealing with out-of-memory cases.)  If next_result isn't NULL, it is a
+     * PGresult that will replace "result" after we return that one.
+     */
     PGresult   *result;            /* result being constructed */
+    bool        error_result;    /* do we need to make an ERROR result? */
     PGresult   *next_result;    /* next result (used in single-row mode) */

     /* Assorted state for SASL, SSL, GSS, etc */
@@ -567,8 +576,14 @@ struct pg_conn
      * Buffer for current error message.  This is cleared at the start of any
      * connection attempt or query cycle; after that, all code should append
      * messages to it, never overwrite.
+     *
+     * In some situations we might report an error more than once in a query
+     * cycle.  If so, errorMessage accumulates text from all the errors, and
+     * errorReported tracks how much we've already reported, so that the
+     * individual error PGresult objects don't contain duplicative text.
      */
     PQExpBufferData errorMessage;    /* expansible string */
+    int            errorReported;    /* # bytes of string already reported */

     /* Buffer for receiving various parts of messages */
     PQExpBufferData workBuffer; /* expansible string */
@@ -644,7 +659,7 @@ extern pgthreadlock_t pg_g_threadlock;

 /* === in fe-exec.c === */

-extern void pqSetResultError(PGresult *res, PQExpBuffer errorMessage);
+extern void pqSetResultError(PGresult *res, PQExpBuffer errorMessage, int offset);
 extern void *pqResultAlloc(PGresult *res, size_t nBytes, bool isBinary);
 extern char *pqResultStrdup(PGresult *res, const char *str);
 extern void pqClearAsyncResult(PGconn *conn);
@@ -830,6 +845,13 @@ extern void pqTraceOutputNoTypeByteMessage(PGconn *conn, const char *message);

 /* === miscellaneous macros === */

+/*
+ * Reset the conn's error-reporting state.
+ */
+#define pqClearConnErrorState(conn) \
+    (resetPQExpBuffer(&(conn)->errorMessage), \
+     (conn)->errorReported = 0)
+
 /*
  * this is so that we can check if a connection is non-blocking internally
  * without the overhead of a function call

Re: libpq async duplicate error results

От
Andres Freund
Дата:
Hi,

On 2022-02-16 18:51:37 -0500, Tom Lane wrote:
> This seems workable, and you'll notice it fixes the duplicated text
> in the test case Andres was worried about.

Cool.

I find it mildly scary that we didn't have any other tests verifying the libpq
side of connection termination. Seems like we we maybe should add a few more?
Some simple cases we can do via isolationtester. But some others would
probably require a C test program to be robust...


> +    /* Also, do nothing if the argument is OOM_result */
> +    if (res == unconstify(PGresult *, &OOM_result))
> +        return;

Wouldn't it make more sense to make res const, rather than unconstifying
&OOM_result?


Greetings,

Andres Freund



Re: libpq async duplicate error results

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-02-16 18:51:37 -0500, Tom Lane wrote:
>> +    /* Also, do nothing if the argument is OOM_result */
>> +    if (res == unconstify(PGresult *, &OOM_result))
>> +        return;

> Wouldn't it make more sense to make res const, rather than unconstifying
> &OOM_result?

Uh ... then we'd have to cast away the const to do free().

            regards, tom lane



Re: libpq async duplicate error results

От
Andres Freund
Дата:
Hi,

On 2022-02-16 20:28:02 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2022-02-16 18:51:37 -0500, Tom Lane wrote:
> >> +    /* Also, do nothing if the argument is OOM_result */
> >> +    if (res == unconstify(PGresult *, &OOM_result))
> >> +        return;
>
> > Wouldn't it make more sense to make res const, rather than unconstifying
> > &OOM_result?
>
> Uh ... then we'd have to cast away the const to do free().

I was just thinking of

if ((const PGresult *) res == &OOM_result)

It's not important, I just find it stylistically nicer (making a pointer const
from an non-const pointer is safe, the other way round not generally).

- Andres



Re: libpq async duplicate error results

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2022-02-16 20:28:02 -0500, Tom Lane wrote:
>> Uh ... then we'd have to cast away the const to do free().

> I was just thinking of
> if ((const PGresult *) res == &OOM_result)

Oh, I see.  Sure, can do it like that.

            regards, tom lane



Re: libpq async duplicate error results

От
Tom Lane
Дата:
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2022-Feb-07, Tom Lane wrote:
>> I see that PQgetResult does a hacky "resetPQExpBuffer(&conn->errorMessage)"
>> in between pipelined queries.  It seems like if there are real usability
>> issues in this area, they probably come from that not being well placed.
>> In particular, I wonder why that's done with the pqPipelineProcessQueue
>> call in the PGASYNC_IDLE stanza, yet not with the pqPipelineProcessQueue
>> call in the PGASYNC_READY stanza (where we're returning a PIPELINE_SYNC
>> result).  It kinda looks to me like maybe pqPipelineProcessQueue
>> ought to have the responsibility for doing that, rather than having
>> two out of the three call sites do it.  Also it would seem more natural
>> to associate it with the reinitialization that happens inside
>> pqPipelineProcessQueue.

> Yeah, I agree that the placement of error message reset in pipelined
> libpq is more ad-hoc to the testing I was doing than carefully
> principled.  I didn't test changing it, but on a quick look your
> proposed change seems reasonable to me.

Here's a patch to do that.  It passes check-world (particularly
src/test/modules/libpq_pipeline), for whatever that's worth.

However ... in the wake of 618c16707, I wonder if we should consider
an alternative definition, which is to NOT clear errorMessage when
starting a new pipelined query.  (That would be this patch minus
the addition to pqPipelineProcessQueue.)  Thanks to 618c16707,
the returned error PGresult(s) should bear exactly the same contents
either way.  What would change is that PQerrorMessage(conn) would return
the concatenation of all errors that have occurred during the current
pipeline sequence.  I think that's arguably a more useful definition
than what we have now --- though obviously it'd require a docs change.
It seems to fit with the spirit of the v14 changes to ensure that
PQerrorMessage tells you about everything that happened during a
failed connection attempt, not only the last thing.

I've tested that variant, and it also passes check-world, which may
say something about libpq_pipeline's coverage of error cases ...
I didn't look closely.

            regards, tom lane

diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 45dddaf556..495d2b4c13 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1380,10 +1380,7 @@ pqAppendCmdQueueEntry(PGconn *conn, PGcmdQueueEntry *entry)
              * state, we don't have to do anything.
              */
             if (conn->asyncStatus == PGASYNC_IDLE)
-            {
-                pqClearConnErrorState(conn);
                 pqPipelineProcessQueue(conn);
-            }
             break;
     }
 }
@@ -2149,11 +2146,8 @@ PQgetResult(PGconn *conn)
                 /*
                  * We're about to return the NULL that terminates the round of
                  * results from the current query; prepare to send the results
-                 * of the next query when we're called next.  Also, since this
-                 * is the start of the results of the next query, clear any
-                 * prior error message.
+                 * of the next query when we're called next.
                  */
-                pqClearConnErrorState(conn);
                 pqPipelineProcessQueue(conn);
             }
             break;
@@ -3099,6 +3093,9 @@ pqPipelineProcessQueue(PGconn *conn)
         conn->cmd_queue_head == NULL)
         return;

+    /* Reset the error state */
+    pqClearConnErrorState(conn);
+
     /* Initialize async result-accumulation state */
     pqClearAsyncResult(conn);


Re: libpq async duplicate error results

От
Tom Lane
Дата:
I wrote:
> However ... in the wake of 618c16707, I wonder if we should consider
> an alternative definition, which is to NOT clear errorMessage when
> starting a new pipelined query.  (That would be this patch minus
> the addition to pqPipelineProcessQueue.)  Thanks to 618c16707,
> the returned error PGresult(s) should bear exactly the same contents
> either way.  What would change is that PQerrorMessage(conn) would return
> the concatenation of all errors that have occurred during the current
> pipeline sequence.  I think that's arguably a more useful definition
> than what we have now --- though obviously it'd require a docs change.
> It seems to fit with the spirit of the v14 changes to ensure that
> PQerrorMessage tells you about everything that happened during a
> failed connection attempt, not only the last thing.

After studying the pipeline mode some more, I no longer think that's
a great idea.  ISTM we want pipeline mode to act as nearly as possible
the same as issuing the same series of queries non-pipelined.   But
this redefinition would make it different.  So, barring objection,
I'll push that cleanup patch as-posted.

            regards, tom lane



Re: libpq async duplicate error results

От
Tom Lane
Дата:
Actually ... it now seems to me that there's live bugs in the
interaction between pipeline mode and error-buffer clearing.
Namely, that places like PQsendQueryStart will unconditionally
clear the buffer even though we may be in the middle of a pipeline
sequence, and the buffer might hold an error that's yet to be
reported to the application.  So I think we need something
more like the attached.

            regards, tom lane

diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 45dddaf556..0c39bc9abf 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1380,10 +1380,7 @@ pqAppendCmdQueueEntry(PGconn *conn, PGcmdQueueEntry *entry)
              * state, we don't have to do anything.
              */
             if (conn->asyncStatus == PGASYNC_IDLE)
-            {
-                pqClearConnErrorState(conn);
                 pqPipelineProcessQueue(conn);
-            }
             break;
     }
 }
@@ -1730,8 +1727,10 @@ PQsendQueryStart(PGconn *conn, bool newQuery)

     /*
      * If this is the beginning of a query cycle, reset the error state.
+     * However, in pipeline mode with something already queued, the error
+     * buffer belongs to that command and we shouldn't clear it.
      */
-    if (newQuery)
+    if (newQuery && conn->cmd_queue_head == NULL)
         pqClearConnErrorState(conn);

     /* Don't try to send if we know there's no live connection. */
@@ -2149,11 +2148,8 @@ PQgetResult(PGconn *conn)
                 /*
                  * We're about to return the NULL that terminates the round of
                  * results from the current query; prepare to send the results
-                 * of the next query when we're called next.  Also, since this
-                 * is the start of the results of the next query, clear any
-                 * prior error message.
+                 * of the next query when we're called next.
                  */
-                pqClearConnErrorState(conn);
                 pqPipelineProcessQueue(conn);
             }
             break;
@@ -2362,6 +2358,14 @@ PQexecStart(PGconn *conn)
     if (!conn)
         return false;

+    /*
+     * Since this is the beginning of a query cycle, reset the error state.
+     * However, in pipeline mode with something already queued, the error
+     * buffer belongs to that command and we shouldn't clear it.
+     */
+    if (conn->cmd_queue_head == NULL)
+        pqClearConnErrorState(conn);
+
     if (conn->pipelineStatus != PQ_PIPELINE_OFF)
     {
         appendPQExpBufferStr(&conn->errorMessage,
@@ -2369,11 +2373,6 @@ PQexecStart(PGconn *conn)
         return false;
     }

-    /*
-     * Since this is the beginning of a query cycle, reset the error state.
-     */
-    pqClearConnErrorState(conn);
-
     /*
      * Silently discard any prior query result that application didn't eat.
      * This is probably poor design, but it's here for backward compatibility.
@@ -2928,8 +2927,11 @@ PQfn(PGconn *conn,

     /*
      * Since this is the beginning of a query cycle, reset the error state.
+     * However, in pipeline mode with something already queued, the error
+     * buffer belongs to that command and we shouldn't clear it.
      */
-    pqClearConnErrorState(conn);
+    if (conn->cmd_queue_head == NULL)
+        pqClearConnErrorState(conn);

     if (conn->pipelineStatus != PQ_PIPELINE_OFF)
     {
@@ -3099,6 +3101,12 @@ pqPipelineProcessQueue(PGconn *conn)
         conn->cmd_queue_head == NULL)
         return;

+    /*
+     * Reset the error state.  This and the next couple of steps correspond to
+     * what PQsendQueryStart didn't do for this query.
+     */
+    pqClearConnErrorState(conn);
+
     /* Initialize async result-accumulation state */
     pqClearAsyncResult(conn);

@@ -3809,9 +3817,11 @@ PQsetnonblocking(PGconn *conn, int arg)
      * behavior. this is ok because either they are making a transition _from_
      * or _to_ blocking mode, either way we can block them.
      *
-     * Clear error state in case pqFlush adds to it.
+     * Clear error state in case pqFlush adds to it, unless we're actively
+     * pipelining, in which case it seems best not to.
      */
-    pqClearConnErrorState(conn);
+    if (conn->cmd_queue_head == NULL)
+        pqClearConnErrorState(conn);

     /* if we are going from blocking to non-blocking flush here */
     if (pqFlush(conn))
@@ -4003,7 +4013,8 @@ PQescapeStringConn(PGconn *conn,
         return 0;
     }

-    pqClearConnErrorState(conn);
+    if (conn->cmd_queue_head == NULL)
+        pqClearConnErrorState(conn);

     return PQescapeStringInternal(conn, to, from, length, error,
                                   conn->client_encoding,
@@ -4041,7 +4052,8 @@ PQescapeInternal(PGconn *conn, const char *str, size_t len, bool as_ident)
     if (!conn)
         return NULL;

-    pqClearConnErrorState(conn);
+    if (conn->cmd_queue_head == NULL)
+        pqClearConnErrorState(conn);

     /* Scan the string for characters that must be escaped. */
     for (s = str; (s - str) < len && *s != '\0'; ++s)
@@ -4306,7 +4318,8 @@ PQescapeByteaConn(PGconn *conn,
     if (!conn)
         return NULL;

-    pqClearConnErrorState(conn);
+    if (conn->cmd_queue_head == NULL)
+        pqClearConnErrorState(conn);

     return PQescapeByteaInternal(conn, from, from_length, to_length,
                                  conn->std_strings,

Re: libpq async duplicate error results

От
Peter Eisentraut
Дата:
I took another look at this.  The output from the test program at the 
beginning of the thread is now (master branch):

command = SELECT 'before';
result 1 status = PGRES_TUPLES_OK
error message = ""

command = SELECT pg_terminate_backend(pg_backend_pid());
result 1 status = PGRES_FATAL_ERROR
error message = "FATAL:  terminating connection due to administrator command
"
result 2 status = PGRES_FATAL_ERROR
error message = "server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
"

command = SELECT 'after';
PQsendQuery() error: FATAL:  terminating connection due to administrator 
command
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
no connection to the server


It appears the "after" query is getting the error message from the 
previous cycle somehow.


The output in PG14 and PG13 is:

command = SELECT 'after';
PQsendQuery() error: no connection to the server


Is the change intended or do we need to think about more tweaking?



Re: libpq async duplicate error results

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
> I took another look at this.  The output from the test program at the 
> beginning of the thread is now (master branch):
> ...
> command = SELECT 'after';
> PQsendQuery() error: FATAL:  terminating connection due to administrator 
> command
> server closed the connection unexpectedly
>     This probably means the server terminated abnormally
>     before or while processing the request.
> no connection to the server

> It appears the "after" query is getting the error message from the 
> previous cycle somehow.

What is happening is that this bit in PQsendQueryStart is deciding
not to clear the message buffer because conn->cmd_queue_head isn't
NULL:

    /*
     * If this is the beginning of a query cycle, reset the error state.
     * However, in pipeline mode with something already queued, the error
     * buffer belongs to that command and we shouldn't clear it.
     */
    if (newQuery && conn->cmd_queue_head == NULL)
        pqClearConnErrorState(conn);

So apparently the fault is with the pipelined-query logic.  I'm
not sure what cleanup step is missing though.

            regards, tom lane



Re: libpq async duplicate error results

От
Tom Lane
Дата:
I wrote:
> What is happening is that this bit in PQsendQueryStart is deciding
> not to clear the message buffer because conn->cmd_queue_head isn't
> NULL:

>     /*
>      * If this is the beginning of a query cycle, reset the error state.
>      * However, in pipeline mode with something already queued, the error
>      * buffer belongs to that command and we shouldn't clear it.
>      */
>     if (newQuery && conn->cmd_queue_head == NULL)
>         pqClearConnErrorState(conn);

> So apparently the fault is with the pipelined-query logic.  I'm
> not sure what cleanup step is missing though.

I experimented with the attached patch, which is based on the idea
that clearing the command queue is being done in entirely the wrong
place.  It's more appropriate to do it where we drop unsent output
data, ie in pqDropConnection not pqDropServerData.  The fact that
it was jammed into the latter without any commentary doesn't do much
to convince me that that placement was thought about carefully.

This fixes the complained-of symptom and still passes check-world.
However, I wonder whether cutting the queue down to completely empty
is the right thing.  Why is the queue set up like this in the first
place --- that is, why don't we remove an entry the moment the
command is sent?  I also wonder whether pipelineStatus ought to
get reset here.  We aren't resetting asyncStatus here, so maybe it's
fine, but I'm not quite sure.

            regards, tom lane

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 4c12f1411f..6e936bbff3 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -377,6 +377,7 @@ static int    connectDBStart(PGconn *conn);
 static int    connectDBComplete(PGconn *conn);
 static PGPing internal_ping(PGconn *conn);
 static PGconn *makeEmptyPGconn(void);
+static void pqFreeCommandQueue(PGcmdQueueEntry *queue);
 static bool fillPGconn(PGconn *conn, PQconninfoOption *connOptions);
 static void freePGconn(PGconn *conn);
 static void closePGconn(PGconn *conn);
@@ -462,6 +463,12 @@ pqDropConnection(PGconn *conn, bool flushInput)
     /* Always discard any unsent data */
     conn->outCount = 0;

+    /* Likewise, discard any pending pipelined commands */
+    pqFreeCommandQueue(conn->cmd_queue_head);
+    conn->cmd_queue_head = conn->cmd_queue_tail = NULL;
+    pqFreeCommandQueue(conn->cmd_queue_recycle);
+    conn->cmd_queue_recycle = NULL;
+
     /* Free authentication/encryption state */
 #ifdef ENABLE_GSS
     {
@@ -569,12 +576,6 @@ pqDropServerData(PGconn *conn)
     }
     conn->notifyHead = conn->notifyTail = NULL;

-    pqFreeCommandQueue(conn->cmd_queue_head);
-    conn->cmd_queue_head = conn->cmd_queue_tail = NULL;
-
-    pqFreeCommandQueue(conn->cmd_queue_recycle);
-    conn->cmd_queue_recycle = NULL;
-
     /* Reset ParameterStatus data, as well as variables deduced from it */
     pstatus = conn->pstatus;
     while (pstatus != NULL)