Re: [HACKERS] walsender termination error messages worse in v10

Поиск
Список
Период
Сортировка
От Petr Jelinek
Тема Re: [HACKERS] walsender termination error messages worse in v10
Дата
Msg-id 2d132a85-1043-ba5f-3e23-131eb279df14@2ndquadrant.com
обсуждение исходный текст
Ответ на [HACKERS] walsender termination error messages worse in v10  (Andres Freund <andres@anarazel.de>)
Ответы Re: [HACKERS] walsender termination error messages worse in v10  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
Hi Andres,

On 02/06/17 20:51, Andres Freund wrote:
> Hi,
> 
> commit 7c4f52409a8c7d85ed169bbbc1f6092274d03920
> Author: Peter Eisentraut <peter_e@gmx.net>
> Date:   2017-03-23 08:36:36 -0400
> 
>     Logical replication support for initial data copy
> 
> made walreceiver emit worse messages in v10 than before when the master
> gets shut down.  Before 10 you'll get something like:
> 
> 2017-06-02 11:46:07.173 PDT [16143][] LOG:  replication terminated by primary server
> 2017-06-02 11:46:07.173 PDT [16143][] DETAIL:  End of WAL reached on timeline 1 at 0/1B7FB8C8.
> 2017-06-02 11:46:07.173 PDT [16143][] FATAL:  could not send end-of-streaming message to primary: no COPY in
progress
> 
> the last bit is a bit superflous, but it still kinda makes sense.  Now
> you get:
> 2017-06-02 11:47:09.362 PDT [17061][] FATAL:  unexpected result after CommandComplete: server closed the connection
unexpectedly
>                 This probably means the server terminated abnormally
>                 before or while processing the request.
> 
> the reason is that
> 
> static int
> libpqrcv_receive(char **buffer, pgsocket *wait_fd)
> {
> 
> previously did:
>     if (rawlen == -1)           /* end-of-streaming or error */
>     {
>         PGresult   *res;
> 
>         res = PQgetResult(streamConn);
>         if (PQresultStatus(res) == PGRES_COMMAND_OK ||
>             PQresultStatus(res) == PGRES_COPY_IN)
>         {
>             PQclear(res);
>             return -1;
>         }
>         else
>         {
>             PQclear(res);
>             ereport(ERROR,
>                     (errmsg("could not receive data from WAL stream: %s",
>                             PQerrorMessage(streamConn))));
>         }
>     }
> 
> and now does
> 
>     if (rawlen == -1)           /* end-of-streaming or error */
>     {
>         PGresult   *res;
> 
>         res = PQgetResult(conn->streamConn);
>         if (PQresultStatus(res) == PGRES_COMMAND_OK)
>         {
>             PQclear(res);
> 
>             /* Verify that there are no more results */
>             res = PQgetResult(conn->streamConn);
>             if (res != NULL)
>                 ereport(ERROR,
>                         (errmsg("unexpected result after CommandComplete: %s",
>                                 PQerrorMessage(conn->streamConn))));
>             return -1;
>         }
>         else if (PQresultStatus(res) == PGRES_COPY_IN)
>         {
>             PQclear(res);
>             return -1;
>         }
>         else
>         {
>             PQclear(res);
>             ereport(ERROR,
>                     (errmsg("could not receive data from WAL stream: %s",
>                             pchomp(PQerrorMessage(conn->streamConn)))));
>         }
>     }
> 
> note the new /* Verify that there are no more results */ bit.
> 
> I don't understand why the new block is there, nor does the commit
> message explain it.
> 

Hmm, that particular change can actually be reverted. It was needed for
one those custom replication commands which were replaced by normal
query support. I have missed it during the rewrite.

--  Petr Jelinek                  http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training &
Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Petr Jelinek
Дата:
Сообщение: Re: [HACKERS] Why does logical replication launcher setapplication_name?
Следующее
От: Jim Nasby
Дата:
Сообщение: Re: [HACKERS] PG 10 release notes