Re: FATAL: could not send end-of-streaming message to primary: no COPY in progress

From Fujii Masao
Subject Re: FATAL: could not send end-of-streaming message to primary: no COPY in progress
Date
Msg-id CAHGQGwHvzV2J0QodA8x1xCx3CbaBmJTveQeoLFzX8hq5G25jEA@mail.gmail.com
In response to FATAL: could not send end-of-streaming message to primary: no COPY in progress  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: FATAL: could not send end-of-streaming message to primary: no COPY in progress  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On Thu, Mar 31, 2016 at 9:15 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Hi hackers,
>
> If you shut down a primary server, a standby that is streaming from it says:
>
> LOG:  replication terminated by primary server
> DETAIL:  End of WAL reached on timeline 1 at 0/14F4B68.
> FATAL:  could not send end-of-streaming message to primary: no COPY in progress
>
> Isn't that FATAL ereport a bug?

ISTM that the cause is that walsender exits and the replication connection is
closed just after "COPY 0" is sent. That is, after receiving "COPY 0",
walreceiver tries to send an end-of-copy message to the primary, but fails
because the connection has already been closed.
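
To make the suspected sequence concrete, here is a toy model in C. It is
only a sketch: ToyConn, ConnState and both functions are made up for
illustration and are not the libpq or walreceiver code. The assumption is
simply that once the primary side has dropped the connection, the standby's
attempt to end the COPY has nothing left to act on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified connection states; not libpq's real ones. */
typedef enum { CONN_COPY_BOTH, CONN_IDLE, CONN_CLOSED } ConnState;

typedef struct {
    ConnState   state;
    const char *last_error;   /* NULL while no error has been recorded */
} ToyConn;

/* Primary side: walsender ends the stream and exits right away. */
static void walsender_shutdown(ToyConn *conn)
{
    assert(conn != NULL);
    /* "COPY 0" goes out, then the connection is dropped immediately. */
    conn->state = CONN_CLOSED;
}

/* Standby side: try to acknowledge the end of streaming. */
static bool walreceiver_end_streaming(ToyConn *conn)
{
    assert(conn != NULL);
    if (conn->state != CONN_COPY_BOTH) {
        /* Nothing to end: this is the failing path in the report. */
        conn->last_error = "no COPY in progress";
        return false;
    }
    conn->state = CONN_IDLE;
    return true;
}
```

Calling walsender_shutdown() first and walreceiver_end_streaming() second
takes the failing path; without the shutdown the same call succeeds.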

> How is clean server shutdown supposed to work?

One option is to make walsender wait for an end-of-copy message from walreceiver
before it closes the connection and exits, after sending the "COPY 0" message.
But one question is: how should walsender behave when walreceiver gets stuck
and cannot reply with an end-of-copy message? Probably we need
a timeout (maybe we can use wal_sender_timeout here, but I'm not sure yet
whether it's appropriate).
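
A rough sketch of what such a bounded wait could look like, written against
plain POSIX rather than the walsender code. wait_for_copy_done() and
timeout_ms are hypothetical names; timeout_ms stands in for whichever
timeout would be chosen. It relies only on the fact that CopyDone has the
type byte 'c' in the frontend/backend protocol, and it deliberately ignores
details a real implementation would need (the length word, other message
types arriving first, EINTR handling).

```c
#include <assert.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for the peer to send a CopyDone message.
 * Returns true if its type byte arrived, false on timeout, EOF, error,
 * or if some other byte was received.
 */
static bool wait_for_copy_done(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    char type;

    assert(fd >= 0);
    if (poll(&pfd, 1, timeout_ms) <= 0)
        return false;               /* timed out or poll failed */
    if (read(fd, &type, 1) != 1)
        return false;               /* peer closed or read failed */
    return type == 'c';             /* CopyDone type byte */
}
```

With something like this, walsender would close the connection either when
the function returns true or when the timeout expires, so a stuck
walreceiver cannot block shutdown indefinitely.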

Another option is to prevent walreceiver from sending an end-of-copy message.
If "COPY 0" always means the exit of walsender and the termination of
the connection, there seems to be no need to send back an end-of-copy message.
I've not checked yet how this interferes with other replication logic, though.
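
As a sketch of this second option, again a toy model rather than the real
walreceiver: ToyStream, end_streaming() and the server_ended_stream flag
are all hypothetical. The flag encodes the "if" above, namely that "COPY 0"
from the primary always means the connection is going away.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool copy_in_progress;   /* false once the peer has gone away */
    int  copy_done_sent;     /* end-of-copy messages actually sent */
} ToyStream;

/* Returns true if ending the stream succeeded without error. */
static bool end_streaming(ToyStream *s, bool server_ended_stream)
{
    assert(s != NULL);
    if (server_ended_stream) {
        /* "COPY 0" already ended the stream; send nothing back. */
        s->copy_in_progress = false;
        return true;
    }
    if (!s->copy_in_progress)
        return false;        /* would be "no COPY in progress" */
    s->copy_done_sent++;     /* standby-initiated end: reply as before */
    s->copy_in_progress = false;
    return true;
}
```

If the assumption does not hold in every case, skipping the reply would
leave the primary waiting for a message that never comes, which is why the
interaction with the rest of replication would need checking.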

Regards,

-- 
Fujii Masao


