Re: Exit walsender before confirming remote flush in logical replication

From Kyotaro Horiguchi
Subject Re: Exit walsender before confirming remote flush in logical replication
Date
Msg-id 20230202.133423.283791550224495611.horikyota.ntt@gmail.com
In response to Re: Exit walsender before confirming remote flush in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Exit walsender before confirming remote flush in logical replication  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
At Wed, 1 Feb 2023 14:58:14 +0530, Amit Kapila <amit.kapila16@gmail.com> wrote in 
> On Wed, Feb 1, 2023 at 2:09 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Otherwise, we will end up terminating
> > the WAL stream without the done message. Which will lead to an error
> > message "ERROR:  could not receive data from WAL stream: server closed
> > the connection unexpectedly" on the subscriber even at a clean
> > shutdown.
> >
> 
> But will that be a problem? As per docs of shutdown [1] ( “Smart” mode
> disallows new connections, then waits for all existing clients to
> disconnect. If the server is in hot standby, recovery and streaming
> replication will be terminated once all clients have disconnected.),
> there is no such guarantee. I see that it is required for the
> switchover in physical replication to ensure that all the WAL is sent
> and replicated but we don't need that for logical replication.

+1

Since the publisher is not aware of the apply delay (under this patch),
the publisher does in fact appear to have gone away before sending EOS
in that case. The error message correctly describes that situation.

> > In a case where pq_is_send_pending() doesn't become false
> > for a long time, (e.g., the network socket buffer got full due to the
> > apply worker waiting on a lock), I think users should unblock it by
> > themselves. Or it might be practically better to shutdown the
> > subscriber first in the logical replication case, unlike the physical
> > replication case.
> >
> 
> Yeah, will users like such a dependency? And what will they gain by doing so?

If PostgreSQL required that kind of special care at shutdown, when
facing trouble, in order to keep replication consistent, that would
not be acceptable. In this respect, the current time-delayed logical
replication can be seen as a kind of intentional, continuous, large
network lag. And I think consistency is guaranteed even in such cases.
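
To illustrate the consistency point, here is a toy model in Python. It
is only a sketch, not PostgreSQL code: the class and field names are
made up, and it assumes that the publisher remembers the position the
subscriber last confirmed and resumes streaming from there after a
restart. Under that assumption, a publisher that exits without waiting
for the remote flush leaves the subscriber behind but not inconsistent.

```python
# Toy model only: all names are hypothetical, not PostgreSQL internals.

class Publisher:
    def __init__(self, changes):
        self.wal = list(changes)     # change i lives at "LSN" i + 1
        self.confirmed_flush = 0     # last LSN confirmed by the subscriber

    def stream_from(self, lsn):
        """Return (lsn, change) pairs located after the given LSN."""
        return [(i + 1, c) for i, c in enumerate(self.wal) if i + 1 > lsn]


class Subscriber:
    def __init__(self):
        self.applied = []

    def receive(self, publisher, max_apply):
        """Apply at most max_apply changes and confirm each one.

        A small max_apply stands in for network lag or an apply delay.
        """
        pending = publisher.stream_from(publisher.confirmed_flush)
        for lsn, change in pending[:max_apply]:
            self.applied.append(change)
            publisher.confirmed_flush = lsn


def run_scenario():
    pub = Publisher(["a", "b", "c", "d"])
    sub = Subscriber()
    # The subscriber is far behind when shutdown is requested.
    sub.receive(pub, max_apply=1)
    # The publisher exits here without waiting for the remote flush.
    # After restart, streaming resumes from the confirmed position.
    sub.receive(pub, max_apply=10)
    return sub.applied
```

In this model the subscriber ends up with every change exactly once and
in order; only the moment at which it catches up depends on when the
publisher went away.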

On the other hand, I don't think most people care about the exact
progress when facing such trouble, as long as replication consistency
is maintained. IMHO that is also true for the delayed logical
replication case.

> [1] - https://www.postgresql.org/docs/devel/app-pg-ctl.html

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


