Thread: Add WALRCV_CONNECTING state to walreceiver

Add WALRCV_CONNECTING state to walreceiver

From
Xuneng Zhou
Date:
Hi Hackers,

Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
does not accurately reflect streaming health.  In that discussion,
Noah noted that even before the reported regression, status =
'streaming' was unreliable because walreceiver sets it during early
startup, before attempting a connection. He suggested:

"Long-term, in master only, perhaps we should introduce another status
like 'connecting'. Perhaps enact the connecting->streaming status
transition just before tendering the first byte of streamed WAL to the
startup process. Alternatively, enact that transition when the startup
process accepts the first streamed byte."

Michael and I also thought this could be a useful addition. This patch
implements that suggestion by adding a new WALRCV_CONNECTING state.

== Background ==
Currently, walreceiver transitions directly from STARTING to STREAMING
early in WalReceiverMain(), before any WAL data has been received.
This means status = 'streaming' can be observed even when:

- The connection to the primary has not been established
- No WAL data has actually been received or flushed

This makes it difficult for monitoring tools to distinguish between a
healthy streaming replica and one that is merely attempting to stream.

== Proposal ==

Introduce WALRCV_CONNECTING as an intermediate state between STARTING
and STREAMING:

- When walreceiver starts, it enters CONNECTING (instead of going
directly to STREAMING).
- The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
existing spinlock-protected block that updates flushedUpto.
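
The two bullets above can be sketched as a small, self-contained C model. This is only an illustration, not the patch itself: `WalRcvSketch`, `sketch_start()` and `sketch_flush()` are hypothetical names, the position of WALRCV_CONNECTING within the enum is an assumption, and the spinlock that the real XLogWalRcvFlush() holds is represented by comments only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* stand-in for PostgreSQL's LSN type */

/* Existing walreceiver states plus the proposed one (placement assumed). */
typedef enum
{
    WALRCV_STOPPED,
    WALRCV_STARTING,
    WALRCV_CONNECTING,          /* proposed: started, no WAL flushed yet */
    WALRCV_STREAMING,
    WALRCV_WAITING,
    WALRCV_RESTARTING,
    WALRCV_STOPPING
} WalRcvState;

/* Hypothetical, reduced stand-in for the shared walreceiver struct. */
typedef struct
{
    WalRcvState walRcvState;
    XLogRecPtr  flushedUpto;
} WalRcvSketch;

/* Startup of the walreceiver: enter CONNECTING instead of STREAMING. */
static void
sketch_start(WalRcvSketch *w, XLogRecPtr startpoint)
{
    assert(w != NULL);
    w->walRcvState = WALRCV_CONNECTING;
    w->flushedUpto = startpoint;
}

/* Flush path: promote to STREAMING when flushedUpto actually advances. */
static void
sketch_flush(WalRcvSketch *w, XLogRecPtr flushed)
{
    assert(w != NULL);
    /* the real code would acquire the spinlock here */
    if (w->flushedUpto < flushed)
    {
        w->flushedUpto = flushed;
        if (w->walRcvState == WALRCV_CONNECTING)
            w->walRcvState = WALRCV_STREAMING;
    }
    /* ... and release it here */
}
```

In this model a flush that does not move flushedUpto forward leaves the state at CONNECTING, so 'streaming' is reported only after some WAL has reached disk.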

Feedback is welcome.

[1] https://www.postgresql.org/message-id/flat/19093-c4fff49a608f82a0%40postgresql.org

--
Best,
Xuneng

Attachments

Re: Add WALRCV_CONNECTING state to walreceiver

From
Noah Misch
Date:
On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> does not accurately reflect streaming health.  In that discussion,
> Noah noted that even before the reported regression, status =
> 'streaming' was unreliable because walreceiver sets it during early
> startup, before attempting a connection. He suggested:
> 
> "Long-term, in master only, perhaps we should introduce another status
> like 'connecting'. Perhaps enact the connecting->streaming status
> transition just before tendering the first byte of streamed WAL to the
> startup process. Alternatively, enact that transition when the startup
> process accepts the first streamed byte."

> == Proposal ==
> 
> Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> and STREAMING:
> 
> - When walreceiver starts, it enters CONNECTING (instead of going
> directly to STREAMING).
> - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> existing spinlock-protected block that updates flushedUpto.

I think this has the drawback that if the primary's WAL is incompatible,
e.g. unacceptable timeline, the walreceiver will still briefly enter
STREAMING.  That could trick monitoring.  Waiting for applyPtr to advance
would avoid the short-lived STREAMING.  What's the feasibility of that?



Re: Add WALRCV_CONNECTING state to walreceiver

From
Xuneng Zhou
Date:
Hi Noah,

On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> I think this has the drawback that if the primary's WAL is incompatible,
> e.g. unacceptable timeline, the walreceiver will still briefly enter
> STREAMING.  That could trick monitoring.

Thanks for pointing this out.

> Waiting for applyPtr to advance
> would avoid the short-lived STREAMING.  What's the feasibility of that?

I think this could work, but with complications. If replay latency is
high or replay is paused with pg_wal_replay_pause(), the walreceiver
would stay in the CONNECTING state longer than expected. Whether this
is OK depends on how we define the 'connecting' state. On the
implementation side, deciding where and when to compare applyPtr against
LSNs like receiveStart is harder: the walreceiver can read applyPtr from
shared memory, but it is not notified when that pointer advances. If the
check lives in the walreceiver, there will be a lag between replay
advancing and the state change, unless we let the startup process set
the state, which would couple the two components. Am I missing
something here?

--
Best,
Xuneng



Re: Add WALRCV_CONNECTING state to walreceiver

From
Xuneng Zhou
Date:
Hi,

After some thought, a potential approach could be to expose a new
function in the WAL receiver that transitions the state from
CONNECTING to STREAMING. The startup process can then call this function
directly from WaitForWALToBecomeAvailable(), so that the state change
coincides with the actual acceptance of the WAL stream.

--
Best,
Xuneng



Re: Add WALRCV_CONNECTING state to walreceiver

From
Xuneng Zhou
Date:
Hi,

V2 makes the transition from WALRCV_CONNECTING to WALRCV_STREAMING only
when the first valid WAL record is processed by the startup process. A new
function, WalRcvSetStreaming(), is introduced to perform the transition.
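
As a rough illustration of what a WalRcvSetStreaming()-style function might do, here is a minimal sketch. It is not the code from the attached patch: `WalRcvSketch` and `sketch_set_streaming()` are hypothetical names, the boolean return value is an assumption added for testing, and locking is shown only as comments. The one design point it tries to capture is that the startup process should promote the state only if the walreceiver is still in CONNECTING, so that a concurrent STOPPING or WAITING is not overwritten.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Existing walreceiver states plus the proposed one (placement assumed). */
typedef enum
{
    WALRCV_STOPPED,
    WALRCV_STARTING,
    WALRCV_CONNECTING,
    WALRCV_STREAMING,
    WALRCV_WAITING,
    WALRCV_RESTARTING,
    WALRCV_STOPPING
} WalRcvState;

/* Hypothetical, reduced stand-in for the shared walreceiver struct. */
typedef struct
{
    WalRcvState walRcvState;
} WalRcvSketch;

/*
 * Called by the (modelled) startup process once it has processed the
 * first valid streamed record.  Returns true if the state was changed.
 */
static bool
sketch_set_streaming(WalRcvSketch *w)
{
    bool        changed = false;

    assert(w != NULL);
    /* the real code would take the walreceiver spinlock here */
    if (w->walRcvState == WALRCV_CONNECTING)
    {
        w->walRcvState = WALRCV_STREAMING;
        changed = true;
    }
    /* ... and release it here */
    return changed;
}
```

Calling the function repeatedly is harmless in this model: after the first successful call it becomes a no-op.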

--
Best,
Xuneng

Attachments

Re: Add WALRCV_CONNECTING state to walreceiver

From
Noah Misch
Date:
On Sun, Dec 14, 2025 at 12:45:46PM +0800, Xuneng Zhou wrote:
> On Fri, Dec 12, 2025 at 9:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > Waiting for applyPtr to advance
> > > > would avoid the short-lived STREAMING.  What's the feasibility of that?
> > >
> > > I think this could work, but with complications. If replay latency is
> > > high or replay is paused with pg_wal_replay_pause, the WalReceiver
> > > would stay in the CONNECTING state longer than expected. Whether this
> > > is ok depends on the definition of the 'connecting' state. For the
> > > implementation, deciding where and when to check applyPtr against LSNs
> > > like receiveStart is more difficult—the WalReceiver doesn't know when
> > > applyPtr advances. While the WalReceiver can read applyPtr from shared
> > > memory, it isn't automatically notified when that pointer advances.
> > > This leads to latency between checking and replay if this is done in
> > > the WalReceiver part unless we let the startup process set the state,
> > > which would couple the two components. Am I missing something here?
> >
> > After some thoughts, a potential approach could be to expose a new
> > function in the WAL receiver that transitions the state from
> > CONNECTING to STREAMING. This function can then be invoked directly
> > from WaitForWALToBecomeAvailable in the startup process, ensuring the
> > state change aligns with the actual acceptance of the WAL stream.
> 
> V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> the first valid WAL record is processed by the startup process. A new
> function WalRcvSetStreaming is introduced to enable the transition.

The original patch set STREAMING in XLogWalRcvFlush().  XLogWalRcvSendReply(),
which XLogWalRcvFlush() calls, already fetches applyPtr to send a status
message.  So I would try the following before involving the startup process
like v2 does:

1. store the applyPtr when we enter CONNECTING
2. force a status message as long as we remain in CONNECTING
3. become STREAMING when applyPtr differs from the one stored at (1)
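
The three steps above could look roughly like the following self-contained model. It is a sketch under stated assumptions, not walreceiver code: `WalRcvSketch`, the `applyPtrAtConnect` field and all `sketch_*` functions are hypothetical, and the apply position is passed in as an argument instead of being read from shared memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* stand-in for PostgreSQL's LSN type */

/* Existing walreceiver states plus the proposed one (placement assumed). */
typedef enum
{
    WALRCV_STOPPED,
    WALRCV_STARTING,
    WALRCV_CONNECTING,
    WALRCV_STREAMING,
    WALRCV_WAITING,
    WALRCV_RESTARTING,
    WALRCV_STOPPING
} WalRcvState;

/* Hypothetical, reduced stand-in for walreceiver state. */
typedef struct
{
    WalRcvState walRcvState;
    XLogRecPtr  applyPtrAtConnect;  /* step 1: remembered apply position */
} WalRcvSketch;

/* Step 1: store the applyPtr when entering CONNECTING. */
static void
sketch_enter_connecting(WalRcvSketch *w, XLogRecPtr applyPtr)
{
    assert(w != NULL);
    w->walRcvState = WALRCV_CONNECTING;
    w->applyPtrAtConnect = applyPtr;
}

/* Step 2: keep forcing status messages while still CONNECTING. */
static bool
sketch_force_reply(const WalRcvSketch *w)
{
    assert(w != NULL);
    return w->walRcvState == WALRCV_CONNECTING;
}

/*
 * Step 3: on each status message, compare the freshly fetched applyPtr
 * with the stored one and become STREAMING once they differ.
 */
static void
sketch_on_send_reply(WalRcvSketch *w, XLogRecPtr applyPtr)
{
    assert(w != NULL);
    if (w->walRcvState == WALRCV_CONNECTING &&
        applyPtr != w->applyPtrAtConnect)
        w->walRcvState = WALRCV_STREAMING;
}
```

Because the comparison rides on the status message the walreceiver already sends, this variant needs no call from the startup process; the cost is that the transition is observed only as often as replies are sent.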

A possible issue with all patch versions: when the primary is writing no WAL
and the standby was caught up before this walreceiver started, CONNECTING
could persist for an unbounded amount of time.  Only actual primary WAL
generation would move the walreceiver to STREAMING.  This relates to your
above point about high latency.  If that's a concern, perhaps this change
deserves a total of two new states, CONNECTING and a state that represents
"connection exists, no WAL yet applied"?
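
To make the two-state idea concrete, here is a small model of the resulting transitions. Everything in it is an assumption made for illustration: the thread does not name the second state, so `WALRCV_CONNECTED` is a placeholder, and `SketchEvent` and `sketch_handle_event()` are hypothetical helpers rather than PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Existing states plus two new ones; WALRCV_CONNECTED is a placeholder name. */
typedef enum
{
    WALRCV_STOPPED,
    WALRCV_STARTING,
    WALRCV_CONNECTING,      /* no connection to the primary yet */
    WALRCV_CONNECTED,       /* connection exists, no WAL yet applied */
    WALRCV_STREAMING,
    WALRCV_WAITING,
    WALRCV_RESTARTING,
    WALRCV_STOPPING
} WalRcvState;

/* Hypothetical events that drive the two new transitions. */
typedef enum
{
    SKETCH_EV_CONNECTION_ESTABLISHED,
    SKETCH_EV_APPLY_ADVANCED
} SketchEvent;

/* Apply one event; events that do not fit the current state are ignored. */
static void
sketch_handle_event(WalRcvState *state, SketchEvent ev)
{
    assert(state != NULL);
    if (*state == WALRCV_CONNECTING &&
        ev == SKETCH_EV_CONNECTION_ESTABLISHED)
        *state = WALRCV_CONNECTED;
    else if (*state == WALRCV_CONNECTED &&
             ev == SKETCH_EV_APPLY_ADVANCED)
        *state = WALRCV_STREAMING;
}
```

With this split, a caught-up standby attached to an idle primary would sit in the intermediate state rather than in CONNECTING, which lets monitoring tell "cannot reach the primary" apart from "connected, nothing to replay yet".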