Discussion: Add WALRCV_CONNECTING state to walreceiver
Hi Hackers,

Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
does not accurately reflect streaming health. In that discussion,
Noah noted that even before the reported regression, status =
'streaming' was unreliable because walreceiver sets it during early
startup, before attempting a connection. He suggested:

"Long-term, in master only, perhaps we should introduce another status
like 'connecting'. Perhaps enact the connecting->streaming status
transition just before tendering the first byte of streamed WAL to the
startup process. Alternatively, enact that transition when the startup
process accepts the first streamed byte."

Michael and I also thought this could be a useful addition. This patch
implements that suggestion by adding a new WALRCV_CONNECTING state.

== Background ==

Currently, walreceiver transitions directly from STARTING to STREAMING
early in WalReceiverMain(), before any WAL data has been received. This
means status = 'streaming' can be observed even when:

- The connection to the primary has not been established
- No WAL data has actually been received or flushed

This makes it difficult for monitoring tools to distinguish between a
healthy streaming replica and one that is merely attempting to stream.

== Proposal ==

Introduce WALRCV_CONNECTING as an intermediate state between STARTING
and STREAMING:

- When walreceiver starts, it enters CONNECTING (instead of going
  directly to STREAMING).
- The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
  existing spinlock-protected block that updates flushedUpto.

Feedback welcome.

[1] https://www.postgresql.org/message-id/flat/19093-c4fff49a608f82a0%40postgresql.org

--
Best,
Xuneng
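A minimal sketch of the proposed state flow, not the patch itself. Every `Toy*`/`toy_*` name is hypothetical, the struct is a drastically reduced stand-in for the walreceiver's shared state, and the spinlock is only indicated by comments. It shows the receiver entering CONNECTING at startup and moving to STREAMING only when a flush actually advances flushedUpto.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>             /* strcmp, for callers comparing status strings */

typedef uint64_t ToyLsn;        /* stand-in for an LSN */

/* Reduced state set; only what this sketch needs. */
typedef enum
{
    TOY_WALRCV_STOPPED,
    TOY_WALRCV_STARTING,
    TOY_WALRCV_CONNECTING,      /* the proposed intermediate state */
    TOY_WALRCV_STREAMING
} ToyWalRcvState;

/* Hypothetical, minimal stand-in for the shared walreceiver struct. */
typedef struct
{
    ToyWalRcvState state;
    ToyLsn      flushedUpto;
} ToyWalRcv;

/* Status string as a monitoring view might expose it. */
static const char *
toy_state_string(ToyWalRcvState state)
{
    switch (state)
    {
        case TOY_WALRCV_STOPPED:    return "stopped";
        case TOY_WALRCV_STARTING:   return "starting";
        case TOY_WALRCV_CONNECTING: return "connecting";
        case TOY_WALRCV_STREAMING:  return "streaming";
    }
    return "unknown";
}

/* Startup: enter CONNECTING instead of going straight to STREAMING. */
static void
toy_walrcv_start(ToyWalRcv *walrcv, ToyLsn startpoint)
{
    assert(walrcv->state == TOY_WALRCV_STARTING);
    walrcv->flushedUpto = startpoint;
    walrcv->state = TOY_WALRCV_CONNECTING;
}

/* Flush: the first real advance of flushedUpto flips the state. */
static void
toy_walrcv_flush(ToyWalRcv *walrcv, ToyLsn writtenUpto)
{
    if (walrcv->flushedUpto < writtenUpto)
    {
        /* the real code would take the spinlock here */
        walrcv->flushedUpto = writtenUpto;
        if (walrcv->state == TOY_WALRCV_CONNECTING)
            walrcv->state = TOY_WALRCV_STREAMING;
        /* ... and release it here */
    }
}
```

In this model a flush call that does not advance flushedUpto leaves the state at 'connecting', which is the distinction monitoring tools would gain.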
Attachments
On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> does not accurately reflect streaming health. In that discussion,
> Noah noted that even before the reported regression, status =
> 'streaming' was unreliable because walreceiver sets it during early
> startup, before attempting a connection. He suggested:
>
> "Long-term, in master only, perhaps we should introduce another status
> like 'connecting'. Perhaps enact the connecting->streaming status
> transition just before tendering the first byte of streamed WAL to the
> startup process. Alternatively, enact that transition when the startup
> process accepts the first streamed byte."

> == Proposal ==
>
> Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> and STREAMING:
>
> - When walreceiver starts, it enters CONNECTING (instead of going
> directly to STREAMING).
> - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> existing spinlock-protected block that updates flushedUpto.

I think this has the drawback that if the primary's WAL is incompatible,
e.g. unacceptable timeline, the walreceiver will still briefly enter
STREAMING. That could trick monitoring. Waiting for applyPtr to advance
would avoid the short-lived STREAMING. What's the feasibility of that?
Hi Noah,

On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
>
> On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> > Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> > does not accurately reflect streaming health. In that discussion,
> > Noah noted that even before the reported regression, status =
> > 'streaming' was unreliable because walreceiver sets it during early
> > startup, before attempting a connection. He suggested:
> >
> > "Long-term, in master only, perhaps we should introduce another status
> > like 'connecting'. Perhaps enact the connecting->streaming status
> > transition just before tendering the first byte of streamed WAL to the
> > startup process. Alternatively, enact that transition when the startup
> > process accepts the first streamed byte."
>
> > == Proposal ==
> >
> > Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> > and STREAMING:
> >
> > - When walreceiver starts, it enters CONNECTING (instead of going
> > directly to STREAMING).
> > - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> > existing spinlock-protected block that updates flushedUpto.
>
> I think this has the drawback that if the primary's WAL is incompatible,
> e.g. unacceptable timeline, the walreceiver will still briefly enter
> STREAMING. That could trick monitoring.

Thanks for pointing this out.

> Waiting for applyPtr to advance
> would avoid the short-lived STREAMING. What's the feasibility of that?

I think this could work, but with complications. If replay latency is
high or replay is paused with pg_wal_replay_pause, the WalReceiver
would stay in the CONNECTING state longer than expected. Whether this
is ok depends on the definition of the 'connecting' state. For the
implementation, deciding where and when to check applyPtr against LSNs
like receiveStart is more difficult: the WalReceiver doesn't know when
applyPtr advances. While the WalReceiver can read applyPtr from shared
memory, it isn't automatically notified when that pointer advances.
This leads to latency between checking and replay if this is done in
the WalReceiver part unless we let the startup process set the state,
which would couple the two components. Am I missing something here?

--
Best,
Xuneng
Hi,

On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi Noah,
>
> On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> >
> > On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> > > Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> > > does not accurately reflect streaming health. In that discussion,
> > > Noah noted that even before the reported regression, status =
> > > 'streaming' was unreliable because walreceiver sets it during early
> > > startup, before attempting a connection. He suggested:
> > >
> > > "Long-term, in master only, perhaps we should introduce another status
> > > like 'connecting'. Perhaps enact the connecting->streaming status
> > > transition just before tendering the first byte of streamed WAL to the
> > > startup process. Alternatively, enact that transition when the startup
> > > process accepts the first streamed byte."
> >
> > > == Proposal ==
> > >
> > > Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> > > and STREAMING:
> > >
> > > - When walreceiver starts, it enters CONNECTING (instead of going
> > > directly to STREAMING).
> > > - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> > > existing spinlock-protected block that updates flushedUpto.
> >
> > I think this has the drawback that if the primary's WAL is incompatible,
> > e.g. unacceptable timeline, the walreceiver will still briefly enter
> > STREAMING. That could trick monitoring.
>
> Thanks for pointing this out.
>
> > Waiting for applyPtr to advance
> > would avoid the short-lived STREAMING. What's the feasibility of that?
>
> I think this could work, but with complications. If replay latency is
> high or replay is paused with pg_wal_replay_pause, the WalReceiver
> would stay in the CONNECTING state longer than expected. Whether this
> is ok depends on the definition of the 'connecting' state. For the
> implementation, deciding where and when to check applyPtr against LSNs
> like receiveStart is more difficult: the WalReceiver doesn't know when
> applyPtr advances. While the WalReceiver can read applyPtr from shared
> memory, it isn't automatically notified when that pointer advances.
> This leads to latency between checking and replay if this is done in
> the WalReceiver part unless we let the startup process set the state,
> which would couple the two components. Am I missing something here?
>

After some thought, a potential approach could be to expose a new
function in the WAL receiver that transitions the state from
CONNECTING to STREAMING. This function can then be invoked directly
from WaitForWALToBecomeAvailable in the startup process, ensuring the
state change aligns with the actual acceptance of the WAL stream.

--
Best,
Xuneng
Hi,

On Fri, Dec 12, 2025 at 9:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
>
> Hi,
>
> On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> >
> > Hi Noah,
> >
> > On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> > >
> > > On Fri, Dec 12, 2025 at 12:51:00PM +0800, Xuneng Zhou wrote:
> > > > Bug #19093 [1] reported that pg_stat_wal_receiver.status = 'streaming'
> > > > does not accurately reflect streaming health. In that discussion,
> > > > Noah noted that even before the reported regression, status =
> > > > 'streaming' was unreliable because walreceiver sets it during early
> > > > startup, before attempting a connection. He suggested:
> > > >
> > > > "Long-term, in master only, perhaps we should introduce another status
> > > > like 'connecting'. Perhaps enact the connecting->streaming status
> > > > transition just before tendering the first byte of streamed WAL to the
> > > > startup process. Alternatively, enact that transition when the startup
> > > > process accepts the first streamed byte."
> > >
> > > > == Proposal ==
> > > >
> > > > Introduce WALRCV_CONNECTING as an intermediate state between STARTING
> > > > and STREAMING:
> > > >
> > > > - When walreceiver starts, it enters CONNECTING (instead of going
> > > > directly to STREAMING).
> > > > - The transition to STREAMING occurs in XLogWalRcvFlush(), inside the
> > > > existing spinlock-protected block that updates flushedUpto.
> > >
> > > I think this has the drawback that if the primary's WAL is incompatible,
> > > e.g. unacceptable timeline, the walreceiver will still briefly enter
> > > STREAMING. That could trick monitoring.
> >
> > Thanks for pointing this out.
> >
> > > Waiting for applyPtr to advance
> > > would avoid the short-lived STREAMING. What's the feasibility of that?
> >
> > I think this could work, but with complications. If replay latency is
> > high or replay is paused with pg_wal_replay_pause, the WalReceiver
> > would stay in the CONNECTING state longer than expected. Whether this
> > is ok depends on the definition of the 'connecting' state. For the
> > implementation, deciding where and when to check applyPtr against LSNs
> > like receiveStart is more difficult: the WalReceiver doesn't know when
> > applyPtr advances. While the WalReceiver can read applyPtr from shared
> > memory, it isn't automatically notified when that pointer advances.
> > This leads to latency between checking and replay if this is done in
> > the WalReceiver part unless we let the startup process set the state,
> > which would couple the two components. Am I missing something here?
> >
>
> After some thought, a potential approach could be to expose a new
> function in the WAL receiver that transitions the state from
> CONNECTING to STREAMING. This function can then be invoked directly
> from WaitForWALToBecomeAvailable in the startup process, ensuring the
> state change aligns with the actual acceptance of the WAL stream.
>

V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
the first valid WAL record is processed by the startup process. A new
function WalRcvSetStreaming is introduced to enable the transition.

--
Best,
Xuneng
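A rough sketch of the shape such a transition function might take, assuming nothing about the actual v2 patch beyond what is described above. All `Toy*`/`toy_*` names are hypothetical, and a C11 atomic compare-and-swap stands in for the spinlock that protects the real shared state. The point illustrated is that the function only ever moves CONNECTING to STREAMING, so a call from the startup process cannot clobber any other state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical state set for this sketch. */
enum
{
    TOY_STARTING,
    TOY_CONNECTING,
    TOY_STREAMING,
    TOY_STOPPING
};

/* Stand-in for state shared between walreceiver and startup process. */
typedef struct
{
    _Atomic int state;
} ToyWalRcvShared;

/*
 * Hypothetical analogue of the WalRcvSetStreaming idea: succeed only if
 * the state is still CONNECTING. Returns true if this call made the
 * transition, false if the state was anything else.
 */
static bool
toy_walrcv_set_streaming(ToyWalRcvShared *shared)
{
    int         expected = TOY_CONNECTING;

    assert(shared != NULL);
    return atomic_compare_exchange_strong(&shared->state, &expected,
                                          TOY_STREAMING);
}

/*
 * Toy stand-in for the startup process accepting a streamed record:
 * only a valid record triggers the transition attempt.
 */
static bool
toy_startup_accept_record(ToyWalRcvShared *shared, bool record_valid)
{
    if (!record_valid)
        return false;
    (void) toy_walrcv_set_streaming(shared);
    return true;
}
```

Making the transition conditional keeps the coupling between the two processes narrow: the startup process reports "a record was accepted" and the state machine decides whether that matters.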
Attachments
On Sun, Dec 14, 2025 at 12:45:46PM +0800, Xuneng Zhou wrote:
> On Fri, Dec 12, 2025 at 9:52 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > On Fri, Dec 12, 2025 at 4:45 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> > > On Fri, Dec 12, 2025 at 1:05 PM Noah Misch <noah@leadboat.com> wrote:
> > > > Waiting for applyPtr to advance
> > > > would avoid the short-lived STREAMING. What's the feasibility of that?
> > >
> > > I think this could work, but with complications. If replay latency is
> > > high or replay is paused with pg_wal_replay_pause, the WalReceiver
> > > would stay in the CONNECTING state longer than expected. Whether this
> > > is ok depends on the definition of the 'connecting' state. For the
> > > implementation, deciding where and when to check applyPtr against LSNs
> > > like receiveStart is more difficult: the WalReceiver doesn't know when
> > > applyPtr advances. While the WalReceiver can read applyPtr from shared
> > > memory, it isn't automatically notified when that pointer advances.
> > > This leads to latency between checking and replay if this is done in
> > > the WalReceiver part unless we let the startup process set the state,
> > > which would couple the two components. Am I missing something here?
> >
> > After some thought, a potential approach could be to expose a new
> > function in the WAL receiver that transitions the state from
> > CONNECTING to STREAMING. This function can then be invoked directly
> > from WaitForWALToBecomeAvailable in the startup process, ensuring the
> > state change aligns with the actual acceptance of the WAL stream.
>
> V2 makes the transition from WALRCV_CONNECTING to STREAMING only when
> the first valid WAL record is processed by the startup process. A new
> function WalRcvSetStreaming is introduced to enable the transition.

The original patch set STREAMING in XLogWalRcvFlush(). XLogWalRcvFlush()'s
callee XLogWalRcvSendReply() already fetches applyPtr to send a status
message. So I would try the following before involving the startup
process like v2 does:

1. store the applyPtr when we enter CONNECTING
2. force a status message as long as we remain in CONNECTING
3. become STREAMING when applyPtr differs from the one stored at (1)

A possible issue with all patch versions: when the primary is writing no
WAL and the standby was caught up before this walreceiver started,
CONNECTING could persist for an unbounded amount of time. Only actual
primary WAL generation would move the walreceiver to STREAMING. This
relates to your above point about high latency. If that's a concern,
perhaps this change deserves a total of two new states, CONNECTING and a
state that represents "connection exists, no WAL yet applied"?
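The three steps above can be sketched as follows. This is an illustration under stated assumptions, not walreceiver code: the `Toy*`/`toy_*` names are hypothetical, applyPtr is passed in as a parameter rather than fetched from shared memory, and "sending a status message" is reduced to a boolean return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t ToyLsn;        /* stand-in for an LSN */

typedef enum
{
    TOY_RCV_CONNECTING,
    TOY_RCV_STREAMING
} ToyRcvState;

/* Hypothetical receiver-local bookkeeping. */
typedef struct
{
    ToyRcvState state;
    ToyLsn      applyPtrAtConnect;  /* step 1: remembered apply position */
} ToyReceiver;

/* Step 1: store the applyPtr when entering CONNECTING. */
static void
toy_enter_connecting(ToyReceiver *rcv, ToyLsn applyPtr)
{
    assert(rcv != NULL);
    rcv->state = TOY_RCV_CONNECTING;
    rcv->applyPtrAtConnect = applyPtr;
}

/*
 * Toy reply routine. Returns true if a status message would be sent.
 * Step 2: while CONNECTING, the message is always forced.
 * Step 3: once applyPtr differs from the stored value, become STREAMING.
 */
static bool
toy_send_reply(ToyReceiver *rcv, ToyLsn applyPtr, bool force)
{
    assert(rcv != NULL);
    if (rcv->state == TOY_RCV_CONNECTING)
    {
        force = true;
        if (applyPtr != rcv->applyPtrAtConnect)
            rcv->state = TOY_RCV_STREAMING;
    }
    return force;
}
```

Calling `toy_send_reply()` repeatedly with an unchanged applyPtr leaves the receiver in CONNECTING indefinitely, which is the idle-primary case raised in the last paragraph of the message.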