Re: Possible missing segments in archiving on standby

From Kyotaro Horiguchi
Subject Re: Possible missing segments in archiving on standby
Date
Msg-id 20210901.121225.1339494423357751537.horikyota.ntt@gmail.com
In reply to Re: Possible missing segments in archiving on standby  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: Possible missing segments in archiving on standby  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
At Tue, 31 Aug 2021 23:23:27 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in 
> 
> 
> On 2021/08/31 16:35, Kyotaro Horiguchi wrote:
> > I'm not sure which is simpler, but it works except for B, the case of
> > a long-jump by a segment switch.  When a segment switch happens,
> > walsender sends filling zero-pages but even if walreceiver is
> > terminated before the segment is completed, walreceiver restarts from
> > the next segment at the next startup. Concretely like the following.
> > - pg_switch_wal() invoked at 6003228 (for example)
> > - walreceiver terminates at 6500000 (or a bit later).
> > - walreceiver restarts from 7000000
> > In this case the segment 6 is not notified even with the patch, and my
> > old patches work the same way. (In other words, the call to
> > XLogWalRcvClose() at the end of XLogWalRcvWrite doesn't work for the
> > case as you might expect.) If we think it ok that we don't notify the
> > segment earlier than a future checkpoint removes it, yours or only the
> > last half of my one is sufficient, but do we really think so?
> > Furthermore, your patch or only the last half of my second patch
> > doesn't save the case of a crash unlike the case of a graceful
> > termination.
> 
> Thanks for the clarification!
> Please let me check my understanding about the issue.
> 
> The issue happens when walreceiver exits after it receives an XLOG_SWITCH
> record but before it receives the remaining bytes of the segment that
> contains that XLOG_SWITCH record. In this case, the startup process tries
> to replay that "half-received" segment, finds the XLOG_SWITCH record in
> it, moves to the next segment, and then starts a new walreceiver from that
> next segment. Therefore, even with my patch, the segment containing that
> XLOG_SWITCH record is not archived soon. Is my understanding right? I
> agree that we should also address this issue.

Right.

> ISTM, to address the issue, it's simpler and less fragile to make the
> startup process call XLogArchiveCheckDone() or something whenever it
> moves to the next segment, rather than make walreceiver do that. Thoughts?

Putting issue C aside, that would work as long as recovery is not
paused or delayed.  However, simply doing that means we run an
additional (and somewhat wasteful) XLogArchiveCheckDone() in most
cases, and it's hard to imagine moving the responsibility for notifying
a finished segment from walreceiver (the writer side) to startup (the
reader side).

In the first place, A and B happen only at termination or crash of
walreceiver, so there's no fragility in checking only the previous
segment at the start of walreceiver.  After a bit of thought I noticed
that we don't need to do that in the WAL-writing loop.  I also noticed
that we need to consider timeline transitions while calculating the
previous segment, even though a missed notification at a timeline
switch doesn't happen unless walreceiver is killed hard, for example
by SIGKILL or a power cut.

So the attached is a new version of the patch to fix only A and B.

- Moved the check code out of the replication loop.

- Track timeline transitions while calculating the previous segment.
  If we don't do that, we would need another means to avoid notifying
  a non-existent segment instead of the correct one.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 60de3be92c..81dde27372 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -173,6 +173,7 @@ WalReceiverMain(void)
     XLogRecPtr    startpoint;
     TimeLineID    startpointTLI;
     TimeLineID    primaryTLI;
+    XLogSegNo    startsegno;
     bool        first_stream;
     WalRcvData *walrcv = WalRcv;
     TimestampTz last_recv_timestamp;
@@ -313,6 +314,32 @@ WalReceiverMain(void)
     if (sender_host)
         pfree(sender_host);
 
+    /*
+     * Walreceiver may have terminated before notifying the archiver of the
+     * last finished segment. Make sure that segment is archived
+     * immediately.
+     */
+    XLByteToSeg(startpoint, startsegno, wal_segment_size);
+    if (startsegno > 1)
+    {
+        char         xlogfname[MAXFNAMELEN];
+        TimeLineID    prevsegTLI;
+        XLogRecPtr    prevsegEndRecPtr;
+        List       *tles;
+
+        /*
+         * The previous segment may be in the previous timeline.  Track
+         * timelines to find the segment on the correct timeline.
+         */
+        tles = readTimeLineHistory(startpointTLI);
+        prevsegEndRecPtr =
+            startpoint - XLogSegmentOffset(startpoint, wal_segment_size) - 1;
+        prevsegTLI = tliOfPointInHistory(prevsegEndRecPtr, tles);
+        XLogFileName(xlogfname, prevsegTLI, startsegno - 1,
+                     wal_segment_size);
+        XLogArchiveCheckDone(xlogfname);
+    }
+
     first_stream = true;
     for (;;)
     {
