Re: [Bug fix]There is the case archive_timeout parameter isignored after recovery works.

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: [Bug fix]There is the case archive_timeout parameter isignored after recovery works.
Дата
Msg-id 20200629.164111.1368221830399585184.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на [Bug fix]There is the case archive_timeout parameter is ignored afterrecovery works.  ("higuchi.daisuke@fujitsu.com" <higuchi.daisuke@fujitsu.com>)
Ответы Re: [Bug fix]There is the case archive_timeout parameter isignored after recovery works.  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Re: [Bug fix]There is the case archive_timeout parameter is ignoredafter recovery works.  (Fujii Masao <masao.fujii@oss.nttdata.com>)
RE: [Bug fix]There is the case archive_timeout parameter is ignoredafter recovery works.  ("higuchi.daisuke@fujitsu.com" <higuchi.daisuke@fujitsu.com>)
Список pgsql-hackers
Hello.

At Mon, 29 Jun 2020 04:35:11 +0000, "higuchi.daisuke@fujitsu.com" <higuchi.daisuke@fujitsu.com> wrote in 
> Hi,
> 
> I found the bug about archive_timeout parameter.
> There is the case archive_timeout parameter is ignored after recovery works.
...
> [Problem]
> When the value of archive_timeout is smaller than that of checkpoint_timeout and recovery works, archive_timeout is
ignoredin the first WAL archiving.
 
> Once WAL is archived, the archive_timeout seems to be valid after that.
...
> Currently, cur_timeout is set according to only checkpoint_timeout when it is during recovery.
> Even during recovery, the cur_timeout should be calculated including archive_timeout as well as checkpoint_timeout, I
think.
> I attached the patch to solve this problem.

Unfortunately the diff command in your test script doesn't show me
anything, but I can understand what you are thinking is a problem,
maybe.  But the patch doesn't seem the fix for the issue.

Archiving works irrelevantly from that parameter. Completed WAL
segments are immediately marked as ".ready" and archiver does its task
immediately independently from checkpointer. The parameter, as
described in documentation, forces the server to switch to a new WAL
segment file periodically so that it can be archived, that is, it
works only on primary.  On the other hand on standby, a WAL segment is
not marked as ".ready" until any data for the *next* segment comes. So
the patch is not the fix for the issue.

If primary switched segment and archived it but standby didn't archive
the same immediately, you could force that by writing something on the
master.

Anyway, the attached patch would resolve your problem.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From fec10780e132c1d284c66355c5215c284c16204d Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Mon, 29 Jun 2020 16:12:01 +0900
Subject: [PATCH] Let complete segment archived immediately on standy

walreceiver marks a completed segment as ".ready" after any data for
the next segment comes. So standby can archive a WAL segment later
than the primary archives the same segment.  Let walreceiver archive a
segment as soon as it is completed.
---
 src/backend/replication/walreceiver.c | 77 +++++++++++++++------------
 1 file changed, 42 insertions(+), 35 deletions(-)

diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index d1ad75da87..06c1e3cbe4 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -902,44 +902,10 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
     {
         int            segbytes;
 
-        if (recvFile < 0 || !XLByteInSeg(recptr, recvSegNo, wal_segment_size))
+        if (recvFile < 0)
         {
             bool        use_existent;
 
-            /*
-             * fsync() and close current file before we switch to next one. We
-             * would otherwise have to reopen this file to fsync it later
-             */
-            if (recvFile >= 0)
-            {
-                char        xlogfname[MAXFNAMELEN];
-
-                XLogWalRcvFlush(false);
-
-                XLogFileName(xlogfname, recvFileTLI, recvSegNo, wal_segment_size);
-
-                /*
-                 * XLOG segment files will be re-read by recovery in startup
-                 * process soon, so we don't advise the OS to release cache
-                 * pages associated with the file like XLogFileClose() does.
-                 */
-                if (close(recvFile) != 0)
-                    ereport(PANIC,
-                            (errcode_for_file_access(),
-                             errmsg("could not close log segment %s: %m",
-                                    xlogfname)));
-
-                /*
-                 * Create .done file forcibly to prevent the streamed segment
-                 * from being archived later.
-                 */
-                if (XLogArchiveMode != ARCHIVE_MODE_ALWAYS)
-                    XLogArchiveForceDone(xlogfname);
-                else
-                    XLogArchiveNotify(xlogfname);
-            }
-            recvFile = -1;
-
             /* Create/use new log file */
             XLByteToSeg(recptr, recvSegNo, wal_segment_size);
             use_existent = true;
@@ -985,6 +951,47 @@ XLogWalRcvWrite(char *buf, Size nbytes, XLogRecPtr recptr)
         buf += byteswritten;
 
         LogstreamResult.Write = recptr;
+
+        /*
+         * Close the current WAL segment if it is completed then let the file
+         * be archived if needed.
+         */
+        if (!XLByteInSeg(recptr, recvSegNo, wal_segment_size))
+        {
+            char        xlogfname[MAXFNAMELEN];
+
+            Assert (recvFile >= 0);
+
+            /*
+             * fsync() and close current file before we switch to next one. We
+             * would otherwise have to reopen this file to fsync it later
+             */
+            XLogWalRcvFlush(false);
+
+            XLogFileName(xlogfname, recvFileTLI, recvSegNo, wal_segment_size);
+
+            /*
+             * XLOG segment files will be re-read by recovery in startup
+             * process soon, so we don't advise the OS to release cache
+             * pages associated with the file like XLogFileClose() does.
+             */
+            if (close(recvFile) != 0)
+                ereport(PANIC,
+                        (errcode_for_file_access(),
+                         errmsg("could not close log segment %s: %m",
+                                xlogfname)));
+
+            /*
+             * Create .done file forcibly to prevent the streamed segment
+             * from being archived later.
+             */
+            if (XLogArchiveMode != ARCHIVE_MODE_ALWAYS)
+                XLogArchiveForceDone(xlogfname);
+            else
+                XLogArchiveNotify(xlogfname);
+
+            recvFile = -1;
+        }
     }
 
     /* Update shared-memory status */
-- 
2.18.4


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: [PATCH] COPY command's data format option allows only lowercasecsv, text or binary
Следующее
От: "Andrey V. Lepikhov"
Дата:
Сообщение: Re: POC and rebased patch for CSN based snapshots