Re: [HACKERS] Restricting maximum keep segments by repslots

From: Kyotaro HORIGUCHI
Subject: Re: [HACKERS] Restricting maximum keep segments by repslots
Date:
Msg-id: 20180709.144706.258526585.horiguchi.kyotaro@lab.ntt.co.jp
In response to: Re: [HACKERS] Restricting maximum keep segments by repslots  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses: Re: [HACKERS] Restricting maximum keep segments by repslots  (Masahiko Sawada <sawada.mshk@gmail.com>)
List: pgsql-hackers
Hello, Sawada-san.

Thank you for the comments.

At Thu, 5 Jul 2018 15:43:56 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in
<CAD21AoDiiA4qHj0thqw80Bt=vefSQ9yGpZnr0kuLTXszbrV9iQ@mail.gmail.com>
> On Wed, Jul 4, 2018 at 5:28 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > Hello.
> >
> > At Tue, 26 Jun 2018 16:26:59 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20180626.162659.223208514.horiguchi.kyotaro@lab.ntt.co.jp>
> >> The previous patch files don't have version numbers, so I let
> >> the attached latest version be v2.
> >>
> >>
> >> v2-0001-Add-WAL-releaf-vent-for-replication-slots.patch
> >>   The body of the limiting feature
> >>
> >> v2-0002-Add-monitoring-aid-for-max_replication_slots.patch
> >>   Shows the status of WAL retention in the pg_replication_slots view
> >>
> >> v2-0003-TAP-test-for-the-slot-limit-feature.patch
> >>   TAP test for this feature
> >>
> >> v2-0004-Documentation-for-slot-limit-feature.patch
> >>   Documentation, as the name.
> >
> > Travis (test_decoding test) showed that GetOldestXLogFileSegNo
> > added by 0002 forgets to close temporarily opened pg_wal
> > directory. This is the fixed version v3.
> >
> 
> Thank you for updating the patch! I looked at the v3 patches. Here are my
> review comments.
> 
> ---
> +               {"max_slot_wal_keep_size", PGC_SIGHUP, REPLICATION_SENDING,
> +                       gettext_noop("Sets the maximum size of extra
> WALs kept by replication slots."),
> +                NULL,
> +                GUC_UNIT_MB
> +               },
> +               &max_slot_wal_keep_size_mb,
> +               0, 0, INT_MAX,
> +               NULL, NULL, NULL
> +       },
> 
> Maybe MAX_KILOBYTES would be better instead of INT_MAX like wal_max_size.

MAX_KILOBYTES is the maximum value of a size given in kB that still fits
into a long or Size/size_t variable after being converted into bytes.
Applying the limit there means assuming that the _mb variable is
converted into kB, not into bytes. So applying it to max/min_wal_size
seems somewhat wrong, but it does no harm since they are never actually
converted into bytes.

max_slot_wal_keep_size is not converted into bytes either, so capping it
with INT_MAX would be no problem. However, it doesn't need to be larger
than MAX_KILOBYTES, so I followed that in order to make it the same as
max/min_wal_size.
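
(For reference, guc.h defines the cap roughly as below -- quoted from
memory, the exact condition may differ. The point is that it only matters
for GUCs whose unit is kB and which are later multiplied out into bytes.)

    #if SIZEOF_SIZE_T > 4 && SIZEOF_LONG > 4
    #define MAX_KILOBYTES   INT_MAX
    #else
    #define MAX_KILOBYTES   (INT_MAX / 1024)
    #endif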

> ---
> Once the following WARNING emitted this message is emitted whenever we
> execute CHECKPOINT even if we don't change anything. Is that expected
> behavior? I think it would be better to emit this message only when we
> remove WAL segments that are required by slots.
> 
> WARNING:  some replication slots have lost required WAL segments
> DETAIL:  The mostly affected slot has lost 153 segments.

I hadn't considered the situation where the number of lost segments
doesn't change. I changed it to mute the message when the number of lost
segments has not changed.

> ---
> +       Assert(wal_keep_segments >= 0);
> +       Assert(max_slot_wal_keep_size_mb >= 0);
> 
> These assertions are not meaningful because these parameters are
> ensured >= 0 by those definition.

Yeah, that looks a bit paranoid. Removed.

> ---
> +    /* slots aren't useful, consider only wal_keep_segments */
> +    if (slotpos == InvalidXLogRecPtr)
> +    {
> 
> Perhaps XLogRecPtrIsInvalid(slotpos) is better.

Agreed. It is changed to "slotpos != InvalidXLogRecPtr" after reworking
the function per the comments below. I think the double negation
!XLogRecPtrIsInvalid() doesn't read well there.
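
(xlogdefs.h defines the macro as just a comparison, more or less the
following, so the direct test reads the same:)

    #define InvalidXLogRecPtr       0
    #define XLogRecPtrIsInvalid(r)  ((r) == InvalidXLogRecPtr)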

> ---
> +    if (slotpos != InvalidXLogRecPtr && currSeg <= slotSeg + wal_keep_segments)
> +        slotpos = InvalidXLogRecPtr;
> +
> +    /* slots aren't useful, consider only wal_keep_segments */
> +    if (slotpos == InvalidXLogRecPtr)
> +    {
> 
> This logic is hard to read to me. The slotpos can be any of: valid,
> valid but then become invalid in halfway or invalid from beginning of
> this function. Can we convert this logic to following?
> 
> if (XLogRecPtrIsInvalid(slotpos) ||
>     currSeg <= slotSeg + wal_keep_segments)

Right, but that code is removed in this version.

> ---
> +    keepSegs = wal_keep_segments +
> +        ConvertToXSegs(max_slot_wal_keep_size_mb, wal_segment_size);
> 
> Why do we need to keep (wal_keep_segment + max_slot_wal_keep_size) WAL
> segments? I think what this feature does is, if wal_keep_segments is
> not useful (that is, slotSeg < (currSeg - wal_keep_segment) then we
> normally choose slotSeg as lower boundary but max_slot_wal_keep_size
> restrict the lower boundary so that it doesn't get lower than the
> threshold. So I thought what this function should do is to calculate
> min(currSeg - wal_keep_segment, max(currSeg - max_slot_wal_keep_size,
> slotSeg)), I might be missing something though.

You're right that wal_keep_segments should not be added; it should
instead give the lower limit of segments to keep, as the current
KeepLogSeg() does. Fixed that.

Since the amount is specified in megabytes, silently rounding it down to
segment boundaries may not be proper in general, and this feature used to
use the fragments to show users something. But there is no longer a place
where the fragments are perceptible to users, and anyway the fragments
are far smaller than the expected total WAL size.

As a result, I removed the fragment calculation entirely, as you
suggested. It gets much smaller and simpler.
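
(To illustrate what gets dropped: ConvertToXSegs() is, roughly, the plain
integer division below, so with 16MB segments and max_slot_wal_keep_size =
40MB only two whole segments are counted and the remaining 8MB fragment is
simply ignored. Quoted from memory, the exact macro text may differ.)

    /* megabytes -> whole segments, rounding down */
    #define ConvertToXSegs(x, segsize)  \
        (x / ((segsize) / (1024 * 1024)))

    /* e.g. ConvertToXSegs(40, 16 * 1024 * 1024) == 40 / 16 == 2 */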

> ---
> +    SpinLockAcquire(&XLogCtl->info_lck);
> +    oldestSeg = XLogCtl->lastRemovedSegNo;
> +    SpinLockRelease(&XLogCtl->info_lck);
> 
> We can use XLogGetLastRemovedSegno() instead.

That is because I thought it was meant for external use, specifically by
slot.c, since CheckXLogRemoved() reads the variable directly. I left it
alone; they would have to be fixed at once if we decide to use it
internally as well.
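
(For the record, XLogGetLastRemovedSegno() is just the same spinlocked
read, something like this, so switching over later is trivial:)

    XLogSegNo
    XLogGetLastRemovedSegno(void)
    {
        XLogSegNo   lastRemovedSegNo;

        SpinLockAcquire(&XLogCtl->info_lck);
        lastRemovedSegNo = XLogCtl->lastRemovedSegNo;
        SpinLockRelease(&XLogCtl->info_lck);

        return lastRemovedSegNo;
    }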

> ---
> +    xldir = AllocateDir(XLOGDIR);
> +    if (xldir == NULL)
> +        ereport(ERROR,
> +                (errcode_for_file_access(),
> +                 errmsg("could not open write-ahead log directory \"%s\": %m",
> +                        XLOGDIR)));
> 
> Looking at other code allocating a directory we don't check xldir ==
> NULL because it will be detected by ReadDir() function and raise an
> error in that function. So maybe we don't need to check it just after
> allocation.

Thanks. I found that in the comment of ReadDir(). This doesn't need any
special error handling, so I leave it to ReadDir there.
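
(The pattern described in the ReadDir() comment is simply the following;
ReadDir() reports an AllocateDir() failure itself, using the directory
name passed to it, so the explicit NULL check is redundant:)

    dir = AllocateDir(path);
    while ((dirent = ReadDir(dir, path)) != NULL)
    {
        /* process dirent */
    }
    FreeDir(dir);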

In addition to that, the documentation is fixed.


Attached are the v4 files.



regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 7e80eb619a82d9fd0a4ed9c91186fe4c21016622 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 21 Dec 2017 21:20:20 +0900
Subject: [PATCH 1/4] Add WAL releaf vent for replication slots

Adds a capability to limit the number of WAL segments kept by
replication slots with a GUC variable.
---
 src/backend/access/transam/xlog.c             | 100 ++++++++++++++++++++------
 src/backend/utils/misc/guc.c                  |  12 ++++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/access/xlog.h                     |   1 +
 4 files changed, 94 insertions(+), 20 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 44017d33e4..c27a589e89 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -105,6 +105,7 @@ int            wal_level = WAL_LEVEL_MINIMAL;
 int            CommitDelay = 0;    /* precommit delay in microseconds */
 int            CommitSiblings = 5; /* # concurrent xacts needed to sleep */
 int            wal_retrieve_retry_interval = 5000;
+int            max_slot_wal_keep_size_mb = 0;
 
 #ifdef WAL_DEBUG
 bool        XLOG_DEBUG = false;
@@ -867,6 +868,7 @@ static void checkTimeLineSwitch(XLogRecPtr lsn, TimeLineID newTLI,
 static void LocalSetXLogInsertAllowed(void);
 static void CreateEndOfRecoveryRecord(void);
 static void CheckPointGuts(XLogRecPtr checkPointRedo, int flags);
+static XLogSegNo GetOldestKeepSegment(XLogRecPtr currpos, XLogRecPtr minSlotPtr);
 static void KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo);
 static XLogRecPtr XLogGetReplicationSlotMinimumLSN(void);
 
@@ -9424,6 +9426,51 @@ CreateRestartPoint(int flags)
     return true;
 }
 
+/*
+ * Returns the minimum segment number that the next checkpoint must leave,
+ * considering wal_keep_segments, replication slots and max_slot_wal_keep_size.
+ */
+static XLogSegNo
+GetOldestKeepSegment(XLogRecPtr currLSN, XLogRecPtr minSlotLSN)
+{
+    uint64        keepSegs = 0;
+    XLogSegNo    currSeg;
+    XLogSegNo    slotSeg;
+
+    XLByteToSeg(currLSN, currSeg, wal_segment_size);
+    XLByteToSeg(minSlotLSN, slotSeg, wal_segment_size);
+
+    /*
+     * Calculate keep segments by slots first. The second term of the
+     * condition is just a sanity check.
+     */
+    if (minSlotLSN != InvalidXLogRecPtr && slotSeg <= currSeg)
+        keepSegs = currSeg - slotSeg;
+
+    /*
+     * The segments to keep for slots are limited by max_slot_wal_keep_size;
+     * a fragment of a segment is ignored
+     */
+    if (max_slot_wal_keep_size_mb > 0)
+    {
+        uint64 limitSegs;
+
+        limitSegs = ConvertToXSegs(max_slot_wal_keep_size_mb, wal_segment_size);
+        if (limitSegs < keepSegs)
+            keepSegs = limitSegs;
+    }
+
+    /* but keep at least wal_keep_segments segments, if set */
+    if (wal_keep_segments > 0 && keepSegs < wal_keep_segments)
+        keepSegs = wal_keep_segments;
+
+    /* avoid underflow, don't go below 1 */
+    if (currSeg <= keepSegs)
+        return 1;
+
+    return currSeg - keepSegs;
+}
+
 /*
  * Retreat *logSegNo to the last segment that we need to retain because of
  * either wal_keep_segments or replication slots.
@@ -9436,33 +9483,46 @@ static void
 KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
 {
     XLogSegNo    segno;
-    XLogRecPtr    keep;
+    XLogRecPtr    slotminptr = InvalidXLogRecPtr;
+    XLogSegNo    minSegNo;
+    XLogSegNo    slotSegNo;
 
     XLByteToSeg(recptr, segno, wal_segment_size);
-    keep = XLogGetReplicationSlotMinimumLSN();
 
-    /* compute limit for wal_keep_segments first */
-    if (wal_keep_segments > 0)
+    if (max_replication_slots > 0)
+        slotminptr = XLogGetReplicationSlotMinimumLSN();
+
+    /*
+     * We should keep a certain number of WAL segments after this checkpoint.
+     */
+    minSegNo = GetOldestKeepSegment(recptr, slotminptr);
+
+    /*
+     * warn if the checkpoint flushes the segments required by replication
+     * slots.
+     */
+    if (!XLogRecPtrIsInvalid(slotminptr))
     {
-        /* avoid underflow, don't go below 1 */
-        if (segno <= wal_keep_segments)
-            segno = 1;
+        static XLogSegNo prev_lost_segs = 0;    /* avoid duplicate messages */
+
+        XLByteToSeg(slotminptr, slotSegNo, wal_segment_size);
+
+        if (slotSegNo < minSegNo)
+        {
+            XLogSegNo lost_segs = minSegNo - slotSegNo;
+            if (prev_lost_segs != lost_segs)
+                ereport(WARNING,
+                        (errmsg ("some replication slots have lost required WAL segments"),
+                         errdetail("The most affected slot has lost %ld segments.",
+                                   lost_segs)));
+            prev_lost_segs = lost_segs;
+        }
         else
-            segno = segno - wal_keep_segments;
+            prev_lost_segs = 0;
     }
 
-    /* then check whether slots limit removal further */
-    if (max_replication_slots > 0 && keep != InvalidXLogRecPtr)
-    {
-        XLogSegNo    slotSegNo;
-
-        XLByteToSeg(keep, slotSegNo, wal_segment_size);
-
-        if (slotSegNo <= 0)
-            segno = 1;
-        else if (slotSegNo < segno)
-            segno = slotSegNo;
-    }
+    if (minSegNo < segno)
+        segno = minSegNo;
 
     /* don't delete WAL segments newer than the calculated segment */
     if (segno < *logSegNo)
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 17292e04fe..01b8c8edec 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2530,6 +2530,18 @@ static struct config_int ConfigureNamesInt[] =
         NULL, NULL, NULL
     },
 
+    {
+        {"max_slot_wal_keep_size", PGC_SIGHUP, REPLICATION_SENDING,
+            gettext_noop("Sets the maximum size of extra WALs kept by replication slots."),
+         NULL,
+         GUC_UNIT_MB
+        },
+        &max_slot_wal_keep_size_mb,
+        0, 0,
+        MAX_KILOBYTES, /* XXX: This is in megabytes, like max/min_wal_size */
+        NULL, NULL, NULL
+    },
+
     {
         {"wal_sender_timeout", PGC_SIGHUP, REPLICATION_SENDING,
             gettext_noop("Sets the maximum time to wait for WAL replication."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 9e39baf466..0e605a1765 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -239,6 +239,7 @@
 #max_wal_senders = 10        # max number of walsender processes
                 # (change requires restart)
 #wal_keep_segments = 0        # in logfile segments; 0 disables
+#max_slot_wal_keep_size = 0    # in megabytes; 0 disables
 #wal_sender_timeout = 60s    # in milliseconds; 0 disables
 
 #max_replication_slots = 10    # max number of replication slots
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 421ba6d775..12cd0d1d10 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -98,6 +98,7 @@ extern int    wal_segment_size;
 extern int    min_wal_size_mb;
 extern int    max_wal_size_mb;
 extern int    wal_keep_segments;
+extern int    max_slot_wal_keep_size_mb;
 extern int    XLOGbuffers;
 extern int    XLogArchiveTimeout;
 extern int    wal_retrieve_retry_interval;
-- 
2.16.3

From 4e3a0ca744620c129c512b47e2656e697762c42f Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 21 Dec 2017 21:23:25 +0900
Subject: [PATCH 2/4] Add monitoring aid for max_replication_slots.

Adds two columns "wal_status" and "min_keep_lsn" to pg_replication_slots.
When max_slot_wal_keep_size is set, long-disconnected slots may lose
required WAL segments. The two columns show whether a slot can still
reconnect, or is about to lose required WAL segments, and the LSN back to
which the next checkpoint will retain WAL.
---
 contrib/test_decoding/expected/ddl.out |  4 +-
 src/backend/access/transam/xlog.c      | 89 ++++++++++++++++++++++++++++++++++
 src/backend/catalog/system_views.sql   |  4 +-
 src/backend/replication/slotfuncs.c    | 25 +++++++++-
 src/include/access/xlog.h              |  1 +
 src/include/catalog/pg_proc.dat        |  6 +--
 src/test/regress/expected/rules.out    |  6 ++-
 7 files changed, 126 insertions(+), 9 deletions(-)

diff --git a/contrib/test_decoding/expected/ddl.out b/contrib/test_decoding/expected/ddl.out
index b7c76469fc..276f7f6efd 100644
--- a/contrib/test_decoding/expected/ddl.out
+++ b/contrib/test_decoding/expected/ddl.out
@@ -706,7 +706,7 @@ SELECT pg_drop_replication_slot('regression_slot');
 
 /* check that the slot is gone */
 SELECT * FROM pg_replication_slots;
- slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn 
------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
+ slot_name | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn | wal_status | min_keep_lsn 
+-----------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------+------------+--------------
 (0 rows)
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index c27a589e89..577be4ecd9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -9426,6 +9426,95 @@ CreateRestartPoint(int flags)
     return true;
 }
 
+
+/*
+ * Returns the segment number of the oldest file in XLOG directory.
+ */
+static XLogSegNo
+GetOldestXLogFileSegNo(void)
+{
+    DIR        *xldir;
+    struct dirent *xlde;
+    XLogSegNo segno = 0;
+
+    xldir = AllocateDir(XLOGDIR);
+    while ((xlde = ReadDir(xldir, XLOGDIR)) != NULL)
+    {
+        TimeLineID tli;
+        XLogSegNo fsegno;
+
+        /* Ignore files that are not XLOG segments */
+        if (!IsXLogFileName(xlde->d_name) &&
+            !IsPartialXLogFileName(xlde->d_name))
+            continue;
+
+        XLogFromFileName(xlde->d_name, &tli, &fsegno, wal_segment_size);
+
+        /* get minimum segment ignoring timeline ID */
+        if (segno == 0 || fsegno < segno)
+            segno = fsegno;
+    }
+
+    FreeDir(xldir);
+
+    return segno;
+}
+
+/*
+ * Check if the record on the given restartLSN is present in XLOG files.
+ *
+ * Returns true if it is present. If minKeepLSN is given, it receives the
+ * LSN at the beginning of the oldest existing WAL segment.
+ */
+bool
+IsLsnStillAvaiable(XLogRecPtr restartLSN, XLogRecPtr *minKeepLSN)
+{
+    XLogRecPtr currpos;
+    XLogSegNo restartSeg;
+    XLogSegNo tailSeg;
+    XLogSegNo oldestSeg;
+
+    Assert(!XLogRecPtrIsInvalid(restartLSN));
+
+    currpos = GetXLogWriteRecPtr();
+
+    SpinLockAcquire(&XLogCtl->info_lck);
+    oldestSeg = XLogCtl->lastRemovedSegNo;
+    SpinLockRelease(&XLogCtl->info_lck);
+
+    /*
+     * oldestSeg is zero before at least one segment has been removed since
+     * startup. Use oldest segno taken from file names.
+     */
+    if (oldestSeg == 0)
+    {
+        static XLogSegNo oldestFileSeg = 0;
+
+        if (oldestFileSeg == 0)
+            oldestFileSeg = GetOldestXLogFileSegNo();
+        /* let it have the same meaning as lastRemovedSegNo here */
+        oldestSeg = oldestFileSeg - 1;
+    }
+
+    /* oldest segment is just after the last removed segment */
+    oldestSeg++;
+
+    XLByteToSeg(restartLSN, restartSeg, wal_segment_size);
+
+
+    if (minKeepLSN)
+    {
+        XLogRecPtr slotPtr = XLogGetReplicationSlotMinimumLSN();
+        Assert(!XLogRecPtrIsInvalid(slotPtr));
+
+        tailSeg = GetOldestKeepSegment(currpos, slotPtr);
+
+        XLogSegNoOffsetToRecPtr(tailSeg, 0, *minKeepLSN, wal_segment_size);
+    }
+
+    return    oldestSeg <= restartSeg;
+}
+
 /*
  * Returns the minimum segment number that the next checkpoint must leave,
  * considering wal_keep_segments, replication slots and max_slot_wal_keep_size.
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 8cd8bf40ac..1664a086e9 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -797,7 +797,9 @@ CREATE VIEW pg_replication_slots AS
             L.xmin,
             L.catalog_xmin,
             L.restart_lsn,
-            L.confirmed_flush_lsn
+            L.confirmed_flush_lsn,
+            L.wal_status,
+            L.min_keep_lsn
     FROM pg_get_replication_slots() AS L
             LEFT JOIN pg_database D ON (L.datoid = D.oid);
 
diff --git a/src/backend/replication/slotfuncs.c b/src/backend/replication/slotfuncs.c
index 2806e1076c..f13aa4d455 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -185,7 +185,7 @@ pg_drop_replication_slot(PG_FUNCTION_ARGS)
 Datum
 pg_get_replication_slots(PG_FUNCTION_ARGS)
 {
-#define PG_GET_REPLICATION_SLOTS_COLS 11
+#define PG_GET_REPLICATION_SLOTS_COLS 13
     ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
     TupleDesc    tupdesc;
     Tuplestorestate *tupstore;
@@ -307,6 +307,29 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
         else
             nulls[i++] = true;
 
+        if (restart_lsn == InvalidXLogRecPtr)
+        {
+            values[i++] = CStringGetTextDatum("unknown");
+            values[i++] = LSNGetDatum(InvalidXLogRecPtr);
+        }
+        else
+        {
+            XLogRecPtr    min_keep_lsn;
+            char *status = "lost";
+
+            if (BoolGetDatum(IsLsnStillAvaiable(restart_lsn,
+                                                &min_keep_lsn)))
+            {
+                if (min_keep_lsn <= restart_lsn)
+                    status = "streaming";
+                else
+                    status = "keeping";
+            }
+
+            values[i++] = CStringGetTextDatum(status);
+            values[i++] = LSNGetDatum(min_keep_lsn);
+        }
+
         tuplestore_putvalues(tupstore, tupdesc, values, nulls);
     }
     LWLockRelease(ReplicationSlotControlLock);
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 12cd0d1d10..52e64f392d 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -269,6 +269,7 @@ extern void ShutdownXLOG(int code, Datum arg);
 extern void InitXLOGAccess(void);
 extern void CreateCheckPoint(int flags);
 extern bool CreateRestartPoint(int flags);
+extern bool IsLsnStillAvaiable(XLogRecPtr restartLSN, XLogRecPtr *minSecureLSN);
 extern void XLogPutNextOid(Oid nextOid);
 extern XLogRecPtr XLogRestorePoint(const char *rpName);
 extern void UpdateFullPageWrites(void);
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 40d54ed030..ef8f0eab91 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -9796,9 +9796,9 @@
   proname => 'pg_get_replication_slots', prorows => '10', proisstrict => 'f',
   proretset => 't', provolatile => 's', prorettype => 'record',
   proargtypes => '',
-  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn}',
-  proargmodes => '{o,o,o,o,o,o,o,o,o,o,o}',
-  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn}',
+  proallargtypes => '{name,name,text,oid,bool,bool,int4,xid,xid,pg_lsn,pg_lsn,text,pg_lsn}',
+  proargmodes => '{o,o,o,o,o,o,o,o,o,o,o,o,o}',
+  proargnames => '{slot_name,plugin,slot_type,datoid,temporary,active,active_pid,xmin,catalog_xmin,restart_lsn,confirmed_flush_lsn,wal_status,min_keep_lsn}',
   prosrc => 'pg_get_replication_slots' },
 { oid => '3786', descr => 'set up a logical replication slot',
   proname => 'pg_create_logical_replication_slot', provolatile => 'v',
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index ae0cd253d5..93f6bff77e 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1451,8 +1451,10 @@ pg_replication_slots| SELECT l.slot_name,
     l.xmin,
     l.catalog_xmin,
     l.restart_lsn,
-    l.confirmed_flush_lsn
-   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn)
+    l.confirmed_flush_lsn,
+    l.wal_status,
+    l.min_keep_lsn
+   FROM (pg_get_replication_slots() l(slot_name, plugin, slot_type, datoid, temporary, active, active_pid, xmin, catalog_xmin, restart_lsn, confirmed_flush_lsn, wal_status, min_keep_lsn)
      LEFT JOIN pg_database d ON ((l.datoid = d.oid)));
 pg_roles| SELECT pg_authid.rolname,
     pg_authid.rolsuper,
-- 
2.16.3

From 6a2bd69eba11b5aa611c3194711e2039744a624c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 21 Dec 2017 17:33:53 +0900
Subject: [PATCH 3/4] TAP test for the slot limit feature

---
 src/test/recovery/t/016_replslot_limit.pl | 161 ++++++++++++++++++++++++++++++
 1 file changed, 161 insertions(+)
 create mode 100644 src/test/recovery/t/016_replslot_limit.pl

diff --git a/src/test/recovery/t/016_replslot_limit.pl b/src/test/recovery/t/016_replslot_limit.pl
new file mode 100644
index 0000000000..05a1113a67
--- /dev/null
+++ b/src/test/recovery/t/016_replslot_limit.pl
@@ -0,0 +1,161 @@
+# Test for replication slot limit
+# Ensure that max_slot_wal_keep_size limits the number of WAL files to
+# be kept by replication slots.
+
+use strict;
+use warnings;
+use File::Path qw(rmtree);
+use PostgresNode;
+use TestLib;
+use Test::More tests => 7;
+use Time::HiRes qw(usleep);
+
+$ENV{PGDATABASE} = 'postgres';
+
+# Initialize master node
+my $node_master = get_new_node('master');
+$node_master->init(allows_streaming => 1);
+$node_master->append_conf('postgresql.conf', qq(
+min_wal_size = 32MB
+max_wal_size = 48MB
+));
+$node_master->start;
+$node_master->safe_psql('postgres', "SELECT pg_create_physical_replication_slot('rep1')");
+
+
+# Take backup
+my $backup_name = 'my_backup';
+$node_master->backup($backup_name);
+
+# Create a standby linking to it using a replication slot
+my $node_standby = get_new_node('standby_1');
+$node_standby->init_from_backup($node_master, $backup_name, has_streaming => 1, primary_slot_name => 'rep1');
+$node_standby->append_conf('recovery.conf', qq(
+primary_slot_name = 'rep1'
+));
+$node_standby->start;
+
+# Wait until standby has replayed enough data on the standby
+my $start_lsn = $node_master->lsn('write');
+$node_master->wait_for_catchup($node_standby, 'replay', $start_lsn);
+
+# Stop standby
+$node_standby->stop;
+
+
+# Preparation done, currently the slot must be secured.
+my $result = $node_master->safe_psql('postgres', "SELECT restart_lsn, wal_status, min_keep_lsn FROM pg_replication_slots WHERE slot_name = 'rep1'");
+is($result, "$start_lsn|streaming|$start_lsn", 'check initial state of standby');
+
+# Advance WAL by ten segments (= 160MB) on master
+advance_wal($node_master, 10);
+
+# All segments still must be secured after a checkpoint.
+$result = $node_master->safe_psql('postgres', "SELECT restart_lsn, wal_status, min_keep_lsn FROM pg_replication_slots WHERE slot_name = 'rep1'");
+is($result, "$start_lsn|streaming|$start_lsn", 'check that slot is keeping all segments');
+
+# The standby can connect to the master
+$node_standby->start;
+
+$start_lsn = $node_master->lsn('write');
+$node_master->wait_for_catchup($node_standby, 'replay', $start_lsn);
+
+$node_standby->stop;
+
+
+# Advance WAL again
+advance_wal($node_master, 10);
+
+# Set max_slot_wal_keep_size on master
+my $max_slot_wal_keep_size_mb = 32;
+$node_master->append_conf('postgresql.conf', qq(
+max_slot_wal_keep_size = ${max_slot_wal_keep_size_mb}MB
+));
+$node_master->reload;
+
+# Some segments become 'insecure'
+$result = $node_master->safe_psql('postgres', "SELECT restart_lsn, wal_status FROM pg_replication_slots WHERE slot_name = 'rep1'");
+is($result, "$start_lsn|keeping", 'check that some segments are about to be removed');
+
+# The standby can still connect to the master
+$node_standby->start;
+
+$start_lsn = $node_master->lsn('write');
+$node_master->wait_for_catchup($node_standby, 'replay', $start_lsn);
+
+$node_standby->stop;
+
+ok(!find_in_log($node_standby,
+                "requested WAL segment [0-9A-F]+ has already been removed"),
+   'check that no replication failure is caused by insecure state');
+
+# Advance WAL again
+my $logstart = get_log_size($node_master);
+advance_wal($node_master, 10);
+
+# WARNING should be issued
+ok(find_in_log($node_master,
+               "some replication slots have lost required WAL segments",
+               $logstart),
+   'check that the warning is correctly logged');
+
+# This slot should be broken
+$result = $node_master->safe_psql('postgres', "SELECT restart_lsn, wal_status FROM pg_replication_slots WHERE slot_name = 'rep1'");
+is($result, "$start_lsn|lost", 'check that overflown segments have been removed');
+
+# The standby can no longer connect to the master
+$logstart = get_log_size($node_standby);
+$node_standby->start;
+
+my $failed = 0;
+for (my $i = 0 ; $i < 10000 ; $i++)
+{
+    if (find_in_log($node_standby,
+                    "requested WAL segment [0-9A-F]+ has already been removed",
+                    $logstart))
+    {
+        $failed = 1;
+        last;
+    }
+    usleep(100_000);
+}
+ok($failed, 'check replication has been broken');
+
+$node_standby->stop;
+
+#####################################
+# Advance WAL of $node by $n segments
+sub advance_wal
+{
+    my ($node, $n) = @_;
+
+    # Advance by $n segments (= (16 * $n) MB) on master
+    for (my $i = 0 ; $i < $n ; $i++)
+    {
+        $node->safe_psql('postgres', "CREATE TABLE t (a int); DROP TABLE t; SELECT pg_switch_wal();");
+    }
+
+    $node->safe_psql('postgres', "CHECKPOINT;");
+}
+
+# return the size of logfile of $node in bytes
+sub get_log_size
+{
+    my ($node) = @_;
+
+    return (stat $node->logfile)[7];
+}
+
+# find $pat in logfile of $node after $off-th byte
+sub find_in_log
+{
+    my ($node, $pat, $off) = @_;
+
+    $off = 0 unless defined $off;
+    my $log = TestLib::slurp_file($node->logfile);
+    return 0 if (length($log) <= $off);
+
+    $log = substr($log, $off);
+
+    return $log =~ m/$pat/;
+}
-- 
2.16.3

From 04687424be04ca31de4ddd8707d159cb25a7e380 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 11 Jan 2018 15:00:32 +0900
Subject: [PATCH 4/4] Documentation for slot-limit feature

---
 doc/src/sgml/catalogs.sgml          | 29 +++++++++++++++++++++++++++++
 doc/src/sgml/config.sgml            | 22 ++++++++++++++++++++++
 doc/src/sgml/high-availability.sgml | 14 ++++++++------
 3 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 3ed9021c2f..3d5bc666b9 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -9881,6 +9881,35 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
       </entry>
      </row>
 
+     <row>
+      <entry><structfield>wal_status</structfield></entry>
+      <entry><type>text</type></entry>
+      <entry></entry>
+
+      <entry>Availability of WAL records claimed by the
+      slot. <literal>streaming</literal>, <literal>keeping</literal>,
+      <literal>lost</literal>
+      or <literal>unknown</literal>. <literal>streaming</literal> means that
+      the claimed records are available. <literal>keeping</literal> means that
+      some of them are to be removed by the next checkpoint.
+      <literal>lost</literal> means that some of them have been removed. The
+      last two states are seen only when
+      <xref linkend="guc-max-slot-wal-keep-size"/> is not zero. If the slot
+      doesn't have a valid restart_lsn, this field
+      is <literal>unknown</literal>.
+      </entry>
+     </row>
+
+     <row>
+      <entry><structfield>min_keep_lsn</structfield></entry>
+      <entry><type>pg_lsn</type></entry>
+      <entry></entry>
+      <entry>The address (<literal>LSN</literal>) back to which WAL is still
+      available to the replication slot. The user of the slot can no longer continue
+      streaming if this exceeds restart_lsn.
+      </entry>
+     </row>
+
     </tbody>
    </tgroup>
   </table>
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5b913f00c1..33a623729f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3116,6 +3116,28 @@ include_dir 'conf.d'
        </listitem>
       </varlistentry>
 
+      <varlistentry id="guc-max-slot-wal-keep-size" xreflabel="max_slot_wal_keep_size">
+       <term><varname>max_slot_wal_keep_size</varname> (<type>integer</type>)
+       <indexterm>
+        <primary><varname>max_slot_wal_keep_size</varname> configuration parameter</primary>
+       </indexterm>
+       </term>
+       <listitem>
+       <para>
+        Specify the maximum size of WAL files
+        that <link linkend="streaming-replication-slots">replication
+        slots</link> are allowed to retain in the <filename>pg_wal</filename>
+        directory at checkpoint time.
+        If <varname>max_slot_wal_keep_size</varname> is zero (the default),
+        replication slots may retain an unlimited amount of WAL files.
+       </para>
+       <para>
+        This parameter is used after being rounded down to a multiple of the
+        WAL file size.
+       </para>
+       </listitem>
+      </varlistentry>
+
      <varlistentry id="guc-wal-sender-timeout" xreflabel="wal_sender_timeout">
       <term><varname>wal_sender_timeout</varname> (<type>integer</type>)
       <indexterm>
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 934eb9052d..50ebb23c23 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -927,9 +927,11 @@ primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
     <xref linkend="guc-archive-command"/>.
     However, these methods often result in retaining more WAL segments than
     required, whereas replication slots retain only the number of segments
-    known to be needed.  An advantage of these methods is that they bound
-    the space requirement for <literal>pg_wal</literal>; there is currently no way
-    to do this using replication slots.
+    known to be needed.  On the other hand, replication slots can retain so
+    many WAL segments that they fill up the space allotted
+    for <literal>pg_wal</literal>;
+    <xref linkend="guc-max-slot-wal-keep-size"/> limits the size of WAL files
+    retained by replication slots.
    </para>
    <para>
     Similarly, <xref linkend="guc-hot-standby-feedback"/>
@@ -967,9 +969,9 @@ postgres=# SELECT * FROM pg_create_physical_replication_slot('node_a_slot');
  node_a_slot |
 
 postgres=# SELECT * FROM pg_replication_slots;
-  slot_name  | slot_type | datoid | database | active | xmin | restart_lsn | confirmed_flush_lsn
--------------+-----------+--------+----------+--------+------+-------------+---------------------
- node_a_slot | physical  |        |          | f      |      |             |
+  slot_name  | slot_type | datoid | database | active | xmin | restart_lsn | confirmed_flush_lsn | wal_status | min_keep_lsn 
+-------------+-----------+--------+----------+--------+------+-------------+---------------------+------------+--------------
+ node_a_slot | physical  |        |          | f      |      |             |                     | unknown    | 0/1000000
 (1 row)
 </programlisting>
      To configure the standby to use this slot, <varname>primary_slot_name</varname>
-- 
2.16.3

