Re: Protect syscache from bloating with negative cache entries

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: Protect syscache from bloating with negative cache entries
Дата
Msg-id 20190329.172440.199616830.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на RE: Protect syscache from bloating with negative cache entries  ("Ideriha, Takeshi" <ideriha.takeshi@jp.fujitsu.com>)
Ответы Re: Protect syscache from bloating with negative cache entries
Список pgsql-hackers
Hello. Sorry for being late a bit.

At Wed, 27 Mar 2019 17:30:37 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20190327.173037.40342566.horiguchi.kyotaro@lab.ntt.co.jp>
> > I don't see much point in continuing to review this patch at this
> > point.  There's been no new version of the patch in 3 weeks, and there
> > is -- in my view at least -- a rather frustrating lack of evidence
> > that the complexity this patch introduces is actually beneficial.  No
> > matter how many people +1 the idea of making this more complicated, it
> > can't be justified unless you can provide a test result showing that
> > the additional complexity solves a problem that does not get solved
> > without that complexity.  And even then, who is going to commit a
> > patch that uses a design which Tom Lane says was tried before and
> > stunk?
> 
> Hmm. Anyway it is hit by recent commit. I'll post a rebased
> version and a version reverted to do hole-scan. Then I'll take
> numbers as far as I can and will show the result.. tomorrow.

I took performance numbers for master and three versions of the
patch. Master, LRU, full-scan, modified full-scan. I noticed that
useless scan can be skipped in full-scan version so I added the
last versoin.

I ran three artificial test cases. The database is created by
gen_tbl.pl. Numbers are the average of the fastest five runs in
successive 15 runs.

Test cases are listed below.

1_0. About 3,000,000 negative entries are created in pg_statstic
  cache by scanning that many distinct columns. It is 3000 tables
  * 1001 columns. Pruning scans happen several times while a run
  but no entries are removed. This emulates the bloating phase of
  cache. catalog_cache_prune_min_age is default (300s).
  (access_tbl1.pl)

1_1. Same to 1_0 except that catalog_cache_prune_min_age is 0,
  which means turning off.

2_0. Repeatedly access 1001 of the 3,000,000 entries 6000
  times. This emulates the stable cache case without having
  pruning. catalog_cache_prune_min_age is default (300s).
 (access_tbl2.pl)

2_1. Same to 2_0 except that catalog_cache_prune_min_age is 0,
  which means turning off.

3_0. Scan over the 3,000,000 entries twice with setting prune_age
  to 10s. A run takes about 18 seconds on my box so fair amount
  of old entries are removed. This emulates the stable case with
  continuous pruning. (access_tbl3.pl)

2_1. Same to 3_0 except that catalog_cache_prune_min_age is 0,
  which means turning off.


The result follows.

     | master |  LRU   |  Full  |Full-mod|
-----|--------+--------+--------+--------+
 1_0 | 17.287 | 17.370 | 17.255 | 16.623 |
 1_1 | 17.287 | 17.063 | 16.336 | 17.192 |
 2_0 | 15.695 | 18.769 | 18.563 | 15.527 |
 2_1 | 15.695 | 18.603 | 18.498 | 18.487 |
 3_0 | 26.576 | 33.817 | 34.384 | 34.971 |
 3_1 | 26.576 | 27.462 | 26.202 | 26.368 |

The result of 2_0 and 2_1 seems strange, but I show you the
numbers at the present.

- Full-scan seems to have the smallest impact when turned off.

- Full-scan-mod seems to perform best in total. (as far as
  Full-mod-2_0 is wrong value..)

- LRU doesn't seem to outperform full scanning.

For your information I measured how long pruning takes time.

LRU        318318 out of 2097153 entries in 26ms:  0.08us/entry.
Full-scan  443443 out of 2097153 entreis in 184ms. 0.4us/entry.

LRU is actually fast to remove entries but the difference seems
to be canceled by the complexity of LRU maintenance.

As my conclusion, we should go with the Full-scan or
Full-scan-mod version. I conduct a further overnight test and
will see which is better.

I attached the test script set. It is used in the folling manner.

(start server)
# perl gen_tbl.pl | psql postgres
(stop server)
# sh run.sh 30 > log.txt   # 30 is repeat count
# perl process.pl
     | master |  LRU   |  Full  |Full-mod|
-----|--------+--------+--------+--------+
 1_0 | 16.711 | 17.647 | 16.767 | 17.256 |
...


The attached files are follow.

LRU versions patches.
  LRU-0001-Add-dlist_move_tail.patch
  LRU-0002-Remove-entries-that-haven-t-been-used-for-a-certain-.patch

Fullscn version patch.
  FullScan-0001-Remove-entries-that-haven-t-been-used-for-a-certain-.patch

Fullscn-mod version patch.
  FullScan-mod-0001-Remove-entries-that-haven-t-been-used-for-a-certain-.patch

test scripts.
  test_script.tar.gz


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From 1c397d118a65d6b76282cc904c43ecfe97ee5329 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Thu, 7 Feb 2019 14:56:07 +0900
Subject: [PATCH 1/2] Add dlist_move_tail

We have dlist_push_head/tail and dlist_move_head but not
dlist_move_tail. Add it.
---
 src/include/lib/ilist.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/include/lib/ilist.h b/src/include/lib/ilist.h
index b1a5974ee4..659ab1ac87 100644
--- a/src/include/lib/ilist.h
+++ b/src/include/lib/ilist.h
@@ -394,6 +394,25 @@ dlist_move_head(dlist_head *head, dlist_node *node)
     dlist_check(head);
 }
 
+/*
+ * Move element from its current position in the list to the tail position in
+ * the same list.
+ *
+ * Undefined behaviour if 'node' is not already part of the list.
+ */
+static inline void
+dlist_move_tail(dlist_head *head, dlist_node *node)
+{
+    /* fast path if it's already at the tail */
+    if (head->head.prev == node)
+        return;
+
+    dlist_delete(node);
+    dlist_push_tail(head, node);
+
+    dlist_check(head);
+}
+
 /*
  * Check whether 'node' has a following node.
  * Caution: unreliable if 'node' is not in the list.
-- 
2.16.3

From f7a132c2b4910908773c508ef356a07cc853fe79 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 1 Mar 2019 13:32:51 +0900
Subject: [PATCH 2/2] Remove entries that haven't been used for a certain time

Catcache entries happen to be left alone for several reasons. It is
not desirable that such useless entries eat up memory. Catcache
pruning feature removes entries that haven't been accessed for a
certain time before enlarging hash array.
---
 doc/src/sgml/config.sgml                      |  19 ++++
 src/backend/tcop/postgres.c                   |   2 +
 src/backend/utils/cache/catcache.c            | 122 +++++++++++++++++++++++++-
 src/backend/utils/misc/guc.c                  |  12 +++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/utils/catcache.h                  |  18 ++++
 6 files changed, 171 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d383de2512..4231235447 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1677,6 +1677,25 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-catalog-cache-prune-min-age" xreflabel="catalog_cache_prune_min_age">
+      <term><varname>catalog_cache_prune_min_age</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>catalog_cache_prune_min_age</varname> configuration
+       parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+         Specifies the minimum amount of unused time in seconds at which a
+         system catalog cache entry is removed. -1 indicates that this feature
+         is disabled at all. The value defaults to 300 seconds (<literal>5
+         minutes</literal>). The entries that are not used for the duration
+         can be removed to prevent catalog cache from bloating with useless
+         entries.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-max-stack-depth" xreflabel="max_stack_depth">
       <term><varname>max_stack_depth</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f9ce3d8f22..acab473d34 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -71,6 +71,7 @@
 #include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
 #include "tcop/utility.h"
+#include "utils/catcache.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -2575,6 +2576,7 @@ start_xact_command(void)
      * not desired, the timeout has to be disabled explicitly.
      */
     enable_statement_timeout();
+    SetCatCacheClock(GetCurrentStatementStartTimestamp());
 }
 
 static void
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index d05930bc4c..c8ee0c98fb 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -38,6 +38,7 @@
 #include "utils/rel.h"
 #include "utils/resowner_private.h"
 #include "utils/syscache.h"
+#include "utils/timeout.h"
 
 
  /* #define CACHEDEBUG */    /* turns DEBUG elogs on */
@@ -60,9 +61,24 @@
 #define CACHE_elog(...)
 #endif
 
+/*
+ * GUC variable to define the minimum age of entries that will be considered
+ * to be evicted in seconds. -1 to disable the feature.
+ */
+int catalog_cache_prune_min_age = 300;
+
+/*
+ * Minimum interval between two successive moves of a cache entry in LRU list,
+ * in microseconds.
+ */
+#define MIN_LRU_UPDATE_INTERVAL 100000    /* 100ms */
+
 /* Cache management header --- pointer is NULL until created */
 static CatCacheHeader *CacheHdr = NULL;
 
+/* Clock for the last accessed time of a catcache entry. */
+TimestampTz    catcacheclock = 0;
+
 static inline HeapTuple SearchCatCacheInternal(CatCache *cache,
                        int nkeys,
                        Datum v1, Datum v2,
@@ -473,6 +489,7 @@ CatCacheRemoveCTup(CatCache *cache, CatCTup *ct)
 
     /* delink from linked list */
     dlist_delete(&ct->cache_elem);
+    dlist_delete(&ct->lru_node);
 
     /*
      * Free keys when we're dealing with a negative entry, normal entries just
@@ -833,6 +850,7 @@ InitCatCache(int id,
     cp->cc_nkeys = nkeys;
     for (i = 0; i < nkeys; ++i)
         cp->cc_keyno[i] = key[i];
+    dlist_init(&cp->cc_lru_list);
 
     /*
      * new cache is initialized as far as we can go for now. print some
@@ -850,9 +868,83 @@ InitCatCache(int id,
      */
     MemoryContextSwitchTo(oldcxt);
 
+    /* initialize catcache reference clock if haven't done yet */
+    if (catcacheclock == 0)
+        catcacheclock = GetCurrentTimestamp();
+
     return cp;
 }
 
+/*
+ * CatCacheCleanupOldEntries - Remove infrequently-used entries
+ *
+ * Catcache entries happen to be left unused for a long time for several
+ * reasons. Remove such entries to prevent catcache from bloating. It is based
+ * on the similar algorithm with buffer eviction. Entries that are accessed
+ * several times in a certain period live longer than those that have had less
+ * access in the same duration.
+ */
+static bool
+CatCacheCleanupOldEntries(CatCache *cp)
+{
+    int            nremoved = 0;
+    dlist_mutable_iter    iter;
+
+    /* Return immediately if disabled */
+    if (catalog_cache_prune_min_age == 0)
+        return false;
+
+    /* Scan over LRU to find entries to remove */
+    dlist_foreach_modify(iter, &cp->cc_lru_list)
+    {
+        CatCTup    *ct = dlist_container(CatCTup, lru_node, iter.cur);
+        long        entry_age;
+        int            us;
+
+        /* Don't remove referenced entries */
+        if (ct->refcount != 0 ||
+            (ct->c_list && ct->c_list->refcount != 0))
+            continue;
+
+        /*
+         * Calculate the duration from the time from the last access to
+         * the "current" time. catcacheclock is updated per-statement
+         * basis.
+         */
+        TimestampDifference(ct->lastaccess, catcacheclock, &entry_age, &us);
+
+        if (entry_age < catalog_cache_prune_min_age)
+        {
+            /*
+             * We don't have older entries, exit.  At least one removal
+             * prevents rehashing this time.
+             */
+            break;
+        }
+
+        /*
+         * Entries that are not accessed after the last pruning are removed in
+         * that seconds, and their lives are prolonged according to how many
+         * times they are accessed up to three times of the duration. We don't
+         * try shrink buckets since pruning effectively caps catcache
+         * expansion in the long term.
+         */
+        if (ct->naccess > 0)
+            ct->naccess--;
+        else
+        {
+            CatCacheRemoveCTup(cp, ct);
+            nremoved++;
+        }
+    }
+
+    if (nremoved > 0)
+        elog(DEBUG1, "pruning catalog cache id=%d for %s: removed %d / %d",
+             cp->id, cp->cc_relname, nremoved, cp->cc_ntup + nremoved);
+
+    return nremoved > 0;
+}
+
 /*
  * Enlarge a catcache, doubling the number of buckets.
  */
@@ -1264,6 +1356,20 @@ SearchCatCacheInternal(CatCache *cache,
          */
         dlist_move_head(bucket, &ct->cache_elem);
 
+        /* prolong life of this entry */
+        if (ct->naccess < 2)
+            ct->naccess++;
+
+        /*
+         * Don't update LRU too frequently. We need to maintain the LRU even
+         * if pruning is inactive since it can be turned on on-session.
+         */
+        if (catcacheclock - ct->lastaccess > MIN_LRU_UPDATE_INTERVAL)
+        {
+            ct->lastaccess = catcacheclock;
+            dlist_move_tail(&cache->cc_lru_list, &ct->lru_node);
+        }
+
         /*
          * If it's a positive entry, bump its refcount and return it. If it's
          * negative, we can report failure to the caller.
@@ -1888,19 +1994,29 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     ct->dead = false;
     ct->negative = negative;
     ct->hash_value = hashValue;
+    ct->naccess = 0;
+    ct->lastaccess = catcacheclock;
+    dlist_push_tail(&cache->cc_lru_list, &ct->lru_node);
 
     dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
 
     cache->cc_ntup++;
     CacheHdr->ch_ntup++;
 
+    /* increase refcount so that the new entry survives pruning */
+    ct->refcount++;
+
     /*
-     * If the hash table has become too full, enlarge the buckets array. Quite
-     * arbitrarily, we enlarge when fill factor > 2.
+     * If the hash table has become too full, try removing infrequently used
+     * entries to make a room for the new entry. If failed, enlarge the bucket
+     * array instead.  Quite arbitrarily, we try this when fill factor > 2.
      */
-    if (cache->cc_ntup > cache->cc_nbuckets * 2)
+    if (cache->cc_ntup > cache->cc_nbuckets * 2 &&
+        !CatCacheCleanupOldEntries(cache))
         RehashCatCache(cache);
 
+    ct->refcount--;
+
     return ct;
 }
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index aa564d153a..e624c74bf9 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -82,6 +82,7 @@
 #include "tsearch/ts_cache.h"
 #include "utils/builtins.h"
 #include "utils/bytea.h"
+#include "utils/catcache.h"
 #include "utils/guc_tables.h"
 #include "utils/float.h"
 #include "utils/memutils.h"
@@ -2202,6 +2203,17 @@ static struct config_int ConfigureNamesInt[] =
         NULL, NULL, NULL
     },
 
+    {
+        {"catalog_cache_prune_min_age", PGC_USERSET, RESOURCES_MEM,
+            gettext_noop("System catalog cache entries that live unused for longer than this seconds are considered
forremoval."),
 
+            gettext_noop("The value of -1 turns off pruning."),
+            GUC_UNIT_S
+        },
+        &catalog_cache_prune_min_age,
+        300, -1, INT_MAX,
+        NULL, NULL, NULL
+    },
+
     /*
      * We use the hopefully-safely-small value of 100kB as the compiled-in
      * default for max_stack_depth.  InitializeGUCOptions will increase it if
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cccb5f145a..fa117f0573 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -128,6 +128,7 @@
 #work_mem = 4MB                # min 64kB
 #maintenance_work_mem = 64MB        # min 1MB
 #autovacuum_work_mem = -1        # min 1MB, or -1 to use maintenance_work_mem
+#catalog_cache_prune_min_age = 300s    # -1 disables pruning
 #max_stack_depth = 2MB            # min 100kB
 #shared_memory_type = mmap        # the default is the first option
                     # supported by the operating system:
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index 65d816a583..a21c53644a 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -22,6 +22,7 @@
 
 #include "access/htup.h"
 #include "access/skey.h"
+#include "datatype/timestamp.h"
 #include "lib/ilist.h"
 #include "utils/relcache.h"
 
@@ -61,6 +62,7 @@ typedef struct catcache
     slist_node    cc_next;        /* list link */
     ScanKeyData cc_skey[CATCACHE_MAXKEYS];    /* precomputed key info for heap
                                              * scans */
+    dlist_head    cc_lru_list;
 
     /*
      * Keep these at the end, so that compiling catcache.c with CATCACHE_STATS
@@ -119,6 +121,9 @@ typedef struct catctup
     bool        dead;            /* dead but not yet removed? */
     bool        negative;        /* negative cache entry? */
     HeapTupleData tuple;        /* tuple management header */
+    int            naccess;        /* # of access to this entry, up to 2  */
+    TimestampTz    lastaccess;        /* timestamp of the last usage */
+    dlist_node    lru_node;        /* LRU node */
 
     /*
      * The tuple may also be a member of at most one CatCList.  (If a single
@@ -189,6 +194,19 @@ typedef struct catcacheheader
 /* this extern duplicates utils/memutils.h... */
 extern PGDLLIMPORT MemoryContext CacheMemoryContext;
 
+/* for guc.c, not PGDLLPMPORT'ed */
+extern int catalog_cache_prune_min_age;
+
+/* source clock for access timestamp of catcache entries */
+extern TimestampTz catcacheclock;
+
+/* SetCatCacheClock - set catcache timestamp source clodk */
+static inline void
+SetCatCacheClock(TimestampTz ts)
+{
+    catcacheclock = ts;
+}
+
 extern void CreateCacheMemoryContext(void);
 
 extern CatCache *InitCatCache(int id, Oid reloid, Oid indexoid,
-- 
2.16.3

From 0a6691078f8af5f35cca194137768af7d08fa1d8 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 1 Mar 2019 13:32:51 +0900
Subject: [PATCH] Remove entries that haven't been used for a certain time

Catcache entries happen to be left alone for several reasons. It is
not desirable that such useless entries eat up memory. Catcache
pruning feature removes entries that haven't been accessed for a
certain time before enlarging hash array.
---
 doc/src/sgml/config.sgml                      |  19 +++++
 src/backend/tcop/postgres.c                   |   2 +
 src/backend/utils/cache/catcache.c            | 105 +++++++++++++++++++++++++-
 src/backend/utils/misc/guc.c                  |  12 +++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/utils/catcache.h                  |  16 ++++
 6 files changed, 152 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d383de2512..4231235447 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1677,6 +1677,25 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-catalog-cache-prune-min-age" xreflabel="catalog_cache_prune_min_age">
+      <term><varname>catalog_cache_prune_min_age</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>catalog_cache_prune_min_age</varname> configuration
+       parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+         Specifies the minimum amount of unused time in seconds at which a
+         system catalog cache entry is removed. -1 indicates that this feature
+         is disabled at all. The value defaults to 300 seconds (<literal>5
+         minutes</literal>). The entries that are not used for the duration
+         can be removed to prevent catalog cache from bloating with useless
+         entries.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-max-stack-depth" xreflabel="max_stack_depth">
       <term><varname>max_stack_depth</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f9ce3d8f22..acab473d34 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -71,6 +71,7 @@
 #include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
 #include "tcop/utility.h"
+#include "utils/catcache.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -2575,6 +2576,7 @@ start_xact_command(void)
      * not desired, the timeout has to be disabled explicitly.
      */
     enable_statement_timeout();
+    SetCatCacheClock(GetCurrentStatementStartTimestamp());
 }
 
 static void
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index d05930bc4c..52586bd415 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -38,6 +38,7 @@
 #include "utils/rel.h"
 #include "utils/resowner_private.h"
 #include "utils/syscache.h"
+#include "utils/timeout.h"
 
 
  /* #define CACHEDEBUG */    /* turns DEBUG elogs on */
@@ -60,9 +61,18 @@
 #define CACHE_elog(...)
 #endif
 
+/*
+ * GUC variable to define the minimum age of entries that will be considered
+ * to be evicted in seconds. -1 to disable the feature.
+ */
+int catalog_cache_prune_min_age = 300;
+
 /* Cache management header --- pointer is NULL until created */
 static CatCacheHeader *CacheHdr = NULL;
 
+/* Clock for the last accessed time of a catcache entry. */
+TimestampTz    catcacheclock = 0;
+
 static inline HeapTuple SearchCatCacheInternal(CatCache *cache,
                        int nkeys,
                        Datum v1, Datum v2,
@@ -850,9 +860,83 @@ InitCatCache(int id,
      */
     MemoryContextSwitchTo(oldcxt);
 
+    /* initialize catcache reference clock if haven't done yet */
+    if (catcacheclock == 0)
+        catcacheclock = GetCurrentTimestamp();
+
     return cp;
 }
 
+/*
+ * CatCacheCleanupOldEntries - Remove infrequently-used entries
+ *
+ * Catcache entries happen to be left unused for a long time for several
+ * reasons. Remove such entries to prevent catcache from bloating. It is based
+ * on the similar algorithm with buffer eviction. Entries that are accessed
+ * several times in a certain period live longer than those that have had less
+ * access in the same duration.
+ */
+static bool
+CatCacheCleanupOldEntries(CatCache *cp)
+{
+    int    nremoved = 0;
+    int i;
+
+    /* Return immediately if disabled */
+    if (catalog_cache_prune_min_age == 0)
+        return false;
+
+    /* Scan over the whole hash to find entries to remove */
+    for (i = 0 ; i < cp->cc_nbuckets ; i++)
+    {
+        dlist_mutable_iter    iter;
+
+        dlist_foreach_modify(iter, &cp->cc_bucket[i])
+        {
+            CatCTup    *ct = dlist_container(CatCTup, cache_elem, iter.cur);
+            long        entry_age;
+            int            us;
+
+            /* Don't remove referenced entries */
+            if (ct->refcount != 0 ||
+                (ct->c_list && ct->c_list->refcount != 0))
+                continue;
+
+            /*
+             * Calculate the duration from the time from the last access to
+             * the "current" time. catcacheclock is updated per-statement
+             * basis and additionaly udpated periodically during a long
+             * running query.
+             */
+            TimestampDifference(ct->lastaccess, catcacheclock, &entry_age, &us);
+
+            if (entry_age < catalog_cache_prune_min_age)
+                continue;
+
+            /*
+             * Entries that are not accessed after the last pruning are
+             * removed in that seconds, and their lives are prolonged
+             * according to how many times they are accessed up to three times
+             * of the duration. We don't try shrink buckets since pruning
+             * effectively caps catcache expansion in the long term.
+             */
+            if (ct->naccess > 0)
+                ct->naccess--;
+            else
+            {
+                CatCacheRemoveCTup(cp, ct);
+                nremoved++;
+            }
+        }
+    }
+
+    if (nremoved > 0)
+        elog(DEBUG1, "pruning catalog cache id=%d for %s: removed %d / %d",
+             cp->id, cp->cc_relname, nremoved, cp->cc_ntup + nremoved);
+
+    return nremoved > 0;
+}
+
 /*
  * Enlarge a catcache, doubling the number of buckets.
  */
@@ -1264,6 +1348,12 @@ SearchCatCacheInternal(CatCache *cache,
          */
         dlist_move_head(bucket, &ct->cache_elem);
 
+        /* prolong life of this entry */
+        if (ct->naccess < 2)
+            ct->naccess++;
+
+        ct->lastaccess = catcacheclock;
+
         /*
          * If it's a positive entry, bump its refcount and return it. If it's
          * negative, we can report failure to the caller.
@@ -1888,19 +1978,28 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     ct->dead = false;
     ct->negative = negative;
     ct->hash_value = hashValue;
+    ct->naccess = 0;
+    ct->lastaccess = catcacheclock;
 
     dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
 
     cache->cc_ntup++;
     CacheHdr->ch_ntup++;
 
+    /* increase refcount so that the new entry survives pruning */
+    ct->refcount++;
+
     /*
-     * If the hash table has become too full, enlarge the buckets array. Quite
-     * arbitrarily, we enlarge when fill factor > 2.
+     * If the hash table has become too full, try removing infrequently used
+     * entries to make a room for the new entry. If failed, enlarge the bucket
+     * array instead.  Quite arbitrarily, we try this when fill factor > 2.
      */
-    if (cache->cc_ntup > cache->cc_nbuckets * 2)
+    if (cache->cc_ntup > cache->cc_nbuckets * 2 &&
+        !CatCacheCleanupOldEntries(cache))
         RehashCatCache(cache);
 
+    ct->refcount--;
+
     return ct;
 }
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index aa564d153a..e624c74bf9 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -82,6 +82,7 @@
 #include "tsearch/ts_cache.h"
 #include "utils/builtins.h"
 #include "utils/bytea.h"
+#include "utils/catcache.h"
 #include "utils/guc_tables.h"
 #include "utils/float.h"
 #include "utils/memutils.h"
@@ -2202,6 +2203,17 @@ static struct config_int ConfigureNamesInt[] =
         NULL, NULL, NULL
     },
 
+    {
+        {"catalog_cache_prune_min_age", PGC_USERSET, RESOURCES_MEM,
+            gettext_noop("System catalog cache entries that live unused for longer than this seconds are considered
forremoval."),
 
+            gettext_noop("The value of -1 turns off pruning."),
+            GUC_UNIT_S
+        },
+        &catalog_cache_prune_min_age,
+        300, -1, INT_MAX,
+        NULL, NULL, NULL
+    },
+
     /*
      * We use the hopefully-safely-small value of 100kB as the compiled-in
      * default for max_stack_depth.  InitializeGUCOptions will increase it if
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cccb5f145a..fa117f0573 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -128,6 +128,7 @@
 #work_mem = 4MB                # min 64kB
 #maintenance_work_mem = 64MB        # min 1MB
 #autovacuum_work_mem = -1        # min 1MB, or -1 to use maintenance_work_mem
+#catalog_cache_prune_min_age = 300s    # -1 disables pruning
 #max_stack_depth = 2MB            # min 100kB
 #shared_memory_type = mmap        # the default is the first option
                     # supported by the operating system:
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index 65d816a583..2134839ecf 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -22,6 +22,7 @@
 
 #include "access/htup.h"
 #include "access/skey.h"
+#include "datatype/timestamp.h"
 #include "lib/ilist.h"
 #include "utils/relcache.h"
 
@@ -119,6 +120,8 @@ typedef struct catctup
     bool        dead;            /* dead but not yet removed? */
     bool        negative;        /* negative cache entry? */
     HeapTupleData tuple;        /* tuple management header */
+    int            naccess;        /* # of access to this entry, up to 2  */
+    TimestampTz    lastaccess;        /* timestamp of the last usage */
 
     /*
      * The tuple may also be a member of at most one CatCList.  (If a single
@@ -189,6 +192,19 @@ typedef struct catcacheheader
 /* this extern duplicates utils/memutils.h... */
 extern PGDLLIMPORT MemoryContext CacheMemoryContext;
 
+/* for guc.c, not PGDLLPMPORT'ed */
+extern int catalog_cache_prune_min_age;
+
+/* source clock for access timestamp of catcache entries */
+extern TimestampTz catcacheclock;
+
+/* SetCatCacheClock - set catcache timestamp source clodk */
+static inline void
+SetCatCacheClock(TimestampTz ts)
+{
+    catcacheclock = ts;
+}
+
 extern void CreateCacheMemoryContext(void);
 
 extern CatCache *InitCatCache(int id, Oid reloid, Oid indexoid,
-- 
2.16.3

From f2379fb8070420ea0880cfa74439744ade41dc3f Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Fri, 1 Mar 2019 13:32:51 +0900
Subject: [PATCH] Remove entries that haven't been used for a certain time

Catcache entries happen to be left alone for several reasons. It is
not desirable that such useless entries eat up memory. Catcache
pruning feature removes entries that haven't been accessed for a
certain time before enlarging hash array.
---
 doc/src/sgml/config.sgml                      |  19 ++++
 src/backend/tcop/postgres.c                   |   2 +
 src/backend/utils/cache/catcache.c            | 121 +++++++++++++++++++++++++-
 src/backend/utils/misc/guc.c                  |  12 +++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/utils/catcache.h                  |  17 ++++
 6 files changed, 169 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index d383de2512..4231235447 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1677,6 +1677,25 @@ include_dir 'conf.d'
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-catalog-cache-prune-min-age" xreflabel="catalog_cache_prune_min_age">
+      <term><varname>catalog_cache_prune_min_age</varname> (<type>integer</type>)
+      <indexterm>
+       <primary><varname>catalog_cache_prune_min_age</varname> configuration
+       parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+         Specifies the minimum amount of unused time in seconds at which a
+         system catalog cache entry is removed. -1 indicates that this feature
+         is disabled at all. The value defaults to 300 seconds (<literal>5
+         minutes</literal>). The entries that are not used for the duration
+         can be removed to prevent catalog cache from bloating with useless
+         entries.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-max-stack-depth" xreflabel="max_stack_depth">
       <term><varname>max_stack_depth</varname> (<type>integer</type>)
       <indexterm>
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index f9ce3d8f22..acab473d34 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -71,6 +71,7 @@
 #include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
 #include "tcop/utility.h"
+#include "utils/catcache.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
@@ -2575,6 +2576,7 @@ start_xact_command(void)
      * not desired, the timeout has to be disabled explicitly.
      */
     enable_statement_timeout();
+    SetCatCacheClock(GetCurrentStatementStartTimestamp());
 }
 
 static void
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index d05930bc4c..c4582fe5a3 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -38,6 +38,7 @@
 #include "utils/rel.h"
 #include "utils/resowner_private.h"
 #include "utils/syscache.h"
+#include "utils/timeout.h"
 
 
  /* #define CACHEDEBUG */    /* turns DEBUG elogs on */
@@ -60,9 +61,18 @@
 #define CACHE_elog(...)
 #endif
 
+/*
+ * GUC variable to define the minimum age of entries that will be considered
+ * to be evicted in seconds. -1 to disable the feature.
+ */
+int catalog_cache_prune_min_age = 300;
+
 /* Cache management header --- pointer is NULL until created */
 static CatCacheHeader *CacheHdr = NULL;
 
+/* Clock for the last accessed time of a catcache entry. */
+TimestampTz    catcacheclock = 0;
+
 static inline HeapTuple SearchCatCacheInternal(CatCache *cache,
                        int nkeys,
                        Datum v1, Datum v2,
@@ -850,9 +860,99 @@ InitCatCache(int id,
      */
     MemoryContextSwitchTo(oldcxt);
 
+    /* initialize catcache reference clock if haven't done yet */
+    if (catcacheclock == 0)
+        catcacheclock = GetCurrentTimestamp();
+
     return cp;
 }
 
+/*
+ * CatCacheCleanupOldEntries - Remove infrequently-used entries
+ *
+ * Catcache entries happen to be left unused for a long time for several
+ * reasons. Remove such entries to prevent catcache from bloating. It is based
+ * on the similar algorithm with buffer eviction. Entries that are accessed
+ * several times in a certain period live longer than those that have had less
+ * access in the same duration.
+ */
+static bool
+CatCacheCleanupOldEntries(CatCache *cp)
+{
+    int        nremoved = 0;
+    int        i;
+    long    oldest_ts = catcacheclock;
+    long    age;
+    int        us;
+
+    /* Return immediately if disabled */
+    if (catalog_cache_prune_min_age == 0)
+        return false;
+
+    /* Don't scan the hash when we know we don't have prunable entries */
+    TimestampDifference(cp->cc_oldest_ts, catcacheclock, &age, &us);
+    if (age < catalog_cache_prune_min_age)
+        return false;
+
+    /* Scan over the whole hash to find entries to remove */
+    for (i = 0 ; i < cp->cc_nbuckets ; i++)
+    {
+        dlist_mutable_iter    iter;
+
+        dlist_foreach_modify(iter, &cp->cc_bucket[i])
+        {
+            CatCTup    *ct = dlist_container(CatCTup, cache_elem, iter.cur);
+
+            /* Don't remove referenced entries */
+            if (ct->refcount == 0 &&
+                (ct->c_list == NULL || ct->c_list->refcount == 0))
+            {
+                /*
+                 * Calculate the duration from the time from the last access
+                 * to the "current" time. catcacheclock is updated
+                 * per-statement basis and additionaly udpated periodically
+                 * during a long running query.
+                 */
+                TimestampDifference(ct->lastaccess, catcacheclock, &age, &us);
+
+                if (age >= catalog_cache_prune_min_age)
+                {
+                    /*
+                     * Entries that are not accessed after the last pruning
+                     * are removed in that seconds, and their lives are
+                     * prolonged according to how many times they are accessed
+                     * up to three times of the duration. We don't try shrink
+                     * buckets since pruning effectively caps catcache
+                     * expansion in the long term.
+                     */
+                    if (ct->naccess > 0)
+                        ct->naccess--;
+                    else
+                    {
+                        CatCacheRemoveCTup(cp, ct);
+                        nremoved++;
+
+                        /* don't update oldest_ts by removed entry */
+                        continue;
+                    }
+                }
+            }
+
+            /* update oldest timestamp if the entry remains alive */
+            if (ct->lastaccess < oldest_ts)
+                oldest_ts = ct->lastaccess;
+        }
+    }
+
+    cp->cc_oldest_ts = oldest_ts;
+
+    if (nremoved > 0)
+        elog(DEBUG1, "pruning catalog cache id=%d for %s: removed %d / %d",
+             cp->id, cp->cc_relname, nremoved, cp->cc_ntup + nremoved);
+
+    return nremoved > 0;
+}
+
 /*
  * Enlarge a catcache, doubling the number of buckets.
  */
@@ -1264,6 +1364,12 @@ SearchCatCacheInternal(CatCache *cache,
          */
         dlist_move_head(bucket, &ct->cache_elem);
 
+        /* prolong life of this entry */
+        if (ct->naccess < 2)
+            ct->naccess++;
+
+        ct->lastaccess = catcacheclock;
+
         /*
          * If it's a positive entry, bump its refcount and return it. If it's
          * negative, we can report failure to the caller.
@@ -1888,19 +1994,28 @@ CatalogCacheCreateEntry(CatCache *cache, HeapTuple ntp, Datum *arguments,
     ct->dead = false;
     ct->negative = negative;
     ct->hash_value = hashValue;
+    ct->naccess = 0;
+    ct->lastaccess = catcacheclock;
 
     dlist_push_head(&cache->cc_bucket[hashIndex], &ct->cache_elem);
 
     cache->cc_ntup++;
     CacheHdr->ch_ntup++;
 
+    /* increase refcount so that the new entry survives pruning */
+    ct->refcount++;
+
     /*
-     * If the hash table has become too full, enlarge the buckets array. Quite
-     * arbitrarily, we enlarge when fill factor > 2.
+     * If the hash table has become too full, try removing infrequently used
+     * entries to make a room for the new entry. If failed, enlarge the bucket
+     * array instead.  Quite arbitrarily, we try this when fill factor > 2.
      */
-    if (cache->cc_ntup > cache->cc_nbuckets * 2)
+    if (cache->cc_ntup > cache->cc_nbuckets * 2 &&
+        !CatCacheCleanupOldEntries(cache))
         RehashCatCache(cache);
 
+    ct->refcount--;
+
     return ct;
 }
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index aa564d153a..e624c74bf9 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -82,6 +82,7 @@
 #include "tsearch/ts_cache.h"
 #include "utils/builtins.h"
 #include "utils/bytea.h"
+#include "utils/catcache.h"
 #include "utils/guc_tables.h"
 #include "utils/float.h"
 #include "utils/memutils.h"
@@ -2202,6 +2203,17 @@ static struct config_int ConfigureNamesInt[] =
         NULL, NULL, NULL
     },
 
+    {
+        {"catalog_cache_prune_min_age", PGC_USERSET, RESOURCES_MEM,
+            gettext_noop("System catalog cache entries that live unused for longer than this seconds are considered
forremoval."),
 
+            gettext_noop("The value of -1 turns off pruning."),
+            GUC_UNIT_S
+        },
+        &catalog_cache_prune_min_age,
+        300, -1, INT_MAX,
+        NULL, NULL, NULL
+    },
+
     /*
      * We use the hopefully-safely-small value of 100kB as the compiled-in
      * default for max_stack_depth.  InitializeGUCOptions will increase it if
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index cccb5f145a..fa117f0573 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -128,6 +128,7 @@
 #work_mem = 4MB                # min 64kB
 #maintenance_work_mem = 64MB        # min 1MB
 #autovacuum_work_mem = -1        # min 1MB, or -1 to use maintenance_work_mem
+#catalog_cache_prune_min_age = 300s    # -1 disables pruning
 #max_stack_depth = 2MB            # min 100kB
 #shared_memory_type = mmap        # the default is the first option
                     # supported by the operating system:
diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index 65d816a583..2f697d5ca4 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -22,6 +22,7 @@
 
 #include "access/htup.h"
 #include "access/skey.h"
+#include "datatype/timestamp.h"
 #include "lib/ilist.h"
 #include "utils/relcache.h"
 
@@ -61,6 +62,7 @@ typedef struct catcache
     slist_node    cc_next;        /* list link */
     ScanKeyData cc_skey[CATCACHE_MAXKEYS];    /* precomputed key info for heap
                                              * scans */
+    long        cc_oldest_ts;
 
     /*
      * Keep these at the end, so that compiling catcache.c with CATCACHE_STATS
@@ -119,6 +121,8 @@ typedef struct catctup
     bool        dead;            /* dead but not yet removed? */
     bool        negative;        /* negative cache entry? */
     HeapTupleData tuple;        /* tuple management header */
+    int            naccess;        /* # of access to this entry, up to 2  */
+    TimestampTz    lastaccess;        /* timestamp of the last usage */
 
     /*
      * The tuple may also be a member of at most one CatCList.  (If a single
@@ -189,6 +193,19 @@ typedef struct catcacheheader
 /* this extern duplicates utils/memutils.h... */
 extern PGDLLIMPORT MemoryContext CacheMemoryContext;
 
+/* for guc.c, not PGDLLPMPORT'ed */
+extern int catalog_cache_prune_min_age;
+
+/* source clock for access timestamp of catcache entries */
+extern TimestampTz catcacheclock;
+
+/* SetCatCacheClock - set catcache timestamp source clodk */
+static inline void
+SetCatCacheClock(TimestampTz ts)
+{
+    catcacheclock = ts;
+}
+
 extern void CreateCacheMemoryContext(void);
 
 extern CatCache *InitCatCache(int id, Oid reloid, Oid indexoid,
-- 
2.16.3


Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Syntax diagrams in user documentation
Следующее
От: Peter Eisentraut
Дата:
Сообщение: Re: Syntax diagrams in user documentation