Re: A strange GiST error message or fillfactor of GiST build

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: A strange GiST error message or fillfactor of GiST build
Дата
Msg-id 20180905.172912.155622146.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: A strange GiST error message or fillfactor of GiST build  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Ответы Re: A strange GiST error message or fillfactor of GiST build  (Andrey Borodin <x4mmm@yandex-team.ru>)
Список pgsql-hackers
At Tue, 4 Sep 2018 13:05:55 +0300, Alexander Korotkov <a.korotkov@postgrespro.ru> wrote in
<CAPpHfdsJwptVsRnBtTdtj+G+gTTfSrqdH1uwLNkvQ-72A9scgw@mail.gmail.com>
> On Tue, Sep 4, 2018 at 12:16 PM Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> > I agree that fillfactor should be ignored in certain cases
> > (inserting the first tuple into empty pages or something like
> > that). Even though GiST doesn't need fillfactor at all, it is
> > defined independently from AM instances
> 
> What exactly do you mean?  Yes, fillfactor is defined in StdRdOptions
> struct, which is shared across many access methods.  But some of them
> uses fillfactor, while others don't.  For instance, GIN and BRIN don't
> support fillfactor at all.

Yes. It was with the intent of keeping Andrey's use case. So I
proposed that just disabling it rather than removing the code at
all.

> > and it gives some
> > (undocumented) convenient on testing even on GiST.
> 
> Do you keep in the mind some particular use cases?  If so, could you
> please share them.  It would let me and others understand what
> behavior we need to preserve and why.

I misunderstood the consensus here, I myself don't have a proper
one. The consensus here is:

fillfactor is useless for GiST.
We don't preserve it for Andrey's use case.
No one is objecting to ignoring it here.

Is it right?

Well, I propose the first attached instaed. It just removes
fillfactor handling from GiST.

Apart from the fillfactor removal, we have an internal error for
rootsplit failure caused by too-long tuples, but this should be a
user error message like gistSplit. (index row size %zu exeeds..)

=# create table y as  select cube(array(SELECT random() as a FROM generate_series(1,600))) from
generate_series(1,1e3,1);
 
=# create index on y using gist(cube);
ERROR:  failed to add item to index page in "y_cube_idx"

The attached second patch changes the  message in the case.

ERROR:  size of 2 index rows (9640 bytes) exceeds maximum 8152 of the root page for the index "y_cube_idx"

This is one of my first motivation here but now this might be
another issue.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From e5abf14d5d7de4256a7d40a0e5783655a99c2515 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 5 Sep 2018 09:49:58 +0900
Subject: [PATCH 1/2] Ignore fillfactor in GiST

fillfactor for GiST doesn't work as expected since GiST doesn't
perform sorted bulk insertion to pack pages as tight as
possible. Therefore we remove the fillfactor capability from it. It is
still allowed to be set for backward compatibility but just ignored.
---
 doc/src/sgml/ref/create_index.sgml     |  8 +++++++-
 src/backend/access/common/reloptions.c |  4 ++--
 src/backend/access/gist/gist.c         | 13 ++++++-------
 src/backend/access/gist/gistbuild.c    | 18 +++---------------
 src/backend/access/gist/gistutil.c     |  5 ++---
 src/include/access/gist_private.h      | 12 +++---------
 6 files changed, 23 insertions(+), 37 deletions(-)

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 3c1223b324..e2a77e0d4d 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -390,7 +390,7 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
    </variablelist>
 
    <para>
-     The B-tree, hash, GiST and SP-GiST index methods all accept this parameter:
+     The B-tree, hash and SP-GiST index methods all accept this parameter:
    </para>
 
    <variablelist>
@@ -412,6 +412,12 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class=
       other index methods use fillfactor in different but roughly analogous
       ways; the default fillfactor varies between methods.
      </para>
+     <note>
+       <para>
+         GiST also accepts this parameter just for historical reasons but
+         ignores it.
+       </para>
+     </note>
     </listitem>
    </varlistentry>
    </variablelist>
diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index db84da0678..9c9589a78e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -186,12 +186,12 @@ static relopt_int intRelOpts[] =
     {
         {
             "fillfactor",
-            "Packs gist index pages only to this percentage",
+            "This is ignored but exists for historical reasons",
             RELOPT_KIND_GIST,
             ShareUpdateExclusiveLock    /* since it applies only to later
                                          * inserts */
         },
-        GIST_DEFAULT_FILLFACTOR, GIST_MIN_FILLFACTOR, 100
+        0, 10, 100
     },
     {
         {
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 8a42effdf7..5915ab2cf2 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -172,7 +172,7 @@ gistinsert(Relation r, Datum *values, bool *isnull,
                          values, isnull, true /* size is currently bogus */ );
     itup->t_tid = *ht_ctid;
 
-    gistdoinsert(r, itup, 0, giststate);
+    gistdoinsert(r, itup, giststate);
 
     /* cleanup */
     MemoryContextSwitchTo(oldCxt);
@@ -212,7 +212,7 @@ gistinsert(Relation r, Datum *values, bool *isnull,
  * Returns 'true' if the page was split, 'false' otherwise.
  */
 bool
-gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
+gistplacetopage(Relation rel, GISTSTATE *giststate,
                 Buffer buffer,
                 IndexTuple *itup, int ntup, OffsetNumber oldoffnum,
                 BlockNumber *newblkno,
@@ -251,7 +251,7 @@ gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
      * one-element todelete array; in the split case, it's handled implicitly
      * because the tuple vector passed to gistSplit won't include this tuple.
      */
-    is_split = gistnospace(page, itup, ntup, oldoffnum, freespace);
+    is_split = gistnospace(page, itup, ntup, oldoffnum);
 
     /*
      * If leaf page is full, try at first to delete dead tuples. And then
@@ -260,7 +260,7 @@ gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
     if (is_split && GistPageIsLeaf(page) && GistPageHasGarbage(page))
     {
         gistvacuumpage(rel, page, buffer);
-        is_split = gistnospace(page, itup, ntup, oldoffnum, freespace);
+        is_split = gistnospace(page, itup, ntup, oldoffnum);
     }
 
     if (is_split)
@@ -604,7 +604,7 @@ gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
  * so it does not bother releasing palloc'd allocations.
  */
 void
-gistdoinsert(Relation r, IndexTuple itup, Size freespace, GISTSTATE *giststate)
+gistdoinsert(Relation r, IndexTuple itup, GISTSTATE *giststate)
 {
     ItemId        iid;
     IndexTuple    idxtuple;
@@ -614,7 +614,6 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace, GISTSTATE *giststate)
     bool        xlocked = false;
 
     memset(&state, 0, sizeof(GISTInsertState));
-    state.freespace = freespace;
     state.r = r;
 
     /* Start from the root */
@@ -1226,7 +1225,7 @@ gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
     CheckForSerializableConflictIn(state->r, NULL, stack->buffer);
 
     /* Insert the tuple(s) to the page, splitting the page if necessary */
-    is_split = gistplacetopage(state->r, state->freespace, giststate,
+    is_split = gistplacetopage(state->r, giststate,
                                stack->buffer,
                                tuples, ntup,
                                oldoffnum, NULL,
diff --git a/src/backend/access/gist/gistbuild.c b/src/backend/access/gist/gistbuild.c
index 434f15f014..cd588cfceb 100644
--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -61,8 +61,6 @@ typedef struct
     int64        indtuples;        /* number of tuples indexed */
     int64        indtuplesSize;    /* total size of all indexed tuples */
 
-    Size        freespace;        /* amount of free space to leave on pages */
-
     /*
      * Extra data structures used during a buffering build. 'gfbb' contains
      * information related to managing the build buffers. 'parentMap' is a
@@ -119,7 +117,6 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
     Buffer        buffer;
     Page        page;
     MemoryContext oldcxt = CurrentMemoryContext;
-    int            fillfactor;
 
     buildstate.indexrel = index;
     if (index->rd_options)
@@ -134,8 +131,6 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
             buildstate.bufferingMode = GIST_BUFFERING_DISABLED;
         else
             buildstate.bufferingMode = GIST_BUFFERING_AUTO;
-
-        fillfactor = options->fillfactor;
     }
     else
     {
@@ -144,10 +139,7 @@ gistbuild(Relation heap, Relation index, IndexInfo *indexInfo)
          * to fit in cache.
          */
         buildstate.bufferingMode = GIST_BUFFERING_AUTO;
-        fillfactor = GIST_DEFAULT_FILLFACTOR;
     }
-    /* Calculate target amount of free space to leave on pages */
-    buildstate.freespace = BLCKSZ * (100 - fillfactor) / 100;
 
     /*
      * We expect to be called exactly once for any index relation. If that's
@@ -275,8 +267,7 @@ gistInitBuffering(GISTBuildState *buildstate)
 
     /* Calc space of index page which is available for index tuples */
     pageFreeSpace = BLCKSZ - SizeOfPageHeaderData - sizeof(GISTPageOpaqueData)
-        - sizeof(ItemIdData)
-        - buildstate->freespace;
+        - sizeof(ItemIdData);
 
     /*
      * Calculate average size of already inserted index tuples using gathered
@@ -431,8 +422,7 @@ calculatePagesPerBuffer(GISTBuildState *buildstate, int levelStep)
 
     /* Calc space of index page which is available for index tuples */
     pageFreeSpace = BLCKSZ - SizeOfPageHeaderData - sizeof(GISTPageOpaqueData)
-        - sizeof(ItemIdData)
-        - buildstate->freespace;
+        - sizeof(ItemIdData);
 
     /*
      * Calculate average size of already inserted index tuples using gathered
@@ -483,8 +473,7 @@ gistBuildCallback(Relation index,
          * There's no buffers (yet). Since we already have the index relation
          * locked, we call gistdoinsert directly.
          */
-        gistdoinsert(index, itup, buildstate->freespace,
-                     buildstate->giststate);
+        gistdoinsert(index, itup, buildstate->giststate);
     }
 
     /* Update tuple count and total size. */
@@ -684,7 +673,6 @@ gistbufferinginserttuples(GISTBuildState *buildstate, Buffer buffer, int level,
     BlockNumber placed_to_blk = InvalidBlockNumber;
 
     is_split = gistplacetopage(buildstate->indexrel,
-                               buildstate->freespace,
                                buildstate->giststate,
                                buffer,
                                itup, ntup, oldoffnum, &placed_to_blk,
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index dddfe0ae2c..89aad95b20 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -54,9 +54,9 @@ gistfillbuffer(Page page, IndexTuple *itup, int len, OffsetNumber off)
  * Check space for itup vector on page
  */
 bool
-gistnospace(Page page, IndexTuple *itvec, int len, OffsetNumber todelete, Size freespace)
+gistnospace(Page page, IndexTuple *itvec, int len, OffsetNumber todelete)
 {
-    unsigned int size = freespace,
+    unsigned int size = 0,
                 deleted = 0;
     int            i;
 
@@ -82,7 +82,6 @@ gistfitpage(IndexTuple *itvec, int len)
     for (i = 0; i < len; i++)
         size += IndexTupleSize(itvec[i]) + sizeof(ItemIdData);
 
-    /* TODO: Consider fillfactor */
     return (size <= GiSTPageSize);
 }
 
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 36ed7244ba..dd8845b6be 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -240,8 +240,6 @@ typedef struct GistSplitVector
 typedef struct
 {
     Relation    r;
-    Size        freespace;        /* free space to be left */
-
     GISTInsertStack *stack;
 } GISTInsertState;
 
@@ -373,7 +371,7 @@ typedef struct GISTBuildBuffers
 typedef struct GiSTOptions
 {
     int32        vl_len_;        /* varlena header (do not touch directly!) */
-    int            fillfactor;        /* page fill factor in percent (0..100) */
+    int            fillfactor;        /* page fill factor: JUST IGNORED  */
     int            bufferingModeOffset;    /* use buffering build? */
 } GiSTOptions;
 
@@ -388,7 +386,6 @@ extern GISTSTATE *initGISTstate(Relation index);
 extern void freeGISTstate(GISTSTATE *giststate);
 extern void gistdoinsert(Relation r,
              IndexTuple itup,
-             Size freespace,
              GISTSTATE *GISTstate);
 
 /* A List of these is returned from gistplacetopage() in *splitinfo */
@@ -398,7 +395,7 @@ typedef struct
     IndexTuple    downlink;        /* downlink for this half. */
 } GISTPageSplitInfo;
 
-extern bool gistplacetopage(Relation rel, Size freespace, GISTSTATE *giststate,
+extern bool gistplacetopage(Relation rel, GISTSTATE *giststate,
                 Buffer buffer,
                 IndexTuple *itup, int ntup,
                 OffsetNumber oldoffnum, BlockNumber *newblkno,
@@ -432,15 +429,12 @@ extern bool gistvalidate(Oid opclassoid);
 #define GiSTPageSize   \
     ( BLCKSZ - SizeOfPageHeaderData - MAXALIGN(sizeof(GISTPageOpaqueData)) )
 
-#define GIST_MIN_FILLFACTOR            10
-#define GIST_DEFAULT_FILLFACTOR        90
-
 extern bytea *gistoptions(Datum reloptions, bool validate);
 extern bool gistproperty(Oid index_oid, int attno,
              IndexAMProperty prop, const char *propname,
              bool *res, bool *isnull);
 extern bool gistfitpage(IndexTuple *itvec, int len);
-extern bool gistnospace(Page page, IndexTuple *itvec, int len, OffsetNumber todelete, Size freespace);
+extern bool gistnospace(Page page, IndexTuple *itvec, int len, OffsetNumber todelete);
 extern void gistcheckpage(Relation rel, Buffer buf);
 extern Buffer gistNewBuffer(Relation r);
 extern void gistfillbuffer(Page page, IndexTuple *itup, int len,
-- 
2.16.3

From e1418b6655386eb34574c4f930d8dc66f16e69a1 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyotaro@lab.ntt.co.jp>
Date: Wed, 5 Sep 2018 12:15:13 +0900
Subject: [PATCH 2/2] More friendly error message for rootsplit failure of GiST

When tuple insertion failed during page split of GiST index, we have
an internal error that complains that just "failed to add item to
index page". However, things are a bit different in the rootsplit
case. It is rather a user side error that the user provided too long
data for the index so emit a more friendly error message in the case
like gistSplit does.
---
 src/backend/access/gist/gist.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 5915ab2cf2..4590e88a77 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -412,7 +412,34 @@ gistplacetopage(Relation rel, GISTSTATE *giststate,
                 IndexTuple    thistup = (IndexTuple) data;
 
                 if (PageAddItem(ptr->page, (Item) data, IndexTupleSize(thistup), i + FirstOffsetNumber, false, false)
==InvalidOffsetNumber)
 
+                {
+                    /*
+                     * Emit user error message for root split failure since
+                     * this is rather a user error than internal one. We could
+                     * this earlier before preparing new buffers, but we don't
+                     * bother prechecking for such a corner case.
+                     */
+                    if (is_rootsplit && ptr == dist)
+                    {
+                        size_t    total_tupsize;
+
+                        /* count size of all tuples to be inserted */
+                        for (; i < ptr->block.num ; i++)
+                            data += IndexTupleSize((IndexTuple) data);
+
+                        /*  total_tupsize is including item id */
+                        total_tupsize =
+                            (char *)data - (char *) (ptr->list)
+                            + sizeof(ItemIdData) * ptr->block.num;
+                        ereport(ERROR,
+                                (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                                 (errmsg("size of %d index rows (%zu bytes) exceeds maximum %zu of the root page for
theindex \"%s\"",
 
+                                         ptr->block.num, total_tupsize,
+                                         GiSTPageSize,
+                                         RelationGetRelationName(rel)))));
+                    }
                     elog(ERROR, "failed to add item to index page in \"%s\"", RelationGetRelationName(rel));
+                }
 
                 /*
                  * If this is the first inserted/updated tuple, let the caller
-- 
2.16.3


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: Re: Bug fix for glibc broke freebsd build in REL_11_STABLE
Следующее
От: Andrey Borodin
Дата:
Сообщение: Re: A strange GiST error message or fillfactor of GiST build