Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6
Дата
Msg-id 2157.1556816395@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: REINDEX INDEX results in a crash for an index of pg_class since 9.6  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: REINDEX INDEX results in a crash for an index of pg_class since9.6  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
I wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2019-05-02 11:41:28 -0400, Tom Lane wrote:
>>> But don't we need to worry about resetting relfrozenxid?

>> Indexes don't have that though? We couldn't do it for pg_class itself,
>> but that's not a problem here.

> Hmm.  Again, that seems like the sort of assumption that could bite
> us later.  But maybe we could add some assertions that the new values
> match the old?  I'll experiment.

Huh, this actually seems to work.  The attached is a quick hack for
testing.  It gets through check-world straight up and with
xxx_FORCE_RELEASE, and I've verified reindexing pg_class works with
CLOBBER_CACHE_ALWAYS, but it'll be a few hours before I have a full CCA
run.

One interesting thing that turns up in check-world is that if wal_level
is minimal, we have to manually force an XID to be assigned, else
reindexing pg_class fails with "cannot commit a transaction that deleted
files but has no xid" :-(.  Perhaps there's some other cleaner place to
do that?

If we go this path, we should remove RelationSetIndexList altogether
(in HEAD), but I've not done so here.  The comments likely need more
work too.

I have to go out and do some errands for the next few hours, so I can't
push this forward any more right now.

            regards, tom lane

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index ce02410..508a04e 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -3330,20 +3330,15 @@ reindex_index(Oid indexId, bool skip_constraint_checks, char persistence,
         indexInfo->ii_ExclusionStrats = NULL;
     }

-    /*
-     * Build a new physical relation for the index. Need to do that before
-     * "officially" starting the reindexing with SetReindexProcessing -
-     * otherwise the necessary pg_class changes cannot be made with
-     * encountering assertions.
-     */
-    RelationSetNewRelfilenode(iRel, persistence);
-
     /* ensure SetReindexProcessing state isn't leaked */
     PG_TRY();
     {
         /* Suppress use of the target index while rebuilding it */
         SetReindexProcessing(heapId, indexId);

+        /* Build a new physical relation for the index */
+        RelationSetNewRelfilenode(iRel, persistence);
+
         /* Initialize the index and rebuild */
         /* Note: we do not need to re-establish pkey setting */
         index_build(heapRelation, iRel, indexInfo, true, true);
@@ -3491,7 +3486,6 @@ reindex_relation(Oid relid, int flags, int options)
     Relation    rel;
     Oid            toast_relid;
     List       *indexIds;
-    bool        is_pg_class;
     bool        result;
     int            i;

@@ -3527,32 +3521,8 @@ reindex_relation(Oid relid, int flags, int options)
      */
     indexIds = RelationGetIndexList(rel);

-    /*
-     * reindex_index will attempt to update the pg_class rows for the relation
-     * and index.  If we are processing pg_class itself, we want to make sure
-     * that the updates do not try to insert index entries into indexes we
-     * have not processed yet.  (When we are trying to recover from corrupted
-     * indexes, that could easily cause a crash.) We can accomplish this
-     * because CatalogTupleInsert/CatalogTupleUpdate will use the relcache's
-     * index list to know which indexes to update. We just force the index
-     * list to be only the stuff we've processed.
-     *
-     * It is okay to not insert entries into the indexes we have not processed
-     * yet because all of this is transaction-safe.  If we fail partway
-     * through, the updated rows are dead and it doesn't matter whether they
-     * have index entries.  Also, a new pg_class index will be created with a
-     * correct entry for its own pg_class row because we do
-     * RelationSetNewRelfilenode() before we do index_build().
-     */
-    is_pg_class = (RelationGetRelid(rel) == RelationRelationId);
-
-    /* Ensure rd_indexattr is valid; see comments for RelationSetIndexList */
-    if (is_pg_class)
-        (void) RelationGetIndexAttrBitmap(rel, INDEX_ATTR_BITMAP_ALL);
-
     PG_TRY();
     {
-        List       *doneIndexes;
         ListCell   *indexId;
         char        persistence;

@@ -3580,15 +3550,11 @@ reindex_relation(Oid relid, int flags, int options)
             persistence = rel->rd_rel->relpersistence;

         /* Reindex all the indexes. */
-        doneIndexes = NIL;
         i = 1;
         foreach(indexId, indexIds)
         {
             Oid            indexOid = lfirst_oid(indexId);

-            if (is_pg_class)
-                RelationSetIndexList(rel, doneIndexes);
-
             reindex_index(indexOid, !(flags & REINDEX_REL_CHECK_CONSTRAINTS),
                           persistence, options);

@@ -3597,9 +3563,6 @@ reindex_relation(Oid relid, int flags, int options)
             /* Index should no longer be in the pending list */
             Assert(!ReindexIsProcessingIndex(indexOid));

-            if (is_pg_class)
-                doneIndexes = lappend_oid(doneIndexes, indexOid);
-
             /* Set index rebuild count */
             pgstat_progress_update_param(PROGRESS_CLUSTER_INDEX_REBUILD_COUNT,
                                          i);
@@ -3615,9 +3578,6 @@ reindex_relation(Oid relid, int flags, int options)
     PG_END_TRY();
     ResetReindexPending();

-    if (is_pg_class)
-        RelationSetIndexList(rel, indexIds);
-
     /*
      * Close rel, but continue to hold the lock.
      */
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 90ff8cc..b51a53d 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -3463,9 +3463,6 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
      */
     RelationDropStorage(relation);

-    /* initialize new relfilenode from old relfilenode */
-    newrnode = relation->rd_node;
-
     /*
      * Create storage for the main fork of the new relfilenode. If it's
      * table-like object, call into table AM to do so, which'll also create
@@ -3479,15 +3476,6 @@ RelationSetNewRelfilenode(Relation relation, char persistence)

     switch (relation->rd_rel->relkind)
     {
-            /* shouldn't be called for these */
-        case RELKIND_VIEW:
-        case RELKIND_COMPOSITE_TYPE:
-        case RELKIND_FOREIGN_TABLE:
-        case RELKIND_PARTITIONED_TABLE:
-        case RELKIND_PARTITIONED_INDEX:
-            elog(ERROR, "should not have storage");
-            break;
-
         case RELKIND_INDEX:
         case RELKIND_SEQUENCE:
             {
@@ -3505,41 +3493,71 @@ RelationSetNewRelfilenode(Relation relation, char persistence)
                                             persistence,
                                             &freezeXid, &minmulti);
             break;
+
+            /* shouldn't be called for anything else */
+        default:
+            elog(ERROR, "should not have storage");
+            break;
     }

     /*
-     * However, if we're dealing with a mapped index, pg_class.relfilenode
-     * doesn't change; instead we have to send the update to the relation
-     * mapper.
+     * If we're dealing with a mapped index, pg_class.relfilenode doesn't
+     * change; instead we have to send the update to the relation mapper.
+     *
+     * For mapped indexes, we don't actually change the pg_class entry at all;
+     * this is essential when reindexing pg_class itself.  That leaves us with
+     * possibly-inaccurate values of relpages etc, but those will be fixed up
+     * later.
      */
     if (RelationIsMapped(relation))
+    {
+        /* Since we're not updating pg_class, these had better not change */
+        Assert(classform->relfrozenxid == freezeXid);
+        Assert(classform->relminmxid == minmulti);
+        Assert(classform->relpersistence == persistence);
+
+        /*
+         * In some code paths it's possible that the tuple update we'd
+         * otherwise do here is the only thing that would assign an XID for
+         * the current transaction.  However, we must have an XID to delete
+         * files, so make sure one is assigned.
+         */
+        (void) GetCurrentTransactionId();
+
+        /* Do the deed */
         RelationMapUpdateMap(RelationGetRelid(relation),
                              newrelfilenode,
                              relation->rd_rel->relisshared,
                              false);
+
+        /* Since we're not updating pg_class, must trigger inval manually */
+        CacheInvalidateRelcache(relation);
+    }
     else
+    {
         classform->relfilenode = newrelfilenode;

-    /* These changes are safe even for a mapped relation */
-    if (relation->rd_rel->relkind != RELKIND_SEQUENCE)
-    {
-        classform->relpages = 0;    /* it's empty until further notice */
-        classform->reltuples = 0;
-        classform->relallvisible = 0;
-    }
-    classform->relfrozenxid = freezeXid;
-    classform->relminmxid = minmulti;
-    classform->relpersistence = persistence;
+        /* These changes are safe even for a mapped relation */
+        if (relation->rd_rel->relkind != RELKIND_SEQUENCE)
+        {
+            classform->relpages = 0;    /* it's empty until further notice */
+            classform->reltuples = 0;
+            classform->relallvisible = 0;
+        }
+        classform->relfrozenxid = freezeXid;
+        classform->relminmxid = minmulti;
+        classform->relpersistence = persistence;

-    CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
+        CatalogTupleUpdate(pg_class, &tuple->t_self, tuple);
+    }

     heap_freetuple(tuple);

     table_close(pg_class, RowExclusiveLock);

     /*
-     * Make the pg_class row change visible, as well as the relation map
-     * change if any.  This will cause the relcache entry to get updated, too.
+     * Make the pg_class row change or relation map change visible.  This will
+     * cause the relcache entry to get updated, too.
      */
     CommandCounterIncrement();


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: pgbench - add option to show actual builtin script code
Следующее
От: Andres Freund
Дата:
Сообщение: Re: REINDEX INDEX results in a crash for an index of pg_class since9.6