Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY
Дата
Msg-id 20579.1486325699@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Ответы Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Andres Freund <andres@anarazel.de>)
Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Re: [HACKERS] Index corruption with CREATE INDEX CONCURRENTLY  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-hackers
[ Having now read the whole thread, I'm prepared to weigh in ... ]

Pavan Deolasee <pavan.deolasee@gmail.com> writes:
> This seems like a real problem to me. I don't know what the consequences
> are, but definitely having various attribute lists to have different view
> of the set of indexes doesn't seem right.

For sure.

> 2. In the second patch, we tried to recompute attribute lists if a relcache
> flush happens in between and index list is invalidated. We've seen problems
> with that, especially it getting into an infinite loop with
> CACHE_CLOBBER_ALWAYS. Not clear why it happens, but apparently relcache
> flushes keep happening.

Well, yeah: the point of CLOBBER_CACHE_ALWAYS is to make a relcache flush
happen whenever it possibly could.  The way to deal with that without
looping is to test whether the index set *actually* changed, not whether
it just might have changed.

I do not like any of the other patches proposed in this thread, because
they fail to guarantee delivering an up-to-date attribute bitmap to the
caller.  I think we need a retry loop, and I think that it needs to look
like the attached.

(Note that the tests whether rd_pkindex and rd_replidindex changed might
be redundant; but I'm not totally sure of that, and they're cheap.)

            regards, tom lane

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 8a7c560..b659994 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*************** RelationGetFKeyList(Relation relation)
*** 4327,4335 ****
   * it is the pg_class OID of a unique index on OID when the relation has one,
   * and InvalidOid if there is no such index.
   *
!  * In exactly the same way, we update rd_replidindex, which is the pg_class
!  * OID of an index to be used as the relation's replication identity index,
!  * or InvalidOid if there is no such index.
   */
  List *
  RelationGetIndexList(Relation relation)
--- 4327,4336 ----
   * it is the pg_class OID of a unique index on OID when the relation has one,
   * and InvalidOid if there is no such index.
   *
!  * In exactly the same way, we update rd_pkindex, which is the OID of the
!  * relation's primary key index if any, else InvalidOid; and rd_replidindex,
!  * which is the pg_class OID of an index to be used as the relation's
!  * replication identity index, or InvalidOid if there is no such index.
   */
  List *
  RelationGetIndexList(Relation relation)
*************** RelationGetIndexPredicate(Relation relat
*** 4746,4753 ****
   * we can include system attributes (e.g., OID) in the bitmap representation.
   *
   * Caller had better hold at least RowExclusiveLock on the target relation
!  * to ensure that it has a stable set of indexes.  This also makes it safe
!  * (deadlock-free) for us to take locks on the relation's indexes.
   *
   * The returned result is palloc'd in the caller's memory context and should
   * be bms_free'd when not needed anymore.
--- 4747,4756 ----
   * we can include system attributes (e.g., OID) in the bitmap representation.
   *
   * Caller had better hold at least RowExclusiveLock on the target relation
!  * to ensure it is safe (deadlock-free) for us to take locks on the relation's
!  * indexes.  Note that since the introduction of CREATE INDEX CONCURRENTLY,
!  * that lock level doesn't guarantee a stable set of indexes, so we have to
!  * be prepared to retry here in case of a change in the set of indexes.
   *
   * The returned result is palloc'd in the caller's memory context and should
   * be bms_free'd when not needed anymore.
*************** RelationGetIndexAttrBitmap(Relation rela
*** 4760,4765 ****
--- 4763,4769 ----
      Bitmapset  *pkindexattrs;    /* columns in the primary index */
      Bitmapset  *idindexattrs;    /* columns in the replica identity */
      List       *indexoidlist;
+     List       *newindexoidlist;
      Oid            relpkindex;
      Oid            relreplindex;
      ListCell   *l;
*************** RelationGetIndexAttrBitmap(Relation rela
*** 4788,4795 ****
          return NULL;

      /*
!      * Get cached list of index OIDs
       */
      indexoidlist = RelationGetIndexList(relation);

      /* Fall out if no indexes (but relhasindex was set) */
--- 4792,4800 ----
          return NULL;

      /*
!      * Get cached list of index OIDs. If we have to start over, we do so here.
       */
+ restart:
      indexoidlist = RelationGetIndexList(relation);

      /* Fall out if no indexes (but relhasindex was set) */
*************** RelationGetIndexAttrBitmap(Relation rela
*** 4800,4808 ****
       * Copy the rd_pkindex and rd_replidindex value computed by
       * RelationGetIndexList before proceeding.  This is needed because a
       * relcache flush could occur inside index_open below, resetting the
!      * fields managed by RelationGetIndexList. (The values we're computing
!      * will still be valid, assuming that caller has a sufficient lock on
!      * the relation.)
       */
      relpkindex = relation->rd_pkindex;
      relreplindex = relation->rd_replidindex;
--- 4805,4812 ----
       * Copy the rd_pkindex and rd_replidindex value computed by
       * RelationGetIndexList before proceeding.  This is needed because a
       * relcache flush could occur inside index_open below, resetting the
!      * fields managed by RelationGetIndexList.  We need to do the work with
!      * stable values of these fields.
       */
      relpkindex = relation->rd_pkindex;
      relreplindex = relation->rd_replidindex;
*************** RelationGetIndexAttrBitmap(Relation rela
*** 4880,4886 ****
          index_close(indexDesc, AccessShareLock);
      }

!     list_free(indexoidlist);

      /* Don't leak the old values of these bitmaps, if any */
      bms_free(relation->rd_indexattr);
--- 4884,4915 ----
          index_close(indexDesc, AccessShareLock);
      }

!     /*
!      * During one of the index_opens in the above loop, we might have received
!      * a relcache flush event on this relcache entry, which might have been
!      * signaling a change in the rel's index list.  If so, we'd better start
!      * over to ensure we deliver up-to-date attribute bitmaps.
!      */
!     newindexoidlist = RelationGetIndexList(relation);
!     if (equal(indexoidlist, newindexoidlist) &&
!         relpkindex == relation->rd_pkindex &&
!         relreplindex == relation->rd_replidindex)
!     {
!         /* Still the same index set, so proceed */
!         list_free(newindexoidlist);
!         list_free(indexoidlist);
!     }
!     else
!     {
!         /* Gotta do it over ... might as well not leak memory */
!         list_free(newindexoidlist);
!         list_free(indexoidlist);
!         bms_free(uindexattrs);
!         bms_free(pkindexattrs);
!         bms_free(idindexattrs);
!         bms_free(indexattrs);
!         goto restart;
!     }

      /* Don't leak the old values of these bitmaps, if any */
      bms_free(relation->rd_indexattr);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: [HACKERS] Ignore tablespace ACLs when ignoring schema ACLs
Следующее
От: Brandur Leach
Дата:
Сообщение: Re: [HACKERS] [PATCH] SortSupport for macaddr type