Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Дата
Msg-id 22178.1405903964@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Andres Freund <andres@2ndquadrant.com>)
Ответы Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-07-20 19:37:15 -0400, Tom Lane wrote:
>>> I wonder whether we should change vac_update_relstats so that it only
>>> applies the "relminxid mustn't go backwards" rule as long as relminxid
>>> is sane, ie, not in the future.  If it is in the future, forcibly update
>>> it to the cutoff we actually used.  Likewise for relminmxid.  And I guess
>>> we'd need a similar rule for updating datmin(m)xid.

>> I'm wondering the same.

Here's a draft patch for this.  I think this will fix all cases where
the "1" minmxid inserted by previous pg_upgrade versions is actually
in the future at the time we run VACUUM.  We would still be at risk if
it had been in the future when pg_upgrade ran but no longer is now,
since that would mean there could be non-lock-only mxids on disk that
are older than "1".  However, for the reasons discussed upthread, it
seems fairly unlikely to me that people would actually get burnt in
practice, so I'm satisfied with doing this much and no more.

I noticed while poking at it that there was an additional small logic
error in vac_update_datfrozenxid: if we decide to update only one of
datfrozenxid and datminmxid, the value passed to vac_truncate_clog for the
other field was *not* the value stored in pg_database but the value we'd
decided not to use.  That didn't seem like a good thing; it'd at least
result in possibly using MyDatabaseId as oldestxid_datoid or
minmulti_datoid for a value that doesn't actually match our database.

Barring objections I'll commit this tomorrow.

            regards, tom lane

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 8822a15..3f72d5a 100644
*** a/src/backend/commands/vacuum.c
--- b/src/backend/commands/vacuum.c
*************** vac_update_relstats(Relation relation,
*** 733,751 ****
      }

      /*
!      * relfrozenxid should never go backward.  Caller can pass
!      * InvalidTransactionId if it has no new data.
       */
      if (TransactionIdIsNormal(frozenxid) &&
!         TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid))
      {
          pgcform->relfrozenxid = frozenxid;
          dirty = true;
      }

!     /* relminmxid must never go backward, either */
      if (MultiXactIdIsValid(minmulti) &&
!         MultiXactIdPrecedes(pgcform->relminmxid, minmulti))
      {
          pgcform->relminmxid = minmulti;
          dirty = true;
--- 733,765 ----
      }

      /*
!      * Update relfrozenxid, unless caller passed InvalidTransactionId
!      * indicating it has no new data.
!      *
!      * Ordinarily, we don't let relfrozenxid go backwards: if things are
!      * working correctly, the only way the new frozenxid could be older would
!      * be if a previous VACUUM was done with a tighter freeze_min_age, in
!      * which case we don't want to forget the work it already did.  However,
!      * if the stored relfrozenxid is "in the future", then it must be corrupt
!      * and it seems best to overwrite it with the cutoff we used this time.
!      * See vac_update_datfrozenxid() concerning what we consider to be "in the
!      * future".
       */
      if (TransactionIdIsNormal(frozenxid) &&
!         pgcform->relfrozenxid != frozenxid &&
!         (TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid) ||
!          TransactionIdPrecedes(GetOldestXmin(NULL, true),
!                                pgcform->relfrozenxid)))
      {
          pgcform->relfrozenxid = frozenxid;
          dirty = true;
      }

!     /* Similarly for relminmxid */
      if (MultiXactIdIsValid(minmulti) &&
!         pgcform->relminmxid != minmulti &&
!         (MultiXactIdPrecedes(pgcform->relminmxid, minmulti) ||
!          MultiXactIdPrecedes(GetOldestMultiXactId(), pgcform->relminmxid)))
      {
          pgcform->relminmxid = minmulti;
          dirty = true;
*************** vac_update_relstats(Relation relation,
*** 772,779 ****
   *        truncate pg_clog and pg_multixact.
   *
   *        We violate transaction semantics here by overwriting the database's
!  *        existing pg_database tuple with the new value.  This is reasonably
!  *        safe since the new value is correct whether or not this transaction
   *        commits.  As with vac_update_relstats, this avoids leaving dead tuples
   *        behind after a VACUUM.
   */
--- 786,793 ----
   *        truncate pg_clog and pg_multixact.
   *
   *        We violate transaction semantics here by overwriting the database's
!  *        existing pg_database tuple with the new values.  This is reasonably
!  *        safe since the new values are correct whether or not this transaction
   *        commits.  As with vac_update_relstats, this avoids leaving dead tuples
   *        behind after a VACUUM.
   */
*************** vac_update_datfrozenxid(void)
*** 786,792 ****
--- 800,808 ----
      SysScanDesc scan;
      HeapTuple    classTup;
      TransactionId newFrozenXid;
+     TransactionId lastSaneFrozenXid;
      MultiXactId newMinMulti;
+     MultiXactId lastSaneMinMulti;
      bool        dirty = false;

      /*
*************** vac_update_datfrozenxid(void)
*** 795,807 ****
       * committed pg_class entries for new tables; see AddNewRelationTuple().
       * So we cannot produce a wrong minimum by starting with this.
       */
!     newFrozenXid = GetOldestXmin(NULL, true);

      /*
       * Similarly, initialize the MultiXact "min" with the value that would be
       * used on pg_class for new tables.  See AddNewRelationTuple().
       */
!     newMinMulti = GetOldestMultiXactId();

      /*
       * We must seqscan pg_class to find the minimum Xid, because there is no
--- 811,823 ----
       * committed pg_class entries for new tables; see AddNewRelationTuple().
       * So we cannot produce a wrong minimum by starting with this.
       */
!     newFrozenXid = lastSaneFrozenXid = GetOldestXmin(NULL, true);

      /*
       * Similarly, initialize the MultiXact "min" with the value that would be
       * used on pg_class for new tables.  See AddNewRelationTuple().
       */
!     newMinMulti = lastSaneMinMulti = GetOldestMultiXactId();

      /*
       * We must seqscan pg_class to find the minimum Xid, because there is no
*************** vac_update_datfrozenxid(void)
*** 842,847 ****
--- 858,873 ----
      Assert(TransactionIdIsNormal(newFrozenXid));
      Assert(MultiXactIdIsValid(newMinMulti));

+     /*
+      * It's possible that every entry in pg_class is "in the future" (bugs in
+      * pg_upgrade have been known to produce such situations).  Don't update
+      * pg_database, nor truncate CLOG, unless we found at least one
+      * sane-looking entry.
+      */
+     if (!TransactionIdPrecedes(newFrozenXid, lastSaneFrozenXid) ||
+         !MultiXactIdPrecedes(newMinMulti, lastSaneMinMulti))
+         return;
+
      /* Now fetch the pg_database tuple we need to update. */
      relation = heap_open(DatabaseRelationId, RowExclusiveLock);

*************** vac_update_datfrozenxid(void)
*** 852,872 ****
      dbform = (Form_pg_database) GETSTRUCT(tuple);

      /*
!      * Don't allow datfrozenxid to go backward (probably can't happen anyway);
!      * and detect the common case where it doesn't go forward either.
       */
!     if (TransactionIdPrecedes(dbform->datfrozenxid, newFrozenXid))
      {
          dbform->datfrozenxid = newFrozenXid;
          dirty = true;
      }

!     /* ditto */
!     if (MultiXactIdPrecedes(dbform->datminmxid, newMinMulti))
      {
          dbform->datminmxid = newMinMulti;
          dirty = true;
      }

      if (dirty)
          heap_inplace_update(relation, tuple);
--- 878,907 ----
      dbform = (Form_pg_database) GETSTRUCT(tuple);

      /*
!      * As in vac_update_relstats(), we ordinarily don't want to let
!      * datfrozenxid go backward; but if it's "in the future" then it must be
!      * corrupt and it seems best to overwrite it.
       */
!     if (dbform->datfrozenxid != newFrozenXid &&
!         (TransactionIdPrecedes(dbform->datfrozenxid, newFrozenXid) ||
!          TransactionIdPrecedes(lastSaneFrozenXid, dbform->datfrozenxid)))
      {
          dbform->datfrozenxid = newFrozenXid;
          dirty = true;
      }
+     else
+         newFrozenXid = dbform->datfrozenxid;

!     /* Ditto for datminmxid */
!     if (dbform->datminmxid != newMinMulti &&
!         (MultiXactIdPrecedes(dbform->datminmxid, newMinMulti) ||
!          MultiXactIdPrecedes(lastSaneMinMulti, dbform->datminmxid)))
      {
          dbform->datminmxid = newMinMulti;
          dirty = true;
      }
+     else
+         newMinMulti = dbform->datminmxid;

      if (dirty)
          heap_inplace_update(relation, tuple);
*************** vac_update_datfrozenxid(void)
*** 875,883 ****
      heap_close(relation, RowExclusiveLock);

      /*
!      * If we were able to advance datfrozenxid, see if we can truncate
!      * pg_clog. Also do it if the shared XID-wrap-limit info is stale, since
!      * this action will update that too.
       */
      if (dirty || ForceTransactionIdLimitUpdate())
          vac_truncate_clog(newFrozenXid, newMinMulti);
--- 910,918 ----
      heap_close(relation, RowExclusiveLock);

      /*
!      * If we were able to advance datfrozenxid or datminmxid, see if we can
!      * truncate pg_clog and/or pg_multixact.  Also do it if the shared
!      * XID-wrap-limit info is stale, since this action will update that too.
       */
      if (dirty || ForceTransactionIdLimitUpdate())
          vac_truncate_clog(newFrozenXid, newMinMulti);
*************** vac_update_datfrozenxid(void)
*** 890,902 ****
   *        Scan pg_database to determine the system-wide oldest datfrozenxid,
   *        and use it to truncate the transaction commit log (pg_clog).
   *        Also update the XID wrap limit info maintained by varsup.c.
   *
!  *        The passed XID is simply the one I just wrote into my pg_database
!  *        entry.  It's used to initialize the "min" calculation.
   *
   *        This routine is only invoked when we've managed to change our
!  *        DB's datfrozenxid entry, or we found that the shared XID-wrap-limit
!  *        info is stale.
   */
  static void
  vac_truncate_clog(TransactionId frozenXID, MultiXactId minMulti)
--- 925,938 ----
   *        Scan pg_database to determine the system-wide oldest datfrozenxid,
   *        and use it to truncate the transaction commit log (pg_clog).
   *        Also update the XID wrap limit info maintained by varsup.c.
+  *        Likewise for datminmxid.
   *
!  *        The passed XID and minMulti are the updated values for my own
!  *        pg_database entry. They're used to initialize the "min" calculations.
   *
   *        This routine is only invoked when we've managed to change our
!  *        DB's datfrozenxid/datminmxid values, or we found that the shared
!  *        XID-wrap-limit info is stale.
   */
  static void
  vac_truncate_clog(TransactionId frozenXID, MultiXactId minMulti)
*************** vac_truncate_clog(TransactionId frozenXI
*** 909,920 ****
      Oid            minmulti_datoid;
      bool        frozenAlreadyWrapped = false;

!     /* init oldest datoids to sync with my frozen values */
      oldestxid_datoid = MyDatabaseId;
      minmulti_datoid = MyDatabaseId;

      /*
!      * Scan pg_database to compute the minimum datfrozenxid
       *
       * Note: we need not worry about a race condition with new entries being
       * inserted by CREATE DATABASE.  Any such entry will have a copy of some
--- 945,956 ----
      Oid            minmulti_datoid;
      bool        frozenAlreadyWrapped = false;

!     /* init oldest datoids to sync with my frozenXID/minMulti values */
      oldestxid_datoid = MyDatabaseId;
      minmulti_datoid = MyDatabaseId;

      /*
!      * Scan pg_database to compute the minimum datfrozenxid/datminmxid
       *
       * Note: we need not worry about a race condition with new entries being
       * inserted by CREATE DATABASE.  Any such entry will have a copy of some

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Следующее
От: Tom Lane
Дата:
Сообщение: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts