Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Дата
Msg-id 26504.1405912770@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Andres Freund <andres@2ndquadrant.com>)
Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts  (Bruce Momjian <bruce@momjian.us>)
Список pgsql-bugs
I wrote:
> Here's a draft patch for this.  I think this will fix all cases where
> the "1" minmxid inserted by previous pg_upgrade versions is actually
> in the future at the time we run VACUUM.  We would still be at risk if
> it had been in the future when pg_upgrade ran but no longer is now,
> since that would mean there could be non-lock-only mxids on disk that
> are older than "1".  However, for the reasons discussed upthread, it
> seems fairly unlikely to me that people would actually get burnt in
> practice, so I'm satisfied with doing this much and no more.

Ah, belay that: as coded, that would allow truncation of clog/multixact
as soon as any one relation in any one database had sane
frozenxid/minmxid.  If we want to have any pretense of being safe, we have
to postpone truncation until *all* relations have been vacuumed.  So more
like the attached, instead.

            regards, tom lane

diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 8822a15..ec9a7b6 100644
*** a/src/backend/commands/vacuum.c
--- b/src/backend/commands/vacuum.c
*************** static BufferAccessStrategy vac_strategy
*** 66,72 ****

  /* non-export function prototypes */
  static List *get_rel_oids(Oid relid, const RangeVar *vacrel);
! static void vac_truncate_clog(TransactionId frozenXID, MultiXactId minMulti);
  static bool vacuum_rel(Oid relid, VacuumStmt *vacstmt, bool do_toast,
             bool for_wraparound);

--- 66,75 ----

  /* non-export function prototypes */
  static List *get_rel_oids(Oid relid, const RangeVar *vacrel);
! static void vac_truncate_clog(TransactionId frozenXID,
!                   MultiXactId minMulti,
!                   TransactionId lastSaneFrozenXid,
!                   MultiXactId lastSaneMinMulti);
  static bool vacuum_rel(Oid relid, VacuumStmt *vacstmt, bool do_toast,
             bool for_wraparound);

*************** vac_update_relstats(Relation relation,
*** 733,751 ****
      }

      /*
!      * relfrozenxid should never go backward.  Caller can pass
!      * InvalidTransactionId if it has no new data.
       */
      if (TransactionIdIsNormal(frozenxid) &&
!         TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid))
      {
          pgcform->relfrozenxid = frozenxid;
          dirty = true;
      }

!     /* relminmxid must never go backward, either */
      if (MultiXactIdIsValid(minmulti) &&
!         MultiXactIdPrecedes(pgcform->relminmxid, minmulti))
      {
          pgcform->relminmxid = minmulti;
          dirty = true;
--- 736,768 ----
      }

      /*
!      * Update relfrozenxid, unless caller passed InvalidTransactionId
!      * indicating it has no new data.
!      *
!      * Ordinarily, we don't let relfrozenxid go backwards: if things are
!      * working correctly, the only way the new frozenxid could be older would
!      * be if a previous VACUUM was done with a tighter freeze_min_age, in
!      * which case we don't want to forget the work it already did.  However,
!      * if the stored relfrozenxid is "in the future", then it must be corrupt
!      * and it seems best to overwrite it with the cutoff we used this time.
!      * See vac_update_datfrozenxid() concerning what we consider to be "in the
!      * future".
       */
      if (TransactionIdIsNormal(frozenxid) &&
!         pgcform->relfrozenxid != frozenxid &&
!         (TransactionIdPrecedes(pgcform->relfrozenxid, frozenxid) ||
!          TransactionIdPrecedes(GetOldestXmin(NULL, true),
!                                pgcform->relfrozenxid)))
      {
          pgcform->relfrozenxid = frozenxid;
          dirty = true;
      }

!     /* Similarly for relminmxid */
      if (MultiXactIdIsValid(minmulti) &&
!         pgcform->relminmxid != minmulti &&
!         (MultiXactIdPrecedes(pgcform->relminmxid, minmulti) ||
!          MultiXactIdPrecedes(GetOldestMultiXactId(), pgcform->relminmxid)))
      {
          pgcform->relminmxid = minmulti;
          dirty = true;
*************** vac_update_relstats(Relation relation,
*** 772,779 ****
   *        truncate pg_clog and pg_multixact.
   *
   *        We violate transaction semantics here by overwriting the database's
!  *        existing pg_database tuple with the new value.  This is reasonably
!  *        safe since the new value is correct whether or not this transaction
   *        commits.  As with vac_update_relstats, this avoids leaving dead tuples
   *        behind after a VACUUM.
   */
--- 789,796 ----
   *        truncate pg_clog and pg_multixact.
   *
   *        We violate transaction semantics here by overwriting the database's
!  *        existing pg_database tuple with the new values.  This is reasonably
!  *        safe since the new values are correct whether or not this transaction
   *        commits.  As with vac_update_relstats, this avoids leaving dead tuples
   *        behind after a VACUUM.
   */
*************** vac_update_datfrozenxid(void)
*** 786,792 ****
--- 803,812 ----
      SysScanDesc scan;
      HeapTuple    classTup;
      TransactionId newFrozenXid;
+     TransactionId lastSaneFrozenXid;
      MultiXactId newMinMulti;
+     MultiXactId lastSaneMinMulti;
+     bool        bogus = false;
      bool        dirty = false;

      /*
*************** vac_update_datfrozenxid(void)
*** 795,807 ****
       * committed pg_class entries for new tables; see AddNewRelationTuple().
       * So we cannot produce a wrong minimum by starting with this.
       */
!     newFrozenXid = GetOldestXmin(NULL, true);

      /*
       * Similarly, initialize the MultiXact "min" with the value that would be
       * used on pg_class for new tables.  See AddNewRelationTuple().
       */
!     newMinMulti = GetOldestMultiXactId();

      /*
       * We must seqscan pg_class to find the minimum Xid, because there is no
--- 815,827 ----
       * committed pg_class entries for new tables; see AddNewRelationTuple().
       * So we cannot produce a wrong minimum by starting with this.
       */
!     newFrozenXid = lastSaneFrozenXid = GetOldestXmin(NULL, true);

      /*
       * Similarly, initialize the MultiXact "min" with the value that would be
       * used on pg_class for new tables.  See AddNewRelationTuple().
       */
!     newMinMulti = lastSaneMinMulti = GetOldestMultiXactId();

      /*
       * We must seqscan pg_class to find the minimum Xid, because there is no
*************** vac_update_datfrozenxid(void)
*** 828,833 ****
--- 848,868 ----
          Assert(TransactionIdIsNormal(classForm->relfrozenxid));
          Assert(MultiXactIdIsValid(classForm->relminmxid));

+         /*
+          * If things are working properly, no relation should have a
+          * relfrozenxid or relminmxid that is "in the future".  However, such
+          * cases have been known to arise due to bugs in pg_upgrade.  If we
+          * see any entries that are "in the future", chicken out and don't do
+          * anything.  This ensures we won't truncate clog before those
+          * relations have been scanned and cleaned up.
+          */
+         if (TransactionIdPrecedes(lastSaneFrozenXid, classForm->relfrozenxid) ||
+             MultiXactIdPrecedes(lastSaneMinMulti, classForm->relminmxid))
+         {
+             bogus = true;
+             break;
+         }
+
          if (TransactionIdPrecedes(classForm->relfrozenxid, newFrozenXid))
              newFrozenXid = classForm->relfrozenxid;

*************** vac_update_datfrozenxid(void)
*** 839,844 ****
--- 874,883 ----
      systable_endscan(scan);
      heap_close(relation, AccessShareLock);

+     /* chicken out if bogus data found */
+     if (bogus)
+         return;
+
      Assert(TransactionIdIsNormal(newFrozenXid));
      Assert(MultiXactIdIsValid(newMinMulti));

*************** vac_update_datfrozenxid(void)
*** 852,872 ****
      dbform = (Form_pg_database) GETSTRUCT(tuple);

      /*
!      * Don't allow datfrozenxid to go backward (probably can't happen anyway);
!      * and detect the common case where it doesn't go forward either.
       */
!     if (TransactionIdPrecedes(dbform->datfrozenxid, newFrozenXid))
      {
          dbform->datfrozenxid = newFrozenXid;
          dirty = true;
      }

!     /* ditto */
!     if (MultiXactIdPrecedes(dbform->datminmxid, newMinMulti))
      {
          dbform->datminmxid = newMinMulti;
          dirty = true;
      }

      if (dirty)
          heap_inplace_update(relation, tuple);
--- 891,920 ----
      dbform = (Form_pg_database) GETSTRUCT(tuple);

      /*
!      * As in vac_update_relstats(), we ordinarily don't want to let
!      * datfrozenxid go backward; but if it's "in the future" then it must be
!      * corrupt and it seems best to overwrite it.
       */
!     if (dbform->datfrozenxid != newFrozenXid &&
!         (TransactionIdPrecedes(dbform->datfrozenxid, newFrozenXid) ||
!          TransactionIdPrecedes(lastSaneFrozenXid, dbform->datfrozenxid)))
      {
          dbform->datfrozenxid = newFrozenXid;
          dirty = true;
      }
+     else
+         newFrozenXid = dbform->datfrozenxid;

!     /* Ditto for datminmxid */
!     if (dbform->datminmxid != newMinMulti &&
!         (MultiXactIdPrecedes(dbform->datminmxid, newMinMulti) ||
!          MultiXactIdPrecedes(lastSaneMinMulti, dbform->datminmxid)))
      {
          dbform->datminmxid = newMinMulti;
          dirty = true;
      }
+     else
+         newMinMulti = dbform->datminmxid;

      if (dirty)
          heap_inplace_update(relation, tuple);
*************** vac_update_datfrozenxid(void)
*** 875,886 ****
      heap_close(relation, RowExclusiveLock);

      /*
!      * If we were able to advance datfrozenxid, see if we can truncate
!      * pg_clog. Also do it if the shared XID-wrap-limit info is stale, since
!      * this action will update that too.
       */
      if (dirty || ForceTransactionIdLimitUpdate())
!         vac_truncate_clog(newFrozenXid, newMinMulti);
  }


--- 923,935 ----
      heap_close(relation, RowExclusiveLock);

      /*
!      * If we were able to advance datfrozenxid or datminmxid, see if we can
!      * truncate pg_clog and/or pg_multixact.  Also do it if the shared
!      * XID-wrap-limit info is stale, since this action will update that too.
       */
      if (dirty || ForceTransactionIdLimitUpdate())
!         vac_truncate_clog(newFrozenXid, newMinMulti,
!                           lastSaneFrozenXid, lastSaneMinMulti);
  }


*************** vac_update_datfrozenxid(void)
*** 890,905 ****
   *        Scan pg_database to determine the system-wide oldest datfrozenxid,
   *        and use it to truncate the transaction commit log (pg_clog).
   *        Also update the XID wrap limit info maintained by varsup.c.
   *
!  *        The passed XID is simply the one I just wrote into my pg_database
!  *        entry.  It's used to initialize the "min" calculation.
   *
   *        This routine is only invoked when we've managed to change our
!  *        DB's datfrozenxid entry, or we found that the shared XID-wrap-limit
!  *        info is stale.
   */
  static void
! vac_truncate_clog(TransactionId frozenXID, MultiXactId minMulti)
  {
      TransactionId myXID = GetCurrentTransactionId();
      Relation    relation;
--- 939,960 ----
   *        Scan pg_database to determine the system-wide oldest datfrozenxid,
   *        and use it to truncate the transaction commit log (pg_clog).
   *        Also update the XID wrap limit info maintained by varsup.c.
+  *        Likewise for datminmxid.
   *
!  *        The passed frozenXID and minMulti are the updated values for my own
!  *        pg_database entry. They're used to initialize the "min" calculations.
!  *        The caller also passes the "last sane" XID and MXID, since it has
!  *        those at hand already.
   *
   *        This routine is only invoked when we've managed to change our
!  *        DB's datfrozenxid/datminmxid values, or we found that the shared
!  *        XID-wrap-limit info is stale.
   */
  static void
! vac_truncate_clog(TransactionId frozenXID,
!                   MultiXactId minMulti,
!                   TransactionId lastSaneFrozenXid,
!                   MultiXactId lastSaneMinMulti)
  {
      TransactionId myXID = GetCurrentTransactionId();
      Relation    relation;
*************** vac_truncate_clog(TransactionId frozenXI
*** 907,920 ****
      HeapTuple    tuple;
      Oid            oldestxid_datoid;
      Oid            minmulti_datoid;
      bool        frozenAlreadyWrapped = false;

!     /* init oldest datoids to sync with my frozen values */
      oldestxid_datoid = MyDatabaseId;
      minmulti_datoid = MyDatabaseId;

      /*
!      * Scan pg_database to compute the minimum datfrozenxid
       *
       * Note: we need not worry about a race condition with new entries being
       * inserted by CREATE DATABASE.  Any such entry will have a copy of some
--- 962,976 ----
      HeapTuple    tuple;
      Oid            oldestxid_datoid;
      Oid            minmulti_datoid;
+     bool        bogus = false;
      bool        frozenAlreadyWrapped = false;

!     /* init oldest datoids to sync with my frozenXID/minMulti values */
      oldestxid_datoid = MyDatabaseId;
      minmulti_datoid = MyDatabaseId;

      /*
!      * Scan pg_database to compute the minimum datfrozenxid/datminmxid
       *
       * Note: we need not worry about a race condition with new entries being
       * inserted by CREATE DATABASE.  Any such entry will have a copy of some
*************** vac_truncate_clog(TransactionId frozenXI
*** 936,941 ****
--- 992,1010 ----
          Assert(TransactionIdIsNormal(dbform->datfrozenxid));
          Assert(MultiXactIdIsValid(dbform->datminmxid));

+         /*
+          * If things are working properly, no database should have a
+          * datfrozenxid or datminmxid that is "in the future".  However, such
+          * cases have been known to arise due to bugs in pg_upgrade.  If we
+          * see any entries that are "in the future", chicken out and don't do
+          * anything.  This ensures we won't truncate clog before those
+          * databases have been scanned and cleaned up.  (We will issue the
+          * "already wrapped" warning if appropriate, though.)
+          */
+         if (TransactionIdPrecedes(lastSaneFrozenXid, dbform->datfrozenxid) ||
+             MultiXactIdPrecedes(lastSaneMinMulti, dbform->datminmxid))
+             bogus = true;
+
          if (TransactionIdPrecedes(myXID, dbform->datfrozenxid))
              frozenAlreadyWrapped = true;
          else if (TransactionIdPrecedes(dbform->datfrozenxid, frozenXID))
*************** vac_truncate_clog(TransactionId frozenXI
*** 969,974 ****
--- 1038,1047 ----
          return;
      }

+     /* chicken out if data is bogus in any other way */
+     if (bogus)
+         return;
+
      /*
       * Truncate CLOG to the oldest computed value.  Note we don't truncate
       * multixacts; that will be done by the next checkpoint.

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: pg_upgrade < 9.3 -> >=9.3 misses a step around multixacts
Следующее
От: "Burgess, Freddie"
Дата:
Сообщение: PostgreSQL 9.3.4 Query Problems