Re: [BUGS] BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: [BUGS] BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)
Дата
Msg-id 6382.1317081585@sss.pgh.pa.us
обсуждение исходный текст
Ответы Re: [BUGS] BUG #6218: TRAP: FailedAssertion( "!(owner->nsnapshots == 0)", File: "resowner.c", Line: 365)
Список pgsql-hackers
I wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
>> Shall I work on a fix?  I expect you are plenty busy with commitfest
>> stuff, but please let me know otherwise.

> I have what-I-think-is-the-fix pretty clear in my own mind, so let me
> give it a try.  If it doesn't work I'll bounce it back to you.

Well, I soon ran into the issue that delaying the snapshot release makes
TopTransactionResourceOwner spit up.  After some reflection I decided
that the real problem is a circular dependency: snapshot management must
be considered lower-level than ResourceOwners because ResourceOwners
tell snapshot management what to do, but here we have
GetTransactionSnapshot trying to use TopTransactionResourceOwner to
manage its internal reference to the transaction snapshot.

Accordingly, the attached proposed patch gets rid of the circularity
by removing snapmgr.c's dependency on TopTransactionResourceOwner,
in favor of having it track the refcount "manually".  This was messier
than I'd hoped because the bogus design had propagated into the SSI
manager meanwhile, but removing the TopTransactionResourceOwner
dependency from that too seems like a good idea.

This passes the regular and isolation regression tests, and it's also
okay with Yamamoto-san's prepared-ROLLBACK test case even without the
band-aid fix in plancache.c.  I can't immediately think of any other
test cases to throw at it.

Comments?

            regards, tom lane

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index de3f965b37a4be93913907b1683c07ea24b4e633..3dab45c2da60a1010d566fccf658a4a55ee3ece3 100644
*** a/src/backend/access/transam/xact.c
--- b/src/backend/access/transam/xact.c
*************** CommitTransaction(void)
*** 1906,1914 ****
      /* Clean up the relation cache */
      AtEOXact_RelationCache(true);

-     /* Clean up the snapshot manager */
-     AtEarlyCommit_Snapshot();
-
      /*
       * Make catalog changes visible to all backends.  This has to happen after
       * relcache references are dropped (see comments for
--- 1906,1911 ----
*************** PrepareTransaction(void)
*** 2158,2166 ****
      /* Clean up the relation cache */
      AtEOXact_RelationCache(true);

-     /* Clean up the snapshot manager */
-     AtEarlyCommit_Snapshot();
-
      /* notify doesn't need a postprepare call */

      PostPrepare_PgStat();
--- 2155,2160 ----
*************** AbortTransaction(void)
*** 2339,2345 ****
          AtEOXact_ComboCid();
          AtEOXact_HashTables(false);
          AtEOXact_PgStat(false);
-         AtEOXact_Snapshot(false);
          pgstat_report_xact_timestamp(0);
      }

--- 2333,2338 ----
*************** CleanupTransaction(void)
*** 2368,2373 ****
--- 2361,2367 ----
       * do abort cleanup processing
       */
      AtCleanup_Portals();        /* now safe to release portal memory */
+     AtEOXact_Snapshot(false);    /* and release the transaction's snapshots */

      CurrentResourceOwner = NULL;    /* and resource owner */
      if (TopTransactionResourceOwner)
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 1754180453daaa54e732bcd82cae5f3b70631524..d39f8975f8816d9bba02c0c398323ca470f1340b 100644
*** a/src/backend/storage/lmgr/predicate.c
--- b/src/backend/storage/lmgr/predicate.c
***************
*** 146,152 ****
   *        PageIsPredicateLocked(Relation relation, BlockNumber blkno)
   *
   * predicate lock maintenance
!  *        RegisterSerializableTransaction(Snapshot snapshot)
   *        RegisterPredicateLockingXid(void)
   *        PredicateLockRelation(Relation relation, Snapshot snapshot)
   *        PredicateLockPage(Relation relation, BlockNumber blkno,
--- 146,152 ----
   *        PageIsPredicateLocked(Relation relation, BlockNumber blkno)
   *
   * predicate lock maintenance
!  *        GetSerializableTransactionSnapshot(Snapshot snapshot)
   *        RegisterPredicateLockingXid(void)
   *        PredicateLockRelation(Relation relation, Snapshot snapshot)
   *        PredicateLockPage(Relation relation, BlockNumber blkno,
*************** static void OldSerXidSetActiveSerXmin(Tr
*** 417,423 ****
  static uint32 predicatelock_hash(const void *key, Size keysize);
  static void SummarizeOldestCommittedSxact(void);
  static Snapshot GetSafeSnapshot(Snapshot snapshot);
! static Snapshot RegisterSerializableTransactionInt(Snapshot snapshot);
  static bool PredicateLockExists(const PREDICATELOCKTARGETTAG *targettag);
  static bool GetParentPredicateLockTag(const PREDICATELOCKTARGETTAG *tag,
                            PREDICATELOCKTARGETTAG *parent);
--- 417,423 ----
  static uint32 predicatelock_hash(const void *key, Size keysize);
  static void SummarizeOldestCommittedSxact(void);
  static Snapshot GetSafeSnapshot(Snapshot snapshot);
! static Snapshot GetSerializableTransactionSnapshotInt(Snapshot snapshot);
  static bool PredicateLockExists(const PREDICATELOCKTARGETTAG *targettag);
  static bool GetParentPredicateLockTag(const PREDICATELOCKTARGETTAG *tag,
                            PREDICATELOCKTARGETTAG *parent);
*************** SummarizeOldestCommittedSxact(void)
*** 1485,1490 ****
--- 1485,1494 ----
   *        without further checks. This requires waiting for concurrent
   *        transactions to complete, and retrying with a new snapshot if
   *        one of them could possibly create a conflict.
+  *
+  *        As with GetSerializableTransactionSnapshot (which this is a subroutine
+  *        for), the passed-in Snapshot pointer should reference a static data
+  *        area that can safely be passed to GetSnapshotData.
   */
  static Snapshot
  GetSafeSnapshot(Snapshot origSnapshot)
*************** GetSafeSnapshot(Snapshot origSnapshot)
*** 1496,1507 ****
      while (true)
      {
          /*
!          * RegisterSerializableTransactionInt is going to call
!          * GetSnapshotData, so we need to provide it the static snapshot our
!          * caller passed to us. It returns a copy of that snapshot and
!          * registers it on TopTransactionResourceOwner.
           */
!         snapshot = RegisterSerializableTransactionInt(origSnapshot);

          if (MySerializableXact == InvalidSerializableXact)
              return snapshot;    /* no concurrent r/w xacts; it's safe */
--- 1500,1511 ----
      while (true)
      {
          /*
!          * GetSerializableTransactionSnapshotInt is going to call
!          * GetSnapshotData, so we need to provide it the static snapshot area
!          * our caller passed to us.  The pointer returned is actually the same
!          * one passed to it, but we avoid assuming that here.
           */
!         snapshot = GetSerializableTransactionSnapshotInt(origSnapshot);

          if (MySerializableXact == InvalidSerializableXact)
              return snapshot;    /* no concurrent r/w xacts; it's safe */
*************** GetSafeSnapshot(Snapshot origSnapshot)
*** 1535,1542 ****
                  (errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
                   errmsg("deferrable snapshot was unsafe; trying a new one")));
          ReleasePredicateLocks(false);
-         UnregisterSnapshotFromOwner(snapshot,
-                                     TopTransactionResourceOwner);
      }

      /*
--- 1539,1544 ----
*************** GetSafeSnapshot(Snapshot origSnapshot)
*** 1549,1576 ****
  }

  /*
!  * Acquire and register a snapshot which can be used for this transaction..
   * Make sure we have a SERIALIZABLEXACT reference in MySerializableXact.
   * It should be current for this process and be contained in PredXact.
   */
  Snapshot
! RegisterSerializableTransaction(Snapshot snapshot)
  {
      Assert(IsolationIsSerializable());

      /*
       * A special optimization is available for SERIALIZABLE READ ONLY
       * DEFERRABLE transactions -- we can wait for a suitable snapshot and
!      * thereby avoid all SSI overhead once it's running..
       */
      if (XactReadOnly && XactDeferrable)
          return GetSafeSnapshot(snapshot);

!     return RegisterSerializableTransactionInt(snapshot);
  }

  static Snapshot
! RegisterSerializableTransactionInt(Snapshot snapshot)
  {
      PGPROC       *proc;
      VirtualTransactionId vxid;
--- 1551,1584 ----
  }

  /*
!  * Acquire a snapshot that can be used for the current transaction.
!  *
   * Make sure we have a SERIALIZABLEXACT reference in MySerializableXact.
   * It should be current for this process and be contained in PredXact.
+  *
+  * The passed-in Snapshot pointer should reference a static data area that
+  * can safely be passed to GetSnapshotData.  The return value is actually
+  * always this same pointer; no new snapshot data structure is allocated
+  * within this function.
   */
  Snapshot
! GetSerializableTransactionSnapshot(Snapshot snapshot)
  {
      Assert(IsolationIsSerializable());

      /*
       * A special optimization is available for SERIALIZABLE READ ONLY
       * DEFERRABLE transactions -- we can wait for a suitable snapshot and
!      * thereby avoid all SSI overhead once it's running.
       */
      if (XactReadOnly && XactDeferrable)
          return GetSafeSnapshot(snapshot);

!     return GetSerializableTransactionSnapshotInt(snapshot);
  }

  static Snapshot
! GetSerializableTransactionSnapshotInt(Snapshot snapshot)
  {
      PGPROC       *proc;
      VirtualTransactionId vxid;
*************** RegisterSerializableTransactionInt(Snaps
*** 1607,1615 ****
          }
      } while (!sxact);

!     /* Get and register a snapshot */
      snapshot = GetSnapshotData(snapshot);
-     snapshot = RegisterSnapshotOnOwner(snapshot, TopTransactionResourceOwner);

      /*
       * If there are no serializable transactions which are not read-only, we
--- 1615,1622 ----
          }
      } while (!sxact);

!     /* Get the snapshot */
      snapshot = GetSnapshotData(snapshot);

      /*
       * If there are no serializable transactions which are not read-only, we
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index bb25ac6ab2c464480258335297ff4500c049bda6..91229cd21bbc43fc27670c65ca7c8246668b6434 100644
*** a/src/backend/utils/time/snapmgr.c
--- b/src/backend/utils/time/snapmgr.c
***************
*** 7,12 ****
--- 7,20 ----
   * persistent memory.  When a snapshot is no longer in any of these lists
   * (tracked by separate refcounts on each snapshot), its memory can be freed.
   *
+  * The FirstXactSnapshot, if any, is treated a bit specially: we increment its
+  * regd_count and count it in RegisteredSnapshots, but this reference is not
+  * tracked by a resource owner.  We used to use the TopTransactionResourceOwner
+  * to track this snapshot reference, but that introduces logical circularity
+  * and thus makes it impossible to clean up in a sane fashion.  It's better to
+  * handle this reference as an internally-tracked registration, so that this
+  * module is entirely lower-level than ResourceOwners.
+  *
   * These arrangements let us reset MyProc->xmin when there are no snapshots
   * referenced by this transaction.    (One possible improvement would be to be
   * able to advance Xmin when the snapshot with the earliest Xmin is no longer
*************** static int    RegisteredSnapshots = 0;
*** 97,107 ****
  bool        FirstSnapshotSet = false;

  /*
!  * Remembers whether this transaction registered a transaction snapshot at
!  * start.  We cannot trust FirstSnapshotSet in combination with
!  * IsolationUsesXactSnapshot(), because GUC may be reset before us.
   */
! static bool registered_xact_snapshot = false;


  static Snapshot CopySnapshot(Snapshot snapshot);
--- 105,115 ----
  bool        FirstSnapshotSet = false;

  /*
!  * Remember the serializable transaction snapshot, if any.  We cannot trust
!  * FirstSnapshotSet in combination with IsolationUsesXactSnapshot(), because
!  * GUC may be reset before us, changing the value of IsolationUsesXactSnapshot.
   */
! static Snapshot FirstXactSnapshot = NULL;


  static Snapshot CopySnapshot(Snapshot snapshot);
*************** GetTransactionSnapshot(void)
*** 125,147 ****
      if (!FirstSnapshotSet)
      {
          Assert(RegisteredSnapshots == 0);

          /*
           * In transaction-snapshot mode, the first snapshot must live until
           * end of xact regardless of what the caller does with it, so we must
!          * register it internally here and unregister it at end of xact.
           */
          if (IsolationUsesXactSnapshot())
          {
              if (IsolationIsSerializable())
!                 CurrentSnapshot = RegisterSerializableTransaction(&CurrentSnapshotData);
              else
-             {
                  CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
!                 CurrentSnapshot = RegisterSnapshotOnOwner(CurrentSnapshot,
!                                                 TopTransactionResourceOwner);
!             }
!             registered_xact_snapshot = true;
          }
          else
              CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
--- 133,159 ----
      if (!FirstSnapshotSet)
      {
          Assert(RegisteredSnapshots == 0);
+         Assert(FirstXactSnapshot == NULL);

          /*
           * In transaction-snapshot mode, the first snapshot must live until
           * end of xact regardless of what the caller does with it, so we must
!          * make a copy of it rather than returning CurrentSnapshotData
!          * directly.
           */
          if (IsolationUsesXactSnapshot())
          {
+             /* First, create the snapshot in CurrentSnapshotData */
              if (IsolationIsSerializable())
!                 CurrentSnapshot = GetSerializableTransactionSnapshot(&CurrentSnapshotData);
              else
                  CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
!             /* Make a saved copy */
!             CurrentSnapshot = CopySnapshot(CurrentSnapshot);
!             FirstXactSnapshot = CurrentSnapshot;
!             /* Mark it as "registered" in FirstXactSnapshot */
!             FirstXactSnapshot->regd_count++;
!             RegisteredSnapshots++;
          }
          else
              CurrentSnapshot = GetSnapshotData(&CurrentSnapshotData);
*************** AtSubAbort_Snapshot(int level)
*** 523,554 ****
  }

  /*
-  * AtEarlyCommit_Snapshot
-  *
-  * Snapshot manager's cleanup function, to be called on commit, before
-  * doing resowner.c resource release.
-  */
- void
- AtEarlyCommit_Snapshot(void)
- {
-     /*
-      * In transaction-snapshot mode we must unregister our private refcount to
-      * the transaction-snapshot.
-      */
-     if (registered_xact_snapshot)
-         UnregisterSnapshotFromOwner(CurrentSnapshot,
-                                     TopTransactionResourceOwner);
-     registered_xact_snapshot = false;
-
- }
-
- /*
   * AtEOXact_Snapshot
   *        Snapshot manager's cleanup function for end of transaction
   */
  void
  AtEOXact_Snapshot(bool isCommit)
  {
      /* On commit, complain about leftover snapshots */
      if (isCommit)
      {
--- 535,563 ----
  }

  /*
   * AtEOXact_Snapshot
   *        Snapshot manager's cleanup function for end of transaction
   */
  void
  AtEOXact_Snapshot(bool isCommit)
  {
+     /*
+      * In transaction-snapshot mode we must release our privately-managed
+      * reference to the transaction snapshot.  We must decrement
+      * RegisteredSnapshots to keep the check below happy.  But we don't bother
+      * to do FreeSnapshot, for two reasons: the memory will go away with
+      * TopTransactionContext anyway, and if someone has left the snapshot
+      * stacked as active, we don't want the code below to be chasing through
+      * a dangling pointer.
+      */
+     if (FirstXactSnapshot != NULL)
+     {
+         Assert(FirstXactSnapshot->regd_count > 0);
+         Assert(RegisteredSnapshots > 0);
+         RegisteredSnapshots--;
+     }
+     FirstXactSnapshot = NULL;
+
      /* On commit, complain about leftover snapshots */
      if (isCommit)
      {
*************** AtEOXact_Snapshot(bool isCommit)
*** 574,578 ****
      SecondarySnapshot = NULL;

      FirstSnapshotSet = false;
!     registered_xact_snapshot = false;
  }
--- 583,588 ----
      SecondarySnapshot = NULL;

      FirstSnapshotSet = false;
!
!     SnapshotResetXmin();
  }
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index 5ddbc1d3f413506fcd4dff851a5960c112f227ea..9603b10ad4052e55109b976d521d6054b656dc8a 100644
*** a/src/include/storage/predicate.h
--- b/src/include/storage/predicate.h
*************** extern void CheckPointPredicate(void);
*** 42,48 ****
  extern bool PageIsPredicateLocked(Relation relation, BlockNumber blkno);

  /* predicate lock maintenance */
! extern Snapshot RegisterSerializableTransaction(Snapshot snapshot);
  extern void RegisterPredicateLockingXid(TransactionId xid);
  extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
  extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
--- 42,48 ----
  extern bool PageIsPredicateLocked(Relation relation, BlockNumber blkno);

  /* predicate lock maintenance */
! extern Snapshot GetSerializableTransactionSnapshot(Snapshot snapshot);
  extern void RegisterPredicateLockingXid(TransactionId xid);
  extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
  extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index c969a37796173bdab05925d3a30364b3fe1bde7e..e665a28aff8571e50ce318d8a0ca4710ec503b3c 100644
*** a/src/include/utils/snapmgr.h
--- b/src/include/utils/snapmgr.h
*************** extern void UnregisterSnapshotFromOwner(
*** 40,46 ****

  extern void AtSubCommit_Snapshot(int level);
  extern void AtSubAbort_Snapshot(int level);
- extern void AtEarlyCommit_Snapshot(void);
  extern void AtEOXact_Snapshot(bool isCommit);

  #endif   /* SNAPMGR_H */
--- 40,45 ----

В списке pgsql-hackers по дате отправления:

Предыдущее
От: panam
Дата:
Сообщение: Re: fix for pg_upgrade
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: fix for pg_upgrade