Discussion: Another bug introduced by fastpath patch


Another bug introduced by fastpath patch

From: Tom Lane
Date:
In LockReleaseAll, we have this coding:
    for (partition = 0; partition < NUM_LOCK_PARTITIONS; partition++)
    {
        LWLockId    partitionLock = FirstLockMgrLock + partition;
        SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);

        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
                                             offsetof(PROCLOCK, procLink));

        if (!proclock)
            continue;            /* needn't examine this partition */

        LWLockAcquire(partitionLock, LW_EXCLUSIVE);

        ... process the proclock list ...

        LWLockRelease(partitionLock);
    }                            /* loop over partitions */


That is, we check the proclock list head before acquiring the matching
partitionLock, and assume we can skip scanning this partition if the list
is empty.  Once upon a time that was safe, but since the fast-path patch
it's not very safe, because some other backend could be in process of
promoting one of our fast-path locks (and hence changing the proclock
list).  If we could be sure that we had removed *all* our fastpath locks
in the earlier loop then it'd be okay, but because LockReleaseAll's name
is a bit of a misnomer, I don't think we can assume that.

The simplest fix is just to move the fetch of the list header down
past the LWLockAcquire.  This is pretty annoying, though, because in
typical small transactions that will require a lot of additional
partition lock acquire/releases.
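
In code form, that simpler ordering would look about like this (just a
sketch of the idea, not a tested patch):

        LWLockAcquire(partitionLock, LW_EXCLUSIVE);

        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
                                             offsetof(PROCLOCK, procLink));
        if (!proclock)
        {
            LWLockRelease(partitionLock);
            continue;            /* needn't examine this partition */
        }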

I think it might be safe to check the list header, skip the partition
if it's null, but if it isn't then acquire the lock *and re-fetch the
list header*.  The idea here is that if there is someone else busy
promoting one of our fastpath locks, it must be a lock we'd decided not
to delete in the previous loop, and so we would again decide not to
delete it in the main loop.  Therefore, it doesn't matter if we decide
to skip a partition microseconds before somebody changes the list header
from null to not-null.  However, once we've decided to process a
partition, we'd better get an up-to-date view of the list state.
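
A minimal sketch of that ordering, reusing the calls already in the loop
(illustrative only, not yet a tested patch):

        /* Peek without the partition lock; empty means nothing we must release */
        if (SHMQueueNext(procLocks, procLocks,
                         offsetof(PROCLOCK, procLink)) == NULL)
            continue;            /* needn't examine this partition */

        LWLockAcquire(partitionLock, LW_EXCLUSIVE);

        /* Re-fetch the list header now that we hold the lock */
        proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
                                             offsetof(PROCLOCK, procLink));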

This does do some damage to the original concept that allLocks mode
would guarantee to clean up all locks of the target lockmethod even
if the locallock table was damaged.  I wonder whether we shouldn't
get rid of the existing logic about fastpath in the loop over locallocks
entirely, and replace it with code that just zeroes the fastpath array
when lockmethodid == DEFAULT_LOCKMETHOD and allLocks.  That would likely
be faster than what you've got here as well as more robust.  Or we
could add a restriction to EligibleForRelationFastPath that restricts
the fast-path mechanism to non-session locks, in which case we'd not
need to make the zeroing contingent on allLocks either.  I don't think
we take any fast-path-eligible locks in session mode anyway, so this
wouldn't be giving up any functionality on that end.
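
Roughly, and assuming the existing fast-path fields on PGPROC (fpLockBits
protected by MyProc->backendLock), the zeroing would amount to something
like this sketch:

    if (lockmethodid == DEFAULT_LOCKMETHOD && allLocks)
    {
        /* Sketch only: forget all of our fast-path relation locks wholesale */
        LWLockAcquire(MyProc->backendLock, LW_EXCLUSIVE);
        MyProc->fpLockBits = 0;
        LWLockRelease(MyProc->backendLock);
    }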

Comments?
        regards, tom lane



Re: Another bug introduced by fastpath patch

From: Andres Freund
Date:
On 2013-11-27 17:25:44 -0500, Tom Lane wrote:
> Or we
> could add a restriction to EligibleForRelationFastPath that restricts
> the fast-path mechanism to non-session locks, in which case we'd not
> need to make the zeroing contingent on allLocks either.  I don't think
> we take any fast-path-eligible locks in session mode anyway, so this
> wouldn't be giving up any functionality on that end.

That seems like the best thing to do to me.

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Another bug introduced by fastpath patch

From: Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-11-27 17:25:44 -0500, Tom Lane wrote:
>> Or we
>> could add a restriction to EligibleForRelationFastPath that restricts
>> the fast-path mechanism to non-session locks, in which case we'd not
>> need to make the zeroing contingent on allLocks either.  I don't think
>> we take any fast-path-eligible locks in session mode anyway, so this
>> wouldn't be giving up any functionality on that end.

> That seems like the best thing to do to me.

I poked a bit more at this and realized that what I had in mind above
doesn't work: a session lock isn't a distinct lock type, just a different
owner, and moreover usually we ask for a session lock on a rel that we
already have locked normally.  So it doesn't work to say "skip trying fast
path if sessionLock" --- we typically won't get that far in the code.
We could conceivably make it work if we were willing to forcibly promote
an existing fastpath lock to regular when a session lock gets asked for,
but that's complication I don't want much, especially since it would go
completely untested in normal use.  (I verified the claim above that we
never take session locks on any fast-path-eligible lock types.)

We could still do this if we were willing to actually reject requests
for session level locks on fast-path-eligible lock types.  At the moment
that costs us nothing really.  If anyone ever did have a use case, we
could consider adding the extra logic to support it.
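
The rejection itself would be cheap; a hypothetical guard in
LockAcquireExtended might look like this (sketch only, names as in the
existing code):

    if (sessionLock && EligibleForRelationFastPath(locktag, lockmode))
        elog(ERROR, "session-level locks are not supported on fast-path-eligible lock types");
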
        regards, tom lane



Re: Another bug introduced by fastpath patch

From: Tom Lane
Date:
I wrote:
> We could still do this if we were willing to actually reject requests
> for session level locks on fast-path-eligible lock types.  At the moment
> that costs us nothing really.  If anyone ever did have a use case, we
> could consider adding the extra logic to support it.

Nope, that *still* doesn't work, because in non-allLocks mode the main
loop won't clear any locks that have been promoted from fastpath to
regular.  Sigh.  For the moment I'm proposing that we just re-fetch
the list header after acquiring the lock.  The attached patch is slightly
more verbose than that, because I took the opportunity to reformulate the
while() loop as a for() loop and thereby eliminate some goto's.

            regards, tom lane

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 072b276..75df457 100644
*** a/src/backend/storage/lmgr/lock.c
--- b/src/backend/storage/lmgr/lock.c
*************** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2098,2116 ****
      {
          LWLockId    partitionLock = FirstLockMgrLock + partition;
          SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);

!         proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
!                                              offsetof(PROCLOCK, procLink));
!
!         if (!proclock)
              continue;            /* needn't examine this partition */

          LWLockAcquire(partitionLock, LW_EXCLUSIVE);

!         while (proclock)
          {
              bool        wakeupNeeded = false;
-             PROCLOCK   *nextplock;

              /* Get link first, since we may unlink/delete this proclock */
              nextplock = (PROCLOCK *)
--- 2098,2134 ----
      {
          LWLockId    partitionLock = FirstLockMgrLock + partition;
          SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
+         PROCLOCK   *nextplock;

!         /*
!          * If the proclock list for this partition is empty, we can skip
!          * acquiring the partition lock.  This optimization is trickier than
!          * it looks, because another backend could be in process of adding
!          * something to our proclock list due to promoting one of our
!          * fast-path locks.  However, any such lock must be one that we
!          * decided not to delete above, so it's okay to skip it again now;
!          * we'd just decide not to delete it again.  We must, however, be
!          * careful to re-fetch the list header once we've acquired the
!          * partition lock, to be sure we see the up to date version.
!          *
!          * XXX This argument assumes that the locallock table correctly
!          * represents all of our fast-path locks.  While allLocks mode
!          * guarantees to clean up all of our normal locks regardless of the
!          * locallock situation, we lose that guarantee for fast-path locks.
!          * This is not ideal.
!          */
!         if (SHMQueueNext(procLocks, procLocks,
!                          offsetof(PROCLOCK, procLink)) == NULL)
              continue;            /* needn't examine this partition */

          LWLockAcquire(partitionLock, LW_EXCLUSIVE);

!         for (proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
!                                                offsetof(PROCLOCK, procLink));
!              proclock;
!              proclock = nextplock)
          {
              bool        wakeupNeeded = false;

              /* Get link first, since we may unlink/delete this proclock */
              nextplock = (PROCLOCK *)
*************** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2123,2129 ****

              /* Ignore items that are not of the lockmethod to be removed */
              if (LOCK_LOCKMETHOD(*lock) != lockmethodid)
!                 goto next_item;

              /*
               * In allLocks mode, force release of all locks even if locallock
--- 2141,2147 ----

              /* Ignore items that are not of the lockmethod to be removed */
              if (LOCK_LOCKMETHOD(*lock) != lockmethodid)
!                 continue;

              /*
               * In allLocks mode, force release of all locks even if locallock
*************** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2139,2145 ****
               * holdMask == 0 and are therefore recyclable
               */
              if (proclock->releaseMask == 0 && proclock->holdMask != 0)
!                 goto next_item;

              PROCLOCK_PRINT("LockReleaseAll", proclock);
              LOCK_PRINT("LockReleaseAll", lock, 0);
--- 2157,2163 ----
               * holdMask == 0 and are therefore recyclable
               */
              if (proclock->releaseMask == 0 && proclock->holdMask != 0)
!                 continue;

              PROCLOCK_PRINT("LockReleaseAll", proclock);
              LOCK_PRINT("LockReleaseAll", lock, 0);
*************** LockReleaseAll(LOCKMETHODID lockmethodid
*** 2168,2176 ****
                          lockMethodTable,
                          LockTagHashCode(&lock->tag),
                          wakeupNeeded);
-
-     next_item:
-             proclock = nextplock;
          }                        /* loop over PROCLOCKs within this partition */

          LWLockRelease(partitionLock);
--- 2186,2191 ----
*************** PostPrepare_Locks(TransactionId xid)
*** 3142,3160 ****
      {
          LWLockId    partitionLock = FirstLockMgrLock + partition;
          SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);

!         proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
!                                              offsetof(PROCLOCK, procLink));
!
!         if (!proclock)
              continue;            /* needn't examine this partition */

          LWLockAcquire(partitionLock, LW_EXCLUSIVE);

!         while (proclock)
          {
-             PROCLOCK   *nextplock;
-
              /* Get link first, since we may unlink/relink this proclock */
              nextplock = (PROCLOCK *)
                  SHMQueueNext(procLocks, &proclock->procLink,
--- 3157,3183 ----
      {
          LWLockId    partitionLock = FirstLockMgrLock + partition;
          SHM_QUEUE  *procLocks = &(MyProc->myProcLocks[partition]);
+         PROCLOCK   *nextplock;

!         /*
!          * If the proclock list for this partition is empty, we can skip
!          * acquiring the partition lock.  This optimization is safer than the
!          * situation in LockReleaseAll, because we got rid of any fast-path
!          * locks during AtPrepare_Locks, so there cannot be any case where
!          * another backend is adding something to our lists now.  For safety,
!          * though, we code this the same way as in LockReleaseAll.
!          */
!         if (SHMQueueNext(procLocks, procLocks,
!                          offsetof(PROCLOCK, procLink)) == NULL)
              continue;            /* needn't examine this partition */

          LWLockAcquire(partitionLock, LW_EXCLUSIVE);

!         for (proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
!                                                offsetof(PROCLOCK, procLink));
!              proclock;
!              proclock = nextplock)
          {
              /* Get link first, since we may unlink/relink this proclock */
              nextplock = (PROCLOCK *)
                  SHMQueueNext(procLocks, &proclock->procLink,
*************** PostPrepare_Locks(TransactionId xid)
*** 3166,3172 ****

              /* Ignore VXID locks */
              if (lock->tag.locktag_type == LOCKTAG_VIRTUALTRANSACTION)
!                 goto next_item;

              PROCLOCK_PRINT("PostPrepare_Locks", proclock);
              LOCK_PRINT("PostPrepare_Locks", lock, 0);
--- 3189,3195 ----

              /* Ignore VXID locks */
              if (lock->tag.locktag_type == LOCKTAG_VIRTUALTRANSACTION)
!                 continue;

              PROCLOCK_PRINT("PostPrepare_Locks", proclock);
              LOCK_PRINT("PostPrepare_Locks", lock, 0);
*************** PostPrepare_Locks(TransactionId xid)
*** 3177,3183 ****

              /* Ignore it if nothing to release (must be a session lock) */
              if (proclock->releaseMask == 0)
!                 goto next_item;

              /* Else we should be releasing all locks */
              if (proclock->releaseMask != proclock->holdMask)
--- 3200,3206 ----

              /* Ignore it if nothing to release (must be a session lock) */
              if (proclock->releaseMask == 0)
!                 continue;

              /* Else we should be releasing all locks */
              if (proclock->releaseMask != proclock->holdMask)
*************** PostPrepare_Locks(TransactionId xid)
*** 3219,3227 ****
                                   &proclock->procLink);

              PROCLOCK_PRINT("PostPrepare_Locks: updated", proclock);
-
-     next_item:
-             proclock = nextplock;
          }                        /* loop over PROCLOCKs within this partition */

          LWLockRelease(partitionLock);
--- 3242,3247 ----

Re: Another bug introduced by fastpath patch

From: Robert Haas
Date:
On Wed, Nov 27, 2013 at 8:21 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> We could still do this if we were willing to actually reject requests
>> for session level locks on fast-path-eligible lock types.  At the moment
>> that costs us nothing really.  If anyone ever did have a use case, we
>> could consider adding the extra logic to support it.
>
> Nope, that *still* doesn't work, because in non-allLocks mode the main
> loop won't clear any locks that have been promoted from fastpath to
> regular.  Sigh.  For the moment I'm proposing that we just re-fetch
> the list header after acquiring the lock.  The attached patch is slightly
> more verbose than that, because I took the opportunity to reformulate the
> while() loop as a for() loop and thereby eliminate some goto's.

This approach looks good to me.  When I originally worked on this, I
struggled quite a bit with the idea that LockReleaseAll() might really
mean LockReleaseSome().  I thought I'd come up with an approach that
was adequate, but evidently not entirely.

In terms of making this more robust, one idea - along the lines you
mentioned in your original post - is to have a separate code path for
the case where we're releasing *all* locks.  Such a mode could simply
acquire backendLock, clear fpLockBits and fpVXIDLock, and release the
backendLock.  After that, we can just walk through all the PROCLOCK
lists (nothing can be getting added concurrently, since we no longer
have any fast-path locks for someone else to transfer) and nuke
everything we find there.  This code could be used not only in the
allLocks = true case but also whenever we don't hold any session
locks, which is nearly all the time.  (I'm here assuming that we have
some cheap way of knowing whether that's the case, but that shouldn't
be too hard to contrive.)
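
In sketch form, with the fast-path field names mentioned above and the
details hand-waved:

    /* Sketch of a release-everything path, not actual committed code */
    LWLockAcquire(MyProc->backendLock, LW_EXCLUSIVE);
    MyProc->fpLockBits = 0;            /* drop all fast-path relation locks */
    MyProc->fpVXIDLock = false;        /* and the fast-path VXID lock */
    LWLockRelease(MyProc->backendLock);

    /*
     * No fast-path locks remain for anyone to transfer, so our proclock
     * lists can no longer grow; drain each partition under its own lock.
     */
    for (partition = 0; partition < NUM_LOCK_PARTITIONS; partition++)
    {
        /* ... unlink and release every PROCLOCK found here ... */
    }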

Of course, that doesn't solve the problem of what to do when we do
hold session locks and want to release everything else.  I'm not sure
there's an easy option that's any better than what we have now.  The
problem of releasing all of our transaction-lifespan locks while
holding on to session-lifespan locks seems related to the problem of
releasing locks for an aborted subtransaction while holding onto the
parent's locks, but those are handled in completely different ways.
We could rip out the non-allLocks case from LockReleaseAll()
altogether and rely on retail lock release in all cases where we are
not releasing *everything*, but that would probably be significantly
worse in cases where we hold one session lifespan lock plus a large
number of transaction-lifespan locks.  That's probably a fairly rare
case, but I'm not sure the benefits in maintainability are enough to
justify such a change.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Another bug introduced by fastpath patch

From: Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> In terms of making this more robust, one idea - along the lines you
> mentioned in your original post - is to have a separate code path for
> the case where we're releasing *all* locks.

Yeah, I thought seriously about that, but concluded that it was too
debatable for a back-patch.  The potential robustness gain from having
guaranteed lock clearing might be offset by the potential for bugs
in one or the other of separate code paths.  Also, we're not going to
get very far unless we refactor LockReleaseAll so that we're not making
two separate DEFAULT_LOCKMETHOD and USER_LOCKMETHOD calls in every
transaction end.  (Or maybe we could fix things so that we remember
if we have any USER locks, and skip the second call if not?)

BTW, after further thought I can refine the argument why this patch is
needed.  The risk is that we might fetch the list pointer while someone
else is adding formerly-fastpath locks to that list.  Now, by the same
argument as before, any such just-added entries aren't ones that we need
to visit, so starting at the old list head isn't really a problem.
(Though it might become one if we ever switch to a different list
traversal technology here.)  The only remaining risk is that, if pointer
fetch/store isn't atomic, we might fetch a half-updated pointer; which
will be non-null, but not something we can use to reach the list.  Since
we do purport to support such architectures, we'd better apply the patch.
I'll change the comment a bit to mention this.
        regards, tom lane



Re: Another bug introduced by fastpath patch

From: Andres Freund
Date:
On 2013-11-28 10:31:21 -0500, Tom Lane wrote:
> The only remaining risk is that, if pointer
> fetch/store isn't atomic, we might fetch a half-updated pointer; which
> will be non-null, but not something we can use to reach the list.  Since
> we do purport to support such architectures, we'd better apply the patch.
> I'll change the comment a bit to mention this.

We do support such architectures? Don't we already assume we can store
xids atomically (cf. GetOldestActiveTransactionId())? Do we support a
64-bit arch that has an atomic 4-byte store, but not atomic 8-byte stores?

Greetings,

Andres Freund

--
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Another bug introduced by fastpath patch

From: Tom Lane
Date:
Andres Freund <andres@2ndquadrant.com> writes:
> On 2013-11-28 10:31:21 -0500, Tom Lane wrote:
>> The only remaining risk is that, if pointer
>> fetch/store isn't atomic, we might fetch a half-updated pointer; which
>> will be non-null, but not something we can use to reach the list.  Since
>> we do purport to support such architectures, we'd better apply the patch.

> We do support such architectures? Don't we already assume we can store
> xids atomically (cf. GetOldestActiveTransactionId())? Do we support a
> 64-bit arch that has an atomic 4-byte store, but not atomic 8-byte stores?

Dunno whether there are any in practice, but it's not an assumption
we make anywhere.
        regards, tom lane