Re: Condition variable live lock

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Condition variable live lock
Дата
Msg-id 17054.1515126429@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Condition variable live lock  (Andres Freund <andres@anarazel.de>)
Ответы Re: Condition variable live lock  (Thomas Munro <thomas.munro@enterprisedb.com>)
Список pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2018-01-04 12:39:47 -0500, Robert Haas wrote:
>>> Given that the proclist_contains() checks in condition_variable.c are
>>> already racy, I think it might be feasible to collect all procnos to
>>> signal while holding the spinlock, and then signal all of them in one
>>> go.

>> That doesn't seem very nice at all.  Not only does it violate the
>> coding rule against looping while holding a spinlock, but it seems
>> that it would require allocating memory while holding one, which is a
>> non-starter.

> We could just use a sufficiently sized buffer beforehand. There's an
> obvious upper boundary, so that shouldn't be a big issue.

I share Robert's discomfort with that solution, but it seems to me there
might be a better way.  The attached patch uses our own cvWaitLink as a
sentinel to detect when we've woken everybody who was on the wait list
before we arrived.  That gives exactly the desired semantics, not just an
approximation to them.

Now, the limitation with this is that we can't be waiting for any *other*
condition variable, because then we'd be trashing our state about that
variable.  As coded, we can't be waiting for the target CV either, but
that case could actually be handled if we needed to, as per the comment.
I do not know if this is likely to be a problematic limitation
... discuss.  (The patch does survive check-world, FWIW.)

            regards, tom lane

diff --git a/src/backend/storage/lmgr/condition_variable.c b/src/backend/storage/lmgr/condition_variable.c
index 41378d6..0d08eb5 100644
*** a/src/backend/storage/lmgr/condition_variable.c
--- b/src/backend/storage/lmgr/condition_variable.c
*************** int
*** 214,228 ****
  ConditionVariableBroadcast(ConditionVariable *cv)
  {
      int            nwoken = 0;

      /*
!      * Let's just do this the dumbest way possible.  We could try to dequeue
!      * all the sleepers at once to save spinlock cycles, but it's a bit hard
!      * to get that right in the face of possible sleep cancelations, and we
!      * don't want to loop holding the mutex.
       */
!     while (ConditionVariableSignal(cv))
          ++nwoken;

      return nwoken;
  }
--- 214,273 ----
  ConditionVariableBroadcast(ConditionVariable *cv)
  {
      int            nwoken = 0;
+     int            pgprocno = MyProc->pgprocno;

      /*
!      * In some use-cases, it is common for awakened processes to immediately
!      * re-queue themselves.  If we just naively try to reduce the wakeup list
!      * to empty, we'll get into a potentially-indefinite loop against such a
!      * process.  The semantics we really want are just to be sure that we have
!      * wakened all processes that were in the list at entry.  We can use our
!      * own cvWaitLink as a sentinel to detect when we've finished.
!      *
!      * A seeming flaw in this approach is that someone else might signal the
!      * CV and in doing so remove our sentinel entry.  But that's fine: since
!      * CV waiters are always added and removed in order, that must mean that
!      * every previous waiter has been wakened, so we're done.  We'd get an
!      * extra "set" on our latch, which is slightly inefficient but harmless.
!      *
!      * This coding assumes (and asserts) that we never broadcast on a CV that
!      * we are also waiting for.  If that turns out to be necessary, we could
!      * just remove our entry and insert it at the end, and it still works as a
!      * sentinel.
       */
!
!     /* We should not be attempting a CV wait. */
!     Assert(cv_sleep_target == NULL);
!
!     /* Establish the sentinel entry. */
!     SpinLockAcquire(&cv->mutex);
!     Assert(!proclist_contains(&cv->wakeup, pgprocno, cvWaitLink));
!     proclist_push_tail(&cv->wakeup, pgprocno, cvWaitLink);
!     SpinLockRelease(&cv->mutex);
!
!     for (;;)
!     {
!         bool        aminlist;
!         PGPROC       *proc = NULL;
!
!         /*
!          * Check if the sentinel is still there, and if so, remove the first
!          * queue entry (which might be the sentinel, or not).
!          */
!         SpinLockAcquire(&cv->mutex);
!         aminlist = proclist_contains(&cv->wakeup, pgprocno, cvWaitLink);
!         if (aminlist)
!             proc = proclist_pop_head_node(&cv->wakeup, cvWaitLink);
!         SpinLockRelease(&cv->mutex);
!
!         /* If we or anyone else removed the sentinel, we're done. */
!         if (!aminlist || proc == MyProc)
!             break;
!
!         /* We found someone sleeping, so set their latch to wake them up. */
!         SetLatch(&proc->procLatch);
          ++nwoken;
+     }

      return nwoken;
  }

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Stephen Frost
Дата:
Сообщение: Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means
Следующее
От: Haribabu Kommi
Дата:
Сообщение: Re: [HACKERS] Refactor handling of database attributes betweenpg_dump and pg_dumpall