Re: Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl

From: Tom Lane
Subject: Re: Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl
Date:
Msg-id: 13467.1456076736@sss.pgh.pa.us
In reply to: Re: Re: [COMMITTERS] pgsql: Introduce group locking to prevent parallel processes from deadl  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
I wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Yeah, you're right.  Attached is a draft patch that tries to clean up
>> that and a bunch of other things that you raised.

> I've been reviewing this patch, and I had a question: why do we need to
> bother with a lockGroupLeaderIdentifier field?  It seems totally redundant
> with the leader's pid field, ie, why doesn't BecomeLockGroupMember simply
> compare leader->pid with the PID it's passed?  For some more safety, it
> could also verify that leader->lockGroupLeader == leader; but I don't
> see what the extra field is buying us.

Here is a proposed patch that gets rid of lockGroupLeaderIdentifier.

I concluded that the "leader->lockGroupLeader == leader" test is
necessary, because we don't ever reset the pid field of a dead PGPROC
until it's re-used.  However, with that check, this seems isomorphic to
the existing code because lockGroupLeader is reset to NULL at all the same
places that lockGroupLeaderIdentifier is reset.  So we do not need both of
those fields.
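
To make the argument concrete, here is a small self-contained C sketch (not
PostgreSQL code; the struct and function names are invented for illustration)
modeling the two-part check that the patch below installs in
BecomeLockGroupMember: a dead leader's pid stays in its PGPROC slot until the
slot is reused, but its group-leader pointer is cleared when the group
empties, so testing both conditions rejects a leader that has already exited.

    #include <stdbool.h>
    #include <stdio.h>
    #include <sys/types.h>

    /* Simplified stand-in for a PGPROC slot; field names are illustrative. */
    typedef struct ProcSlot
    {
        pid_t            pid;     /* not cleared at process exit, only on reuse */
        struct ProcSlot *leader;  /* points to self while acting as group leader;
                                   * reset to NULL when the group becomes empty */
    } ProcSlot;

    /* Model of the join check: succeed only if the slot still belongs to the
     * process we were told about *and* that process is still a group leader. */
    static bool
    try_join_group(ProcSlot *leader, pid_t expected_pid)
    {
        return leader->pid == expected_pid && leader->leader == leader;
    }

    int
    main(void)
    {
        ProcSlot slot = { .pid = 12345, .leader = &slot };

        printf("live leader: %d\n", try_join_group(&slot, 12345));  /* prints 1 */

        /* Leader exits: the group-leader pointer is cleared, the pid is not. */
        slot.leader = NULL;
        printf("dead leader: %d\n", try_join_group(&slot, 12345));  /* prints 0 */
        return 0;
    }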

            regards, tom lane

diff --git a/src/backend/storage/lmgr/README b/src/backend/storage/lmgr/README
index c399618..56b0a12 100644
*** a/src/backend/storage/lmgr/README
--- b/src/backend/storage/lmgr/README
*************** those cases so that they no longer use h
*** 650,676 ****
  (which is not a crazy idea, given that such lock acquisitions are not expected
  to deadlock and that heavyweight lock acquisition is fairly slow anyway).

! Group locking adds four new members to each PGPROC: lockGroupLeaderIdentifier,
! lockGroupLeader, lockGroupMembers, and lockGroupLink. The first is simply a
! safety mechanism. A newly started parallel worker has to try to join the
! leader's lock group, but it has no guarantee that the group leader is still
! alive by the time it gets started. We try to ensure that the parallel leader
! dies after all workers in normal cases, but also that the system could survive
! relatively intact if that somehow fails to happen. This is one of the
! precautions against such a scenario: the leader relays its PGPROC and also its
! PID to the worker, and the worker fails to join the lock group unless the
! given PGPROC still has the same PID. We assume that PIDs are not recycled
! quickly enough for this interlock to fail.
!
! A PGPROC's lockGroupLeader is NULL for processes not involved in parallel
! query. When a process wants to cooperate with parallel workers, it becomes a
! lock group leader, which means setting this field to point to its own
! PGPROC. When a parallel worker starts up, it points this field at the leader,
! with the above-mentioned interlock. The lockGroupMembers field is only used in
  the leader; it is a list of the member PGPROCs of the lock group (the leader
  and all workers). The lockGroupLink field is the list link for this list.

! All four of these fields are considered to be protected by a lock manager
  partition lock.  The partition lock that protects these fields within a given
  lock group is chosen by taking the leader's pgprocno modulo the number of lock
  manager partitions.  This unusual arrangement has a major advantage: the
--- 650,665 ----
  (which is not a crazy idea, given that such lock acquisitions are not expected
  to deadlock and that heavyweight lock acquisition is fairly slow anyway).

! Group locking adds three new members to each PGPROC: lockGroupLeader,
! lockGroupMembers, and lockGroupLink. A PGPROC's lockGroupLeader is NULL for
! processes not involved in parallel query. When a process wants to cooperate
! with parallel workers, it becomes a lock group leader, which means setting
! this field to point to its own PGPROC. When a parallel worker starts up, it
! points this field at the leader. The lockGroupMembers field is only used in
  the leader; it is a list of the member PGPROCs of the lock group (the leader
  and all workers). The lockGroupLink field is the list link for this list.

! All three of these fields are considered to be protected by a lock manager
  partition lock.  The partition lock that protects these fields within a given
  lock group is chosen by taking the leader's pgprocno modulo the number of lock
  manager partitions.  This unusual arrangement has a major advantage: the
*************** change while the deadlock detector is ru
*** 679,684 ****
--- 668,685 ----
  all the lock manager locks.  Also, holding this single lock allows safe
  manipulation of the lockGroupMembers list for the lock group.

+ We need an additional interlock when setting these fields, because a newly
+ started parallel worker has to try to join the leader's lock group, but it
+ has no guarantee that the group leader is still alive by the time it gets
+ started.  We try to ensure that the parallel leader dies after all workers
+ in normal cases, but also that the system could survive relatively intact
+ if that somehow fails to happen.  This is one of the precautions against
+ such a scenario: the leader relays its PGPROC and also its PID to the
+ worker, and the worker fails to join the lock group unless the given PGPROC
+ still has the same PID and is still a lock group leader.  We assume that
+ PIDs are not recycled quickly enough for this interlock to fail.
+
+
  User Locks (Advisory Locks)
  ---------------------------

diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 114fba0..00787d4 100644
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
*************** InitProcess(void)
*** 401,407 ****
      pg_atomic_init_u32(&MyProc->procArrayGroupNext, INVALID_PGPROCNO);

      /* Check that group locking fields are in a proper initial state. */
-     Assert(MyProc->lockGroupLeaderIdentifier == 0);
      Assert(MyProc->lockGroupLeader == NULL);
      Assert(dlist_is_empty(&MyProc->lockGroupMembers));

--- 401,406 ----
*************** InitAuxiliaryProcess(void)
*** 565,571 ****
      SwitchToSharedLatch();

      /* Check that group locking fields are in a proper initial state. */
-     Assert(MyProc->lockGroupLeaderIdentifier == 0);
      Assert(MyProc->lockGroupLeader == NULL);
      Assert(dlist_is_empty(&MyProc->lockGroupMembers));

--- 564,569 ----
*************** ProcKill(int code, Datum arg)
*** 822,828 ****
          dlist_delete(&MyProc->lockGroupLink);
          if (dlist_is_empty(&leader->lockGroupMembers))
          {
-             leader->lockGroupLeaderIdentifier = 0;
              leader->lockGroupLeader = NULL;
              if (leader != MyProc)
              {
--- 820,825 ----
*************** BecomeLockGroupLeader(void)
*** 1771,1777 ****
      leader_lwlock = LockHashPartitionLockByProc(MyProc);
      LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);
      MyProc->lockGroupLeader = MyProc;
-     MyProc->lockGroupLeaderIdentifier = MyProcPid;
      dlist_push_head(&MyProc->lockGroupMembers, &MyProc->lockGroupLink);
      LWLockRelease(leader_lwlock);
  }
--- 1768,1773 ----
*************** BecomeLockGroupMember(PGPROC *leader, in
*** 1795,1800 ****
--- 1791,1799 ----
      /* Group leader can't become member of group */
      Assert(MyProc != leader);

+     /* Can't already be a member of a group */
+     Assert(MyProc->lockGroupLeader == NULL);
+
      /* PID must be valid. */
      Assert(pid != 0);

*************** BecomeLockGroupMember(PGPROC *leader, in
*** 1809,1815 ****
      LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);

      /* Try to join the group */
!     if (leader->lockGroupLeaderIdentifier == pid)
      {
          ok = true;
          MyProc->lockGroupLeader = leader;
--- 1808,1814 ----
      LWLockAcquire(leader_lwlock, LW_EXCLUSIVE);

      /* Try to join the group */
!     if (leader->pid == pid && leader->lockGroupLeader == leader)
      {
          ok = true;
          MyProc->lockGroupLeader = leader;
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index a9405ce..dbcdd3f 100644
*** a/src/include/storage/proc.h
--- b/src/include/storage/proc.h
*************** struct PGPROC
*** 166,172 ****
       * Support for lock groups.  Use LockHashPartitionLockByProc on the group
       * leader to get the LWLock protecting these fields.
       */
-     int            lockGroupLeaderIdentifier;    /* MyProcPid, if I'm a leader */
      PGPROC       *lockGroupLeader;    /* lock group leader, if I'm a member */
      dlist_head    lockGroupMembers;    /* list of members, if I'm a leader */
      dlist_node  lockGroupLink;        /* my member link, if I'm a member */
--- 166,171 ----
