Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!)
From | Simon Riggs |
---|---|
Subject | Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!) |
Date | |
Msg-id | 1283705909.1834.6273.camel@ebony |
In reply to | Re: Interruptible sleeps (was Re: CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!) (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Fri, 2010-09-03 at 18:24 -0400, Tom Lane wrote:

> Now the HS case likewise appears to be set up so that the signal can
> only directly interrupt ProcWaitForSignal, so I think the core issue is
> whether any deadlock situations are possible. Given that this gets
> called from a low-level place like LockBufferForCleanup, I don't feel
> too comfortable about that.

LockBufferForCleanup is only ever called during recovery by heap_xlog_clean() or btree_xlog_vacuum(). The actions taken to replay a WAL record are independent of all other WAL records from a locking perspective, so replay of every WAL record starts with no LWLocks held by the startup process. LockBufferForCleanup is called early in the replay of a heap or btree cleanup record, so we can easily check that no other LWLocks are held while it is called.

> I certainly haven't seen any analysis or
> documentation of what locks can safely be held at that point.
> The deadlock checker only tries to take the LockMgr LWLocks, so
> extrapolating from whether it is safe to whether touching the
> ProcArrayLock is safe seems entirely unfounded.

So the startup process takes one LWLock, ProcArrayLock, and is not holding any other LWLock when it does. The deadlock checker attempts to get and hold all of the other lock partition locks. So the deadlock checker already does the thing you're saying might be dangerous, and the startup process doesn't.

The ProcArrayLock is only taken as a way of signaling other backends. If that is particularly unsafe we could redesign that aspect.

> It might be worth pointing out here that LockBufferForCleanup is already
> known to be a risk factor for undetected deadlocks, even without HS in
> the picture, because of the possibility of deadlocks involving a chain
> of both heavyweight locks and LWLocks. Whether HS makes it materially
> worse may be something that we need field experience to determine.

You may be right that it will be a problem.

The deadlock risk we're protecting against is a deadlock involving both normal locks and buffer pins. We're safer having this code than not having it, IMHO.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services
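
The locking invariant argued for in this message (replay of each record starts with no LWLocks held, and only one lock is taken briefly to signal other backends) can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyProc and every toy_* function are hypothetical names made up for illustration, not PostgreSQL APIs, and the buffer-pin wait itself is elided.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in PostgreSQL. */
#define TOY_MAX_HELD 8

typedef struct ToyProc
{
    const char *held[TOY_MAX_HELD]; /* names of lightweight locks held */
    size_t      nheld;              /* number of entries in held[] */
} ToyProc;

/* Record that the process now holds the named lock. */
static bool
toy_acquire(ToyProc *p, const char *name)
{
    if (p->nheld >= TOY_MAX_HELD)
        return false;
    p->held[p->nheld++] = name;
    return true;
}

/* Drop every lock the process holds. */
static void
toy_release_all(ToyProc *p)
{
    p->nheld = 0;
}

/* The entry check: waiting for a cleanup lock is allowed only
 * when no lightweight lock is already held. */
static bool
toy_can_wait_for_cleanup(const ToyProc *p)
{
    return p->nheld == 0;
}

/* Model of replaying one cleanup record. Returns true only if the
 * process entered with no locks held and held exactly one lock
 * while signaling other backends. */
static bool
toy_replay_cleanup_record(ToyProc *p)
{
    bool only_one;

    if (!toy_can_wait_for_cleanup(p))
        return false;           /* invariant violated at entry */

    /* ... the wait for conflicting buffer pins would happen here ... */

    /* Assumed signaling step: take a single lock, nothing else. */
    if (!toy_acquire(p, "ProcArrayLock"))
        return false;
    only_one = (p->nheld == 1);
    toy_release_all(p);
    return only_one;
}
```

Calling toy_replay_cleanup_record() on a zero-initialized ToyProc succeeds, while calling it after toy_acquire() has recorded some other lock fails the entry check, which is the kind of easy check the message refers to.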