Re: LWLock deadlock and gdb advice

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: LWLock deadlock and gdb advice
Дата
Msg-id 55B90C0C.6020104@iki.fi
обсуждение исходный текст
Ответ на Re: LWLock deadlock and gdb advice  (Andres Freund <andres@anarazel.de>)
Ответы Re: LWLock deadlock and gdb advice  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On 07/29/2015 04:10 PM, Andres Freund wrote:
> On 2015-07-29 14:22:23 +0200, Andres Freund wrote:
>> On 2015-07-29 15:14:23 +0300, Heikki Linnakangas wrote:
>>> Ah, ok, that should work, as long as you also re-check the variable's value
>>> after queueing. Want to write the patch, or should I?
>>
>> I'll try. Shouldn't be too hard.
>
> What do you think about something roughly like the attached?

Hmm. Imagine this:

Backend A has called LWLockWaitForVar(X) on a lock, and is now waiting 
on it. The lock holder releases the lock, and wakes up A. But before A 
wakes up and sees that the lock is free, another backend acquires the 
lock again. It runs LWLockAcquireWithVar to the point just before 
setting the variable's value. Now A wakes up, sees that the lock is 
still (or again) held, and that the variable's value still matches the 
old one, and goes back to sleep. The new lock holder won't wake it up 
until it updates the value again, or to releases the lock.

I think that's OK given the current usage of this in WALInsertLocks, but 
it's scary. At the very least you need comments to explain that race 
condition. You didn't like the new LW_FLAG_VAR_SET flag and the API 
changes I proposed? That eliminates the race condition.

Either way, there is a race condition that if the new lock holder sets 
the variable to the *same* value as before, the old waiter won't 
necessarily wake up even though the lock was free or had a different 
value in between. But that's easier to explain and understand than the 
fact that the value set by LWLockAcquireWithVar() might not be seen by a 
waiter, until you release the lock or update it again.

Another idea is to have LWLockAcquire check the wait queue after setting 
the variable, and wake up anyone who's already queued up. You could just 
call LWLockUpdateVar() from LWLockAcquireCommon to set the variable. I 
would prefer the approach from my previous patch more, though. That 
would avoid having to acquire the spinlock in LWLockAcquire altogether, 
and I actually like the API change from that independently from any 
correctness issues.

The changes in LWLockWaitForVar() and LWLockConflictsWithVar() seem OK 
in principle. You'll want to change LWLockConflictsWithVar() so that the 
spinlock is held only over the "value = *valptr" line, though; the other 
stuff just modifies local variables and don't need to be protected by 
the spinlock. Also, when you enter LWLockWaitForVar(), you're checking 
if the lock is held twice in a row; first at the top of the function, 
and again inside LWLockConflictsWithVar. You could just remove the 
"quick test" at the top.

- Heikki




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: Reduce ProcArrayLock contention
Следующее
От: Robert Haas
Дата:
Сообщение: Re: upgrade failure from 9.5 to head