Re: the s_lock_stuck on perform_spin_delay
| От | Andy Fan |
|---|---|
| Тема | Re: the s_lock_stuck on perform_spin_delay |
| Дата | |
| Msg-id | 87sf37mdqq.fsf@163.com обсуждение исходный текст |
| Ответ на | Re: the s_lock_stuck on perform_spin_delay (Robert Haas <robertmhaas@gmail.com>) |
| Ответы |
Re: the s_lock_stuck on perform_spin_delay
|
| Список | pgsql-hackers |
Hi,
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Jan 7, 2024 at 9:52 PM Andy Fan <zhihuifan1213@163.com> wrote:
>> > I think we should add cassert-only infrastructure tracking whether we
>> > currently hold spinlocks, are in a signal handler and perhaps a few other
>> > states. That'd allow us to add assertions like:
>> ..
>> > - no lwlocks or ... while in signal handlers
>>
>> I *wish* lwlocks should *not* be held while in signal handlers since it
>> inspired me for a direction of a low-frequency internal bug where a
>> backend acuqire a LWLock when it has acuqired it before. However when I
>> read more document and code, I am thinking this should not be a
>> problem.
>
> It's not safe to acquire an LWLock in a signal handler unless we know
> that the code that the signal handler is interrupting can't already be
> doing that. Consider this code from LWLockAcquire:
Thanks for the explaination! I can follow the sistuation you descirbe
here, then I found I asked a bad question because I didn't clarify what
"signal handlers" I was refering to, sorry about that!
In your statement, I guess you are talking about the signal handler from
Linux. However I *assumed* such handlers are doing pretty similar stuff
like set a 'GlobalVarialbe=true'. If my assumption was right, I think
that should not be take cared. For example:
spin_or_lwlock_acquire();
... (linux signal handler may be invovked here no matther what ... code is)
spin_or_lwlock_relase()
Since the linux signal hander are pretty simply, so it can come back to
'spin_or_lwlock_relase' anyway. (However my assumption may be wrong and
thanks for highlight this, and it is helpful for me to debug my internal
bug!)
The singler handler I was refering to is 'CHECK_FOR_INTERRUPTS', Based
on this, spin_lock and lwlock are acted pretty differently.
spin_lock_acuqire();
CHECK_FOR_INTERRUPT();
spin_lock_release();
Since CHECK_FOR_INTERRUPT usually goes to the ERROR system which makes it
is hard to go back to 'spin_lock_release()', then spin lock leaks! so
CHECK_FOR_INTERRUPT is the place I Assert *spin lock* should not be
handled in my patch. and I understood what Andres was talking about is
the same thing. (Of course I can add the "Assert no spin lock is held"
into every linux single handler as well).
Based on the above, I asked my question in my previous post, where I am
not sure if we should do the same('Assert no-lwlock should be held') for
*lwlock* in CHECK_FOR_INTERRUPT since lwlocks can be released no matter
where CHECK_FOR_INTERRUPT jump to.
--
Best Regards
Andy Fan
В списке pgsql-hackers по дате отправления: