Re: the s_lock_stuck on perform_spin_delay

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: the s_lock_stuck on perform_spin_delay
Дата
Msg-id 87cyttwbin.fsf@163.com
обсуждение исходный текст
Ответ на Re: the s_lock_stuck on perform_spin_delay  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: the s_lock_stuck on perform_spin_delay  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:

> On Mon, Jan 22, 2024 at 2:22 AM Andy Fan <zhihuifan1213@163.com> wrote:
>> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a
>> quickdie, however this is bad not only because it doesn't work on
>> Windows, but also it has too poor performance even it impacts on
>> USE_ASSERT_CHECKING build only. In v8, I introduced a new global
>> variable quickDieInProgress to handle this.
>
> OK, I like the split between 0001 and 0002. I still think 0001 has
> cosmetic problems, but if some committer wants to take it forward,
> they can decide what to do about that; you and I going back and forth
> doesn't seem like the right approach to sorting that out. Whether or
> not 0002 is adopted might affect what we do about the cosmetics in
> 0001, too.

Replacing ASSERT_NO_SPIN_LOCK with VerifyNoSpinLocksHeld or adding more
comments to each call of VerifyNoSpinLocksHeld really makes me happy
since they make things more cosmetic and practical. So I'd be absolutely
willing to do more stuff like this. Thanks for such suggestions!

Then I can't understand the left cosmetic problems... since you are
saying it may related to 0002, I guess you are talking about the naming
of START_SPIN_LOCK and END_SPIN_LOCK?

>
> 0003 seems ... unfortunate. It seems like an admission that 0001 is
> wrong.

Yes, that's what I was thinking. I doubted if I should merge 0003 to
0001 directly during this discussion, and finally I made it separate for
easier dicussion.

> Surely it *isn't* right to ignore the spinlock restrictions in
> quickdie() in general. For example, we could self-deadlock if we try
> to acquire a spinlock we already hold. If the problem here is merely
> the call in errstart() then maybe we need to rethink that particular
> call. If it goes any deeper than that, maybe we've got actual bugs we
> need to fix.

I get your point! Acquiring an already held spinlock in quickdie is
unlikely to happen, but since our existing infrastructure can handle it,
then there is no reason to bypass it. Since the problem here is just
errstart, we can do a if(!quickDieInProgress) VerifyNoSpinLocksHeld();
in errstart only. Another place besides the errstart is the
CHECK_FOR_INTERRUPTS in errfinish. I think we can add the same check for
the VerifyNoSpinLocksHeld in CHECK_FOR_INTERRUPTS.

>
> +        * It's likely to check the BlockSig to know if it is doing a quickdie
> +        * with sigismember, but it is too expensive in test, so introduce
> +        * quickDieInProgress to avoid that.
>
> This isn't very good English -- I realize that can sometimes be hard
> -- but also -- I don't think it likely that a future hacker would
> wonder why this isn't done that way. A static variable is normal for
> PostgreSQL; checking the signal mask would be a completely novel
> approach. So I think this comment is missing the mark topically.

I was wrong to think sigismember is a syscall, but now I see it is just
a function in glibc. Even I can't get the source code of it, I think it
should just be some bit-operation based on the definition of __sigset_t.

Another badness of sigismember is it is not avaiable in windows. It is
still unclear to me why sigaddset in quickdie can
compile in windows.  (I have a sigismember version and get a linker
error at windows). This should be a blocker for me to submit the next
version of patch.

> If
> this patch is right at all, the comment here should focus on why
> disabling these checks in quickdie() is necessary and appropriate, not
> why it's coded to match everything else in the system.

I agree, and I think the patch 0003 is not right at all:(

--
Best Regards
Andy Fan




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: psql: Allow editing query results with \gedit
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: remaining sql/json patches