Re: Interrupts vs signals
От | Thomas Munro |
---|---|
Тема | Re: Interrupts vs signals |
Дата | |
Msg-id | CA+hUKGLseL7e7EFMZZECzCijjov76PcXEGYy2swXa8jfnuV=2A@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Interrupts vs signals (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: Interrupts vs signals
(Robert Haas <robertmhaas@gmail.com>)
|
Список | pgsql-hackers |
On Thu, Oct 21, 2021 at 8:27 AM Andres Freund <andres@anarazel.de> wrote: > On 2021-10-21 07:55:54 +1300, Thomas Munro wrote: > > I wonder if we really need signals to implement interrupts. Given > > that they are non-preemptive/cooperative (work happens at the next > > CFI()), why not just use shared memory flags and latches? That skips > > a bunch of code, global variables and scary warnings about programming > > in signal handlers. > > Depending on how you implement them, one difference could be whether / when > "slow" system calls (recv, poll, etc) are interrupted. Hopefully by now all such waits are implemented with latch.c facilities? > Another is that that signal handling provides a memory barrier in the > receiving process. For things that rarely change (like most interrupts), it > can be more efficient to move the cost of that out-of-line, instead of > incurring them on every check. Agreed, but in this experiment I was trying out the idea that a memory barrier is not really needed at all, unless you're about to go to sleep. We already insert one of those before a latch wait. That is, if we see !set->latch->is_set, we do pg_memory_barrier() and check again, before sleeping, so your next CFI must see the flag. For computation loops (sort, hash, query execution, ...), I speculate that a relaxed read of memory is fine... you'll see the flag pretty soon, and you certainly won't be allowed to finish your computation and go to sleep. > One nice thing of putting the state variables into shared memory would be that > that'd allow to see the pending interrupts of other backends for debugging > purposes. +1 > > One idea is to convert those into "procsignals" too, for > > consistency. In the attached, they can be set (ie by the same > > backend) with ProcSignalRaise(), but it's possible that in future we > > might have a reason for another backend to set them too, so it seems > > like a good idea to have a single system, effectively merging the > > concepts of "procsignals" and "interrupts". > > This seems a bit confusing to me. For one, we need to have interrupts working > before we have a proc, IIRC. But leaving details like that aside, it just > seems a bit backwards to me. I'm on board with other backends directly setting > interrupt flags, but it seems to me that the procsignal stuff should be > "client" of the process-local interrupt infrastructure, rather than the other > way round. Hmm. Yeah, I see your point. But I can also think of some arguments for merging the concepts of local and shared interrupts; see below. In this new sketch, I tried doing it the other way around. That is, completely removing the concept of "ProcSignal", leaving only "Interrupts". Initially, MyPendingInterrupts points to something private, and once you're connected to shared memory it points to MyProc->pending_interrupts. > > +bool > > +ProcSignalAnyPending(void) > > +{ > > + volatile ProcSignalSlot *slot = MyProcSignalSlot; > > > > - if (CheckProcSignal(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN)) > > - RecoveryConflictInterrupt(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN); > > + /* XXX make this static inline? */ > > + /* XXX point to a dummy entry instead of using NULL to avoid a branch */ > > + return slot && slot->pss_signaled; > > +} > > ISTM it might be easier to make this stuff efficiently race-free if we made > this a count of pending operations. Hmm, with a unified interrupt system and a bitmap it's not necessary to have a separate flag/counter at all. > > @@ -3131,12 +3124,13 @@ ProcessInterrupts(void) > > /* OK to accept any interrupts now? */ > > if (InterruptHoldoffCount != 0 || CritSectionCount != 0) > > return; > > - InterruptPending = false; > > + ProcSignalClearAnyPending(); > > + > > + pg_read_barrier(); > > > > - if (ProcDiePending) > > + if (ProcSignalConsume(PROCSIG_DIE)) > > { > > I think making all of these checks into function calls isn't great. How about > making the set of pending signals a bitmask? That'd allow to efficiently check > a bunch of interrupts together and even where not, it'd just be a single test > of the mask, likely already in a register. +1. Some assorted notes: 1. Aside from doing interrupts in this new way, I also have the postmaster setting latches (!) instead of sending ad hoc SIGUSR1 here and there. My main reason for doing that was to be able to chase out all reasons to register a SIGUSR1 handler, so I could prove that check-world passes. I like it, though. But maybe it's really a separate topic. 2. I moved this stuff into interrupt.{h,c}. There is nothing left in procsignal.c except code relating to ProcSignalBarrier. I guess that thing could use another name, anyway. It's a ... SystemInterruptBarrier? 3. Child-level SIGINT and SIGTERM handlers probably aren't really necessary, either: maybe the sender could do InterruptSend(INTERRUPT_{DIE,QUERY_CANCEL}, pgprocno) instead? But perhaps people are attached to being able to send those signals from external programs directly to backends. 4. Like the above, a SIGALRM handler might need to do eg InterruptRaise(INTERRUPT_STATEMENT_TIMEOUT). That's a problem for systems using spinlocks (self-deadlock against user context in InterruptRaise()), so I'd need to come up with some flag protocol for dinosaurs to make that safe, OR revert to having these "local only" interrupts done with separate flags, as you were getting at earlier. 5. The reason I prefer to put currently "local only" interrupts into the same atomic system is that I speculate that ultimately all of the backend-level signal handlers won't be needed. They all fall into three categories: (1) could be replaced with these interrupts directly, (2) could be replaced by the new timer infrastructure that multithreaded postgres would need to have to deliver interrupts to the right recipients, (3) are quickdie and can be handled at the containing process level. Then the only signal handlers left are top level external ones. But perhaps you're right and I should try reintroducing separate local interrupts for now. I dunno, I like the simplicity of the unified system; if only it weren't for those spinlock-backed atomics.
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Masahiko SawadaДата:
Сообщение: Re: Parallel vacuum workers prevent the oldest xmin from advancing