Re: Logical replication launcher did not automatically restart when got SIGKILL

From shveta malik
Subject Re: Logical replication launcher did not automatically restart when got SIGKILL
Msg-id CAJpy0uCq9bRCrK8eg6TXryQqNY+h3j61Xf11kXjH_V0rs2266Q@mail.gmail.com
In response to Re: Logical replication launcher did not automatically restart when got SIGKILL  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Logical replication launcher did not automatically restart when got SIGKILL
List pgsql-hackers
On Thu, Jul 24, 2025 at 2:39 PM Fujii Masao <masao.fujii@gmail.com> wrote:
>
> On Thu, Jul 17, 2025 at 6:58 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jul 16, 2025 at 8:51 AM cca5507 <cca5507@qq.com> wrote:
> > >
> > > Hi,
> > >
> > > > The v1-0002 in [1] will call ReportBackgroundWorkerExit() which will send SIGUSR1 to 'bgw_notify_pid', but it may
> > > > already exit in HandleChildCrash(), is this ok?
> > >
> >
> > Shall ReportBackgroundWorkerExit() be skipped for 'crashed' background worker?
> >
> > If we look at code prior to commit 28a520c0b77, there we were setting
> > 'rw_crashed_at' in CleanupBackgroundWorker() and then
> > HandleChildCrash() was resetting the pid and exiting with no
> > additional processing.
>
> It seems we don't need to set rw_crashed_at in crash cases,
> since it's always reset to 0 by ResetBackgroundWorkerCrashTimes()
> in restart-after-crash code.

Yes, that seems to be the case.

> So, the only additional step we need may be
> resetting rw_pid to 0.
>

I agree.

> Instead of modifying CleanupBackend() to do this, another option
> could be to reset rw_pid during restart-after-crash code, for example,
> inside ResetBackgroundWorkerCrashTimes(). Thought?
>

Sounds reasonable.
Thinking out loud, when cleaning up after a backend or background
worker crash, process_pm_child_exit() is invoked, which subsequently
calls both CleanupBackend() and HandleChildCrash(). After the cleanup
completes, process_pm_child_exit() calls PostmasterStateMachine() to
move to the next state. As part of that, PostmasterStateMachine()
invokes ResetBackgroundWorkerCrashTimes() (only in crash
scenarios/FatalError), to reset a few things. Since it also resets
rw_worker.bgw_notify_pid, it seems reasonable to reset the rw_pid as
well there.
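
To make the idea concrete, here is a minimal, self-contained sketch of what the reset step could look like. It is not the actual PostgreSQL source: SketchBgWorker and sketch_reset_after_crash() are hypothetical stand-ins for RegisteredBgWorker and ResetBackgroundWorkerCrashTimes(), a plain array replaces the real worker list, and any other handling the real function performs is left out. Only the three field resets discussed in this thread are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>

/*
 * Hypothetical, simplified stand-in for the postmaster's per-worker
 * bookkeeping. Only the fields discussed in this thread are modeled.
 */
typedef struct SketchBgWorker
{
    pid_t   bgw_notify_pid;  /* mirrors rw_worker.bgw_notify_pid */
    pid_t   rw_pid;          /* 0 means the worker is not running */
    time_t  rw_crashed_at;   /* 0 means no crash is recorded */
} SketchBgWorker;

/*
 * Sketch of the restart-after-crash reset. The first two resets are the
 * ones described in the thread as already happening; clearing rw_pid is
 * the proposed addition, so that a worker killed during the crash is no
 * longer recorded as running.
 */
static void
sketch_reset_after_crash(SketchBgWorker *workers, size_t nworkers)
{
    assert(workers != NULL || nworkers == 0);

    for (size_t i = 0; i < nworkers; i++)
    {
        workers[i].rw_crashed_at = 0;   /* allow an immediate restart */
        workers[i].bgw_notify_pid = 0;  /* the notify target is gone too */
        workers[i].rw_pid = 0;          /* proposed: forget the stale pid */
    }
}
```

The point of doing it here rather than in CleanupBackend() is that all post-crash bookkeeping for background workers stays in one place, and the normal (non-crash) exit path is left untouched.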

thanks
Shveta


