Re: Resetting crash time of background worker

From Robert Haas
Subject Re: Resetting crash time of background worker
Date
Msg-id CA+TgmoYyV1Xf+D86KUev_buUrPFdY3UxePtiZ+ijSbQ-mwzoUA@mail.gmail.com
In response to Resetting crash time of background worker  (Amit Khandekar <amitdkhan.pg@gmail.com>)
Responses Re: Resetting crash time of background worker  (Amit Khandekar <amitdkhan.pg@gmail.com>)
List pgsql-hackers
On Tue, Mar 17, 2015 at 1:33 AM, Amit Khandekar <amitdkhan.pg@gmail.com> wrote:
> When the postmaster recovers from a backend or worker crash, it resets bg
> worker's crash time (rw->rw_crashed_at) so that the bgworker will
> immediately restart (ResetBackgroundWorkerCrashTimes).
>
> But resetting rw->rw_crashed_at to 0 means that we have lost the information
> that the bgworker had actually crashed. So later when postmaster tries to
> find any workers that should start (maybe_start_bgworker), it treats this
> worker as a new worker, as against treating it as one that had crashed and
> is to be restarted. So for this bgworker, it does not consider
> BGW_NEVER_RESTART :
>
> if (rw->rw_crashed_at != 0)
> {
>     if (rw->rw_worker.bgw_restart_time == BGW_NEVER_RESTART)
>     {
>         ForgetBackgroundWorker(&iter);
>         continue;
>     }
>     ....
>     ....
>
> That means, it will not remove the worker, and it will be restarted. Now if
> the worker again crashes, postmaster would keep on repeating the crash and
> restart cycle for the whole system.
>
> From what I understand, BGW_NEVER_RESTART applies even to a crashed server.
> But let me know if I am missing anything.
>
> I think we either have to retain the knowledge that the worker has crashed
> using some new field, or else, we should reset the crash time only if it is
> not flagged BGW_NEVER_RESTART.

I think you're right, and I think we should do the second of those.
Thanks for tracking this down.
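
For illustration only, the second option could look roughly like the sketch below. This is not the committed patch and does not use the real postmaster data structures: SketchWorker, sketch_reset_crash_times and sketch_should_forget are hypothetical stand-ins for RegisteredBgWorker, ResetBackgroundWorkerCrashTimes and the check quoted above from maybe_start_bgworker. Only the field names and BGW_NEVER_RESTART are taken from the discussion.

```c
#include <stddef.h>
#include <stdint.h>

/* Same value as in PostgreSQL's bgworker.h; the rest is a simplified model. */
#define BGW_NEVER_RESTART (-1)

/* Hypothetical, stripped-down stand-in for RegisteredBgWorker. */
typedef struct SketchWorker
{
    int     bgw_restart_time;   /* seconds, or BGW_NEVER_RESTART */
    int64_t rw_crashed_at;      /* 0 means "has not crashed" */
} SketchWorker;

/*
 * Model of ResetBackgroundWorkerCrashTimes(): after crash recovery, clear
 * the crash time so that restartable workers start immediately, but keep
 * it for BGW_NEVER_RESTART workers so the later check can still see that
 * they crashed.
 */
static void
sketch_reset_crash_times(SketchWorker *workers, size_t n)
{
    for (size_t i = 0; i < n; i++)
    {
        if (workers[i].bgw_restart_time != BGW_NEVER_RESTART)
            workers[i].rw_crashed_at = 0;
    }
}

/*
 * Model of the check in maybe_start_bgworker(): returns 1 when the worker
 * has crashed and must never be restarted, i.e. it should be forgotten.
 */
static int
sketch_should_forget(const SketchWorker *w)
{
    return w->rw_crashed_at != 0 &&
           w->bgw_restart_time == BGW_NEVER_RESTART;
}
```

With this shape, a crashed BGW_NEVER_RESTART worker keeps its nonzero crash time across the reset, so the existing check still removes it instead of starting it again.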

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


