Re: BUG #7643: Issuing a shutdown request while server startup leads to server hang

From: Hari Babu
Subject: Re: BUG #7643: Issuing a shutdown request while server startup leads to server hang
Msg-id: 005201cdc6d7$fb988f00$f2c9ad00$@kommi@huawei.com
In reply to: Re: BUG #7643: Issuing a shutdown request while server startup leads to server hang  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: BUG #7643: Issuing a shutdown request while server startup leads to server hang  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-bugs

>haribabu.kommi@huawei.com writes:
> Problem Reproduction:
> 1. Add recovery.conf to the database directory.
> 2. Start the server
> 3. Issue the shutdown request
> and the shutdown request timing should be such that below server logs
> should print.

> Log:

> ./postgres -D data -p 3335
> LOG:  database system was shut down in recovery at 2012-11-08 19:42:42 IST
> LOG:  entering standby mode
> LOG:  received fast shutdown request
> LOG:  consistent recovery state reached at 0/17D0700
> LOG:  record with zero length at 0/17D0700

> Problem reproduced in 9.3 head.

>After further investigation, I can't reproduce this and I don't believe
>your patch fixes it.  What happens when I try this is

>* postmaster gets SIGINT, sends SIGTERM to startup process

>* startup process exits with exit(1)

>* postmaster sees that as a startup crash and exits, per the first
>test in reaper()

>So the log trace I'm getting looks like

>LOG:  received fast shutdown request
>LOG:  startup process (PID 9772) exited with exit code 1
>LOG:  aborting startup due to startup process failure

>Now, transitioning to PM_WAIT_BACKENDS state earlier, as your patch
>proposes, might make the log look a bit nicer because the logic in
>reaper() wouldn't think the exit was a "crash".  But it's not going to
>have anything to do with whether the startup process exits on the signal
>or not.  What seems to have happened for you is that the startup process
>ignored the SIGTERM signal, but it's not at all obvious why.

>We're going to need more details about how to reproduce this.
>I speculate it might have something to do with the specific
>restore_command you're using.

The problem occurs only when an active server is restarted after just adding
a recovery.conf file to the data directory.
There is no need to specify any restore command. Restarting a standby server
can also lead to this problem.

The startup process sends "PMSIGNAL_RECOVERY_STARTED" to the postmaster only
when the "InArchiveRecovery" flag is set.
For the problem to occur, the SIGINT signal must reach the postmaster before
the "PMSIGNAL_RECOVERY_STARTED" signal sent by the startup process.

With the following code change in the StartupXLOG function, the issue can be
reproduced very easily.

        if (InArchiveRecovery && IsUnderPostmaster)
        {
                PublishStartupProcessInformation();
                SetForwardFsyncRequests();
                /* test-only change: request a fast shutdown just before signaling recovery start */
                kill(PostmasterPid, SIGINT);
                SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
                bgwriterLaunched = true;
        }

Please let me know if I have missed anything.

Regards,
Hari babu.
