Re: BUG #7643: Issuing a shutdown request while server startup leads to server hang
From | Hari Babu |
---|---|
Subject | Re: BUG #7643: Issuing a shutdown request while server startup leads to server hang |
Date | |
Msg-id | 005201cdc6d7$fb988f00$f2c9ad00$@kommi@huawei.com |
In response to | Re: BUG #7643: Issuing a shutdown request while server startup leads to server hang (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: BUG #7643: Issuing a shutdown request while server startup leads to server hang |
List | pgsql-bugs |
>haribabu.kommi@huawei.com writes:
> Problem Reproduction:
> 1. Add recovery.conf to the database directory.
> 2. Start the server
> 3. Issue the shutdown request
> and the shutdown request timing should be such that below server logs should
> print.
> Log:
> ./postgres -D data -p 3335
> LOG: database system was shut down in recovery at 2012-11-08 19:42:42 IST
> LOG: entering standby mode
> LOG: received fast shutdown request
> LOG: consistent recovery state reached at 0/17D0700
> LOG: record with zero length at 0/17D0700
> Problem reproduced in 9.3 head.

>After further investigation, I can't reproduce this and I don't believe
>your patch fixes it. What happens when I try this is

>* postmaster gets SIGINT, sends SIGTERM to startup process
>* startup process exits with exit(1)
>* postmaster sees that as a startup crash and exits, per the first
>test in reaper()

>So the log trace I'm getting looks like

>LOG: received fast shutdown request
>LOG: startup process (PID 9772) exited with exit code 1
>LOG: aborting startup due to startup process failure

>Now, transitioning to PM_WAIT_BACKENDS state earlier, as your patch
>proposes, might make the log look a bit nicer because the logic in
>reaper() wouldn't think the exit was a "crash". But it's not going to
>have anything to do with whether the startup process exits on the signal
>or not. What seems to have happened for you is that the startup process
>ignored the SIGTERM signal, but it's not at all obvious why.

>We're going to need more details about how to reproduce this.
>I speculate it might have something to do with the specific
>restore_command you're using.

The problem occurs only when an active server is restarted after just adding a recovery.conf file to the data directory. There is no need to specify any restore command. Restarting a standby server can also lead to this problem.

The startup process sends "PMSIGNAL_RECOVERY_STARTED" to the postmaster only when the "InArchiveRecovery" flag is set. For the problem to occur, the SIGINT signal must reach the postmaster before the "PMSIGNAL_RECOVERY_STARTED" signal sent by the startup process.

With the following code change in the StartupXLOG function, the issue can be reproduced very easily.

    if (InArchiveRecovery && IsUnderPostmaster)
    {
        PublishStartupProcessInformation();
        SetForwardFsyncRequests();
        kill(PostmasterPid, SIGINT);    /* added line: SIGINT arrives before PMSIGNAL_RECOVERY_STARTED */
        SendPostmasterSignal(PMSIGNAL_RECOVERY_STARTED);
        bgwriterLaunched = true;
    }

Please let me know if I missed anything.

Regards,
Hari babu.