Discussion: pid file problem

pid file problem

From
"Peter Kovacs"
Date:
Hi,

On system startup, PostgreSQL 8.1.4 refuses to start because the pid
file is left over from the previous "session" on Solaris 10 x86. After
removing the old pid file, PG starts up and creates a new pid file:

[root@sungift] /# ls -l /var/opt/postgresql/data/postmaster.pid
-rw-------   1 postgres dba           50 Feb 12 19:47 /var/opt/postgresql/data/postmaster.pid

In the logfile I find:

LOCATION:  SocketBackend, postgres.c:295
LOG:  00000: received smart shutdown request
LOCATION:  pmdie, postmaster.c:1885
LOG:  00000: shutting down
LOCATION:  ShutdownXLOG, xlog.c:5031
FATAL:  58P01: could not remove old lock file "postmaster.pid": No such file or directory
HINT:  The file seems accidentally left over, but it could not be removed. Please remove the file by hand and try again.

The start and stop entries for SVC look like the following:

        <exec_method
                type='method'
                name='start'
                exec='/opt/postgresql/8.1.4/bin/pg_ctl start -l /var/opt/postgresql/log/logfile'
                timeout_seconds='60'>
                <method_context>
                    <method_credential user='postgres' group='dba' />
                    <method_environment>
                        <envvar name="TMPDIR" value="/tmp"/>
                        <envvar name="PATH" value="/opt/postgresql/8.1.4/bin:/usr/bin:/usr/ucb:/etc:.:/usr/sfw/bin:/usr/local/bin:/usr/ccs/bin"/>
                        <envvar name="LD_LIBRARY_PATH" value="/opt/postgresql/8.1.4/lib"/>
                        <envvar name="PGDATA" value="/var/opt/postgresql/data"/>
                    </method_environment>
                </method_context>
        </exec_method>

        <exec_method
                type='method'
                name='stop'
                exec='/opt/postgresql/8.1.4/bin/pg_ctl stop'
                timeout_seconds='60'>
                <method_context>
                    <method_credential user='postgres' group='dba' />
                    <method_environment>
                        <envvar name="TMPDIR" value="/tmp"/>
                        <envvar name="PATH" value="/opt/postgresql/8.1.4/bin:/usr/bin:/usr/ucb:/etc:.:/usr/sfw/bin:/usr/local/bin:/usr/ccs/bin"/>
                        <envvar name="LD_LIBRARY_PATH" value="/opt/postgresql/8.1.4/lib"/>
                        <envvar name="PGDATA" value="/var/opt/postgresql/data"/>
                    </method_environment>
                </method_context>
        </exec_method>

Please, could you tell me why the pid file is not deleted on shutdown?

Thanks
Peter

Re: pid file problem

From
Tom Lane
Date:
"Peter Kovacs" <maxottovonstirlitz@gmail.com> writes:
> Please, could you tell me why the pid file is not deleted on shutdown?

It looks to me like the postmaster isn't being given enough time to
finish shutdown before being forcibly killed.  This is not great,
but in theory we should always be able to recover from that.  You might
want to look to see if you can't extend the SIGTERM-to-SIGKILL delay
though.
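
To make the delay in question concrete, here is a minimal sketch of a
supervisor that sends SIGTERM, waits out a grace period, and only then
falls back to SIGKILL. The function name stop_child_with_grace and the
one-second polling interval are assumptions made for illustration; this
is not how Solaris SMF or any particular init system is implemented.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical supervisor helper (illustration only): ask a child process
 * to stop with SIGTERM, wait up to grace_seconds for it to exit, and only
 * then resort to SIGKILL.
 *
 * Returns true if the child exited within the grace period, false if
 * SIGTERM could not be sent or the child had to be killed.
 */
static bool
stop_child_with_grace(pid_t child, unsigned grace_seconds)
{
    int status;

    /* Polite request first: the process may clean up and remove its files. */
    if (kill(child, SIGTERM) != 0)
        return false;

    /* Poll once per second until the grace period is used up. */
    for (unsigned waited = 0; waited <= grace_seconds; waited++)
    {
        if (waitpid(child, &status, WNOHANG) == child)
            return true;        /* exited on its own */
        if (waited < grace_seconds)
            sleep(1);
    }

    /* Grace period expired: SIGKILL cannot be caught, so no cleanup runs. */
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return false;
}
```

A process killed in the second branch gets no chance to run its exit
handlers, which is how a lock file can be left behind. A longer grace
period makes that branch less likely to be reached.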

> On system startup, PostgreSQL 8.1.4 refuses to start because the pid
> file is left over from the previous "session" on Solaris 10 x86.

That should not happen; it should always be possible to detect whether
the file is stale, *if* your start script is written correctly.  Per
the comment in miscinit.c:

         * If the PID in the lockfile is our own PID or our parent's PID, then
         * the file must be stale (probably left over from a previous system
         * boot cycle).  We need this test because of the likelihood that a
         * reboot will assign exactly the same PID as we had in the previous
         * reboot.  Also, if there is just one more process launch in this
         * reboot than in the previous one, the lockfile might mention our
         * parent's PID.  We can reject that since we'd never be launched
         * directly by a competing postmaster.  We can't detect grandparent
         * processes unfortunately, but if the init script is written
         * carefully then all but the immediate parent shell will be
         * root-owned processes and so the kill test will fail with EPERM.
         *
         * We can treat the EPERM-error case as okay because that error
         * implies that the existing process has a different userid than we
         * do, which means it cannot be a competing postmaster.
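
The rules in that comment can be sketched as a small standalone function.
This is a simplified illustration, not the code in miscinit.c: the name
lockfile_pid_is_stale and its signature are hypothetical, and the real
code performs further checks that are omitted here.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): decide whether the PID recorded
 * in a lock file can be treated as stale, following the rules described in
 * the quoted comment.  my_pid and parent_pid would normally be getpid()
 * and getppid(); they are parameters here so the logic is easy to exercise.
 */
static bool
lockfile_pid_is_stale(pid_t other_pid, pid_t my_pid, pid_t parent_pid)
{
    /* A non-positive value is bogus data; do not claim it is stale. */
    if (other_pid <= 0)
        return false;

    /* Our own PID or our parent's PID: left over from a previous boot. */
    if (other_pid == my_pid || other_pid == parent_pid)
        return true;

    /* Signal 0 does the existence and permission checks without signaling. */
    if (kill(other_pid, 0) == 0)
        return false;           /* a live process running under our userid */

    /*
     * ESRCH: no such process.  EPERM: the process exists but belongs to a
     * different userid, so it cannot be a competing postmaster.
     */
    return errno == ESRCH || errno == EPERM;
}
```

Calling lockfile_pid_is_stale(pid_from_file, getpid(), getppid()) shows
the grandparent hazard: a live grandparent running as the same user is
neither our PID nor our parent's, and kill() succeeds on it, so the file
is judged to be in use.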

If the start script is written in a way that creates multiple levels of
postgres-owned processes, you should fix it.  On Linux something
like this works:

    su -l postgres -c "/usr/bin/postmaster ... &" >> "$PGLOG" 2>&1 < /dev/null

Don't use "su 'pg_ctl ...'" to start the postmaster, because it creates
exactly the hazard situation of an extra postgres-owned process.

Also, if you are trying to start multiple postmasters at boot, this
technique is insufficient unless you run each one under a distinct userid
(which is probably a good idea anyway on security grounds).

            regards, tom lane