Thread: PostgreSQL 7.3.2 running as NT service under Windows XP not always clearing PID file on restart

Jason,

    I'm noticing that my suggestion in the writeup I did--regarding
deleting the postmaster.pid file on startup--isn't working properly all
the time.  Took some investigating, but basically what I'm finding is
that sometimes, after a Windows shutdown/start or restart, for reasons
I'm not 100% clear on, the file

    /usr/share/postgresql/data/postmaster.pid

is left behind.  Needless to say, this is a problem, as postmaster will
fail to start up automatically if that file lingers.  And just as
important, does this indicate that PostgreSQL is not being given the
time it needs to clean house prior to reboot?

    My suggestion to add lines to /etc/rc.d/rc.sysinit was based on the
presumption that somehow Cygwin (via cygwin1.dll??) made sure this
script was run after a restart and before any Cygwin-compiled apps
kicked in.  But this does not appear to be the case.

    I'm not sure, but I believe the fact that postmaster is configured via
cygrunsrv as an NT service may have something to do with it.  That is,
unless my hunch is wrong, postmaster in this kind of configuration acts
like any NT service.  But the init scripts aren't really executed until
someone fires up a BASH shell.  This, of course, won't work for having
PostgreSQL automatically fire up on a Windows restart.

[Further testing is showing that rc.init is not fired up at all on
restart, but rather rc.local is.  The only problem is that rc.local
doesn't seem to kick in until AFTER the NT services have been fired up,
making it useless for the purposes here.]

    I've spent the better part of my time trying to find a nice, clean,
simple way to delete a file on Windows startup (but prior to NT services
kicking in), and I'll be darned...it's a lot more difficult than I would
have imagined.  It seems Microsoft didn't really provide a nice clean
boot order mechanism like *nix where you can add commands where needed.
I mean, in Linux, you have the bootstrap process, the kernel loading,
then init, which uses inittab, which then leads to the rc.* files.  If
you need to have commands executed at a certain point in the startup
procedure, just add the commands to the relevant portion of the bootup
process.

    But Windows appears not to have such functionality.  They have their
StartUp folder, but that's useless unless you only need something done
AFTER everything like NT services are fired up AND someone has logged
in.  But there are no obvious built-in mechanisms for running
scripts/commands/apps on startup or shutdown at various stages of the
processes.  At least none I have found so far.

    Any thoughts?  Also, just as importantly, does the fact that a
postmaster.pid file exists indicate any issues with the whole
cygrunsrv/postmaster configuration?  For example, the Windows NT
Resource Kit always had a utility called 'srvany.exe', which one could
use to do what cygrunsrv is doing (making an ordinary application act
like an NT service), but both are basically 'wrappers' as it were.  When
one shuts down Windows, Windows turns around and sends out kill signals
to the running apps/services, and in the case of srvany.exe or
cygrunsrv, they must then turn around and shut down the apps under
their control.

    Is cygrunsrv replying to the Windows kill signal before postmaster has
fully shutdown?  I honestly don't know.  I know that if I manually do a
'net start postmaster' and 'net stop postmaster', PostgreSQL properly
creates and deletes the postmaster.pid file without incident.  What does
it indicate if I do a simple Windows restart when the postmaster.pid
file is still there on reboot?  And how can I be sure PostgreSQL has
properly shutdown (other than checking /var/log/postmaster.log...which
doesn't timestamp all its messages)?


Re: PostgreSQL 7.3.2 running as NT service under Windows XP not always

From
Frank Seesink
Date:
    I still do not have a clear sense of whether there is a risk of
database corruption when Windows is shut down or restarted and sends
kill signals to all the running processes like 'postmaster' (as
mentioned earlier, the existence of the 'postmaster.pid' file makes me a
little anxious).  However, I have at least found a hack that lets me
delete the postmaster.pid file on startup.

    There does not appear to be any clean method in the Windows NT/2000/XP
operating systems to execute startup (or shutdown) scripts at specific
points in the process.  Whereas most *nix systems have a very clear
startup path, where you can introduce commands somewhere in the whole
bootstrap->kernel->init->inittab->rc.*->... sequence as
needed--including deleting PID files etc. prior to inetd services
starting up, for example--Windows does not seem to afford this.

    The bootup process in NT/2000/XP follows the usual bootstrapping
(ntldr), followed by the kernel, and then the loading of device drivers,
then services, and then finally executables registered in places like

    HKLM\Software\Microsoft\Windows\CurrentVersion\Run

and then anything located in a user's StartUp folder (which, of course,
requires logging in first).

    But this is too late for our purposes.  And there appear to be no
obvious mechanisms in place to allow you to execute applications along
the way if needed.

    'postmaster' is effectively an NT service, so we need to delete any
existing postmaster.pid file PRIOR to NT services loading.  However, as
noted above, the sequence does not really introduce the ability to add
commands/executables until AFTER all device drivers and NT services are
loaded.

    I have tried everything from C:\autoexec.bat and C:\config.sys, to the
ol' Windows 3.1 trick of a load= or run= command in
%SystemRoot%\system.ini and %SystemRoot%\win.ini, to the lesser-known
%SystemRoot%\Winstart.bat file, but all are ignored by Windows 2000/XP.

    End result:  the best I could come up with was creating another NT
service whose sole purpose in life was to run in the 'postgres' user
context and delete the postmaster.pid and \tmp\.s.PGSQL* lock files, and
then making the 'postmaster' service depend on that service.  This way,
'postmaster' won't start until AFTER the postmaster.pid file has been
deleted (if it exists).

    Please note a few caveats though.  Do not try creating a shell script
or .BATch file which you then set up as an NT service via cygrunsrv.  At
least when I tried, doing so led to the service firing up, executing the
script, and--and this next part is really important--SHUTTING DOWN
AGAIN.  The problem here is that if you have 'postmaster' depend on this
service, it will never fire up, as the startup sequence will go

    * system startup sequence
    * the NT service that deletes postmaster.pid starts up,
      executes, and shuts down
    * 'postmaster' attempts to start up but, finding the above
      service stopped, fails since it depends on that service.

My personal solution involved using an already owned copy of FireDaemon
(www.firedaemon.com)--so there was no out-of-pocket expense for me.
FireDaemon is basically a Windows utility similar to cygrunsrv, only
FireDaemon has much greater flexibility.  Specifically, I can have
FireDaemon launch a .BATch file, but not monitor the execution, so when
the .BATch file terminates, the FireDaemon service continues to 'run'.
This effectively tricks 'postmaster' into believing the service is still
running, so 'postmaster' fires up as it should once the
FireDaemon-created service has started (i.e., the postmaster.pid file
has been deleted).

In other words,

  1.    I created a simple .BATch file to delete the necessary files
    (note this is a .BATch file, not a Cygwin BASH shell script,
    so paths are set accordingly):

    __________________________________________________
    @echo off
    del c:\cygwin\tmp\.s.PGSQL*
    del c:\cygwin\usr\share\postgresql\data\postmaster.pid
    __________________________________________________

  2.    I then created an NT service via FireDaemon named
    'cygwin-start' which launched this .BATch file as a console
    app, running it hidden, etc.

  3.    Using the Cygwin BASH shell, I shut down and removed the
    'postmaster' service, then rebuilt the service with a
    modified command to make 'postmaster' depend on my new
    FireDaemon-created service, as follows:

    __________________________________________________
    $ net stop postmaster
    $ cygrunsrv --remove postmaster
    $ cygrunsrv --install postmaster --path /usr/bin/postmaster \
        --args "-D /usr/share/postgresql/data -i" \
        --dep ipc-daemon --dep cygwin-start \
        --termsig INT --user postgres --shutdown
    __________________________________________________


Voila!  It's not pretty, but it works.  And unlike shutdown
scripts--also not inherent in Windows NT/2000/XP--this handles the case
of a sudden power failure, where 'postmaster' doesn't even get a chance
to shut down properly.  (Whether that is acceptable from an
administrative point of view, I'll leave to the reader to decide based
on their needs.)

    Please note this still doesn't answer the question of why the
postmaster.pid file is sometimes left behind on Windows
shutdown/restart.  But for those needing their Cygwin PostgreSQL
database up and running on startup without issue, this is one possible
approach.

P.S.    If I can figure out a way to use cygrunsrv to create a service
    that runs a script and then remains active (so 'postmaster'
    will load), I'll post here.  Thus far, however, basic attempts
    like creating a service that points to a shell script fire up,
    execute the script, and then immediately shut down.  Possibly
    having a script that launches another shell that runs the script
    would work, as the shell would still be 'running'.  But I haven't
    tried that yet, so I don't know.
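
One way the idea in the P.S. might be approached is a script that does the
cleanup and then simply stays alive, so that the wrapping service never
reports itself as stopped.  The following is an untested sketch only: the
function names and the --hold argument are made up for illustration, and
whether cygrunsrv and the Windows service manager treat such a script as a
"running" service early enough for the dependency to help is an assumption
that would need checking on a real system.

```shell
#!/bin/sh
# Untested sketch: delete stale PostgreSQL lock files, then keep running so
# the service wrapper continues to report this service as started.
# clean_stale_files, hold_open and --hold are hypothetical names.

clean_stale_files() {
    data_dir=$1
    tmp_dir=$2
    # rm -f stays silent when a file (or the glob) matches nothing.
    rm -f "$data_dir/postmaster.pid" "$tmp_dir"/.s.PGSQL*
}

hold_open() {
    trap 'exit 0' TERM INT      # leave quietly when the service is stopped
    while :; do
        sleep 60 &
        wait $!                 # wait is interruptible, so the trap runs promptly
    done
}

# Only clean up and block when explicitly asked to, e.g. by the service.
if [ "${1:-}" = "--hold" ]; then
    clean_stale_files /usr/share/postgresql/data /tmp
    hold_open
fi
```

The paths are the Cygwin equivalents of the ones used in the .BATch file
above.  The script would be installed with --hold as its only argument.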


Re: PostgreSQL 7.3.2 running as NT service under Windows XP

From
Jason Tishler
Date:
Frank,

On Thu, May 22, 2003 at 04:44:30PM -0400, Frank Seesink wrote:
> I'm noticing that my suggestion in the writeup I did--regarding
> deleting the postmaster.pid file on startup--isn't working properly
> all the time.  Took some investigating, but basically what I'm finding
> is that sometimes, after a Windows shutdown/start or restart, for
> reasons I'm not 100% clear on, the file
>
>     /usr/share/postgresql/data/postmaster.pid
>
> is left behind.

Just a WAG, but does this happen when there are active connections to
the database?

Anyway, did you supply the following options to cygrunsrv when you
installed postmaster as a service?

    1. --termsig INT
    2. --shutdown

Note that cygrunsrv defaults to TERM, which will cause postmaster to wait
for all connections to end before terminating.

FWIW, the above cygrunsrv options have guaranteed me clean shutdowns
under 2000.
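
For reference, the signal matters because postmaster picks its shutdown
mode from it: SIGTERM requests a "smart" shutdown (wait for clients to
disconnect), SIGINT a "fast" shutdown (disconnect clients, then shut down
cleanly), and SIGQUIT an "immediate" shutdown (abort, with recovery on the
next start).  A small sketch of that mapping follows; shutdown_mode is a
hypothetical helper name, not part of PostgreSQL or cygrunsrv.

```shell
#!/bin/sh
# Hypothetical helper: print the postmaster shutdown mode that a given
# signal name requests.
shutdown_mode() {
    case "$1" in
        TERM) echo smart ;;       # waits for all clients to disconnect
        INT)  echo fast ;;        # disconnects clients, then shuts down cleanly
        QUIT) echo immediate ;;   # aborts; recovery runs at next startup
        *)    echo unknown; return 1 ;;
    esac
}
```

So "--termsig INT" asks for the fast mode, which does not depend on clients
disconnecting by themselves before Windows finishes shutting down.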

> And just as important, does this indicate that PostgreSQL is not being
> given the time it needs to clean house prior to reboot?

Most likely yes.

> I've spent the better part of my time trying to find a nice, clean,
> simple way to delete a file on Windows startup (but prior to NT
> services kicking in), and I'll be darned...it's a lot more difficult
> than I would have imagined.

Why not wrap postmaster in a shell script, /usr/local/bin/postmaster.sh?

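    #!/bin/sh
    # Remove a PID file left behind by an unclean shutdown.
    rm -f /usr/share/postgresql/data/postmaster.pid
    # exec, so that signals sent by cygrunsrv reach postmaster itself.
    exec /usr/bin/postmaster "$@"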

And then install postmaster.sh as the "service".
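
A somewhat more cautious variant of the wrapper's cleanup step would
remove postmaster.pid only when the PID recorded in it no longer belongs to
a live process.  The sketch below assumes the first line of the file holds
the PID (negative for a standalone backend); remove_stale_pidfile is a
hypothetical helper name.  Note that after a reboot an unrelated process
may happen to reuse the PID, and "kill -0" only sees processes the caller
is allowed to signal, so this is a heuristic rather than a guarantee.

```shell
#!/bin/sh
# Sketch: remove a postmaster.pid file only if the PID it records is not
# running.  remove_stale_pidfile is a hypothetical helper name.
remove_stale_pidfile() {
    pidfile=$1
    [ -f "$pidfile" ] || return 0            # nothing to clean up
    pid=$(head -n 1 "$pidfile")              # first line: the recorded PID
    pid=${pid#-}                             # drop the sign of a negative PID
    case "$pid" in
        ''|*[!0-9]*)                         # unreadable PID: treat as stale
            rm -f "$pidfile"
            return 0
            ;;
    esac
    if kill -0 "$pid" 2>/dev/null; then
        echo "PID $pid appears to be running; leaving $pidfile" >&2
        return 1
    fi
    rm -f "$pidfile"
}
```

In the wrapper, the unconditional "rm -f" line could then be replaced by a
call such as "remove_stale_pidfile /usr/share/postgresql/data/postmaster.pid".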

> [snip]
>
> Is cygrunsrv replying to the Windows kill signal before postmaster
> has fully shutdown?  I honestly don't know.

I don't know either.

> I know that if I manually do a 'net start postmaster' and 'net stop
> postmaster', PostgreSQL properly creates and deletes the
> postmaster.pid file without incident.

The above is another indication that maybe your cygrunsrv shutdown
parameters are not correct.

> What does it indicate if I do a simple Windows restart when the
> postmaster.pid file is still there on reboot?

Sorry, I can't parse the above -- even after multiple readings.

> And how can I be sure PostgreSQL has properly shutdown (other than
> checking /var/log/postmaster.log...which doesn't timestamp all its
> messages)?

I can only recommend checking the log file.  Even without the
timestamps, you should be able to figure out if PostgreSQL shut down and
started up cleanly.  You can always start with a fresh log file to
facilitate the analysis.
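
As a rough illustration of such a log check, the sketch below looks at the
most recent shutdown-related message in a log file.  The message texts it
matches ("database system is shut down", "database system was interrupted",
"database system was not properly shut down") are assumptions about what
this PostgreSQL version writes and should be compared against a real log.
last_shutdown_status is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: classify the most recent shutdown recorded in a postmaster log.
# The matched message texts are assumed, not verified for every version.
last_shutdown_status() {
    log=$1
    # Keep only the last line that mentions a shutdown or an interruption.
    last=$(awk '/database system (is shut down|was interrupted|was not properly shut down)/ { line = $0 }
                END { print line }' "$log")
    case "$last" in
        *"is shut down"*) echo clean ;;
        "")               echo unknown ;;   # no relevant message found
        *)                echo unclean ;;
    esac
}
```

Running "last_shutdown_status /var/log/postmaster.log" after a reboot would
then print clean, unclean, or unknown.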

Jason

--
PGP/GPG Key: http://www.tishler.net/jason/pubkey.asc or key servers
Fingerprint: 7A73 1405 7F2B E669 C19D  8784 1AFD E4CC ECF4 8EF6