Обсуждение: Found this in the server log on MAC OSX

Поиск
Список
Период
Сортировка

Found this in the server log on MAC OSX

От
Sean Shanny
Дата:
To all,

I restarted postgres this morning to make a configuration file change
take effect and when I looked at the serverlog to make sure things were
ok I found the prior shutdown message(*in bold*).  This shut down was
not initiated by me or anyone else who has access to the machine.  Any
ideas as to what might be causing this?

OSX Server 10.3.2 running on a dual 2.ghz G5 with 8GB memory attached to
a XRaid 3.5TB system via Fibre channel.  We are running 7.4.1.  The
LOG:  unexpected EOF on client connection  messages are from our JDBC
implementation, they are benign.

Thanks.

--sean

LOG:  unexpected EOF on client connection
*LOG:  received smart shutdown request *
*FATAL:  the database system is shutting down
FATAL:  the database system is shutting down
LOG:  server process (PID 4691) was terminated by signal 9
LOG:  terminating any other active server processes
LOG:  statistics collector process (PID 361) was terminated by signal 9
LOG:  all server processes terminated; reinitializing
LOG:  database system was interrupted at 2004-02-23 07:50:15 EST
LOG:  checkpoint record is at D0/B2FD5AB4
LOG:  redo record is at D0/B2FD5AB4; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 31473487; next OID: 3103140043
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  record with zero length at D0/B2FD5AF4
LOG:  redo is not required
LOG:  database system is ready*
LOG:  shutting down
LOG:  database system is shut down
LOG:  database system was shut down at 2004-02-23 08:12:28 EST
LOG:  checkpoint record is at D0/B2FD5B34
LOG:  redo record is at D0/B2FD5B34; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 31473487; next OID: 3103140043
LOG:  database system is ready


Re: Found this in the server log on MAC OSX

От
"Ed L."
Дата:
On Monday February 23 2004 8:43, Sean Shanny wrote:
>
> I restarted postgres this morning to make a configuration file change
> take effect and when I looked at the serverlog to make sure things were
> ok I found the prior shutdown message(*in bold*).  This shut down was
> not initiated by me or anyone else who has access to the machine.  Any
> ideas as to what might be causing this?
>
> LOG:  unexpected EOF on client connection
> *LOG:  received smart shutdown request *
> *FATAL:  the database system is shutting down
> FATAL:  the database system is shutting down
> LOG:  server process (PID 4691) was terminated by signal 9
> LOG:  terminating any other active server processes
> LOG:  statistics collector process (PID 361) was terminated by signal 9

Looks like it got a SIGTERM, which is what you might get if someone shutdown
OSX (osx pls gurus correct me).  Maybe your machine rebooted?   Logging
pids and timestamps might help tell the tale next time ...



Re: Found this in the server log on MAC OSX

От
Sean Shanny
Дата:
Ed,

If I continue to see the behavior I will turn the logging stuff on as
you suggested.  The machine uptime is show 3+ days so it did not reboot.

Thanks.

--sean

Ed L. wrote:

>On Monday February 23 2004 8:43, Sean Shanny wrote:
>
>
>>I restarted postgres this morning to make a configuration file change
>>take effect and when I looked at the serverlog to make sure things were
>>ok I found the prior shutdown message(*in bold*).  This shut down was
>>not initiated by me or anyone else who has access to the machine.  Any
>>ideas as to what might be causing this?
>>
>>LOG:  unexpected EOF on client connection
>>*LOG:  received smart shutdown request *
>>*FATAL:  the database system is shutting down
>>FATAL:  the database system is shutting down
>>LOG:  server process (PID 4691) was terminated by signal 9
>>LOG:  terminating any other active server processes
>>LOG:  statistics collector process (PID 361) was terminated by signal 9
>>
>>
>
>Looks like it got a SIGTERM, which is what you might get if someone shutdown
>OSX (osx pls gurus correct me).  Maybe your machine rebooted?   Logging
>pids and timestamps might help tell the tale next time ...
>
>
>
>
>

Re: Found this in the server log on MAC OSX

От
Tom Lane
Дата:
"Ed L." <pgsql@bluepolka.net> writes:
> On Monday February 23 2004 8:43, Sean Shanny wrote:
>> *LOG:  received smart shutdown request *
>> *FATAL:  the database system is shutting down
>> FATAL:  the database system is shutting down
>> LOG:  server process (PID 4691) was terminated by signal 9
>> LOG:  terminating any other active server processes
>> LOG:  statistics collector process (PID 361) was terminated by signal 9

> Looks like it got a SIGTERM, which is what you might get if someone shutdown
> OSX (osx pls gurus correct me).

Uh, no, signal 9 is SIGKILL not SIGTERM.  I'm not aware of any automatic
mechanism in OS X that would issue SIGKILL against a Postgres backend.
Certainly Postgres itself would not.  Some Linux kernels issue SIGKILL
to get out of out-of-memory situations, but I believe OS X to be better
behaved than that.

My private opinion is that Sean is mistaken and that the above trace
shows someone manually (and not very competently) shutting down the
database.  First they tried a SIGTERM against the postmaster (or
equivalently "pg_ctl stop") and after getting tired of waiting for
the clients to shut down, they did SIGKILL against the backends.
It's too bad we have no timestamps in this log, as the intervals
between the above-recorded entries would be very revealing.

            regards, tom lane

Re: Found this in the server log on MAC OSX

От
"Ed L."
Дата:
On Monday February 23 2004 10:50, Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > On Monday February 23 2004 8:43, Sean Shanny wrote:
> >> *LOG:  received smart shutdown request *
> >> *FATAL:  the database system is shutting down
> >> FATAL:  the database system is shutting down
> >> LOG:  server process (PID 4691) was terminated by signal 9
> >> LOG:  terminating any other active server processes
> >> LOG:  statistics collector process (PID 361) was terminated by signal
> >> 9
> >
> > Looks like it got a SIGTERM, which is what you might get if someone
> > shutdown OSX (osx pls gurus correct me).
>
> Uh, no, signal 9 is SIGKILL not SIGTERM.  I'm not aware of any automatic
> mechanism in OS X that would issue SIGKILL against a Postgres backend.
> Certainly Postgres itself would not.  Some Linux kernels issue SIGKILL
> to get out of out-of-memory situations, but I believe OS X to be better
> behaved than that.

Uh, no, I didn't say signal 9 is SIGTERM.  Isn't a "smart" shutdown request
an indication of a SIGTERM?  I'm just speculating about what happened, but
isn't that what you'd see during a system shutdown?  The kernel sending
SIGTERMs?



Re: Found this in the server log on MAC OSX

От
Sean Shanny
Дата:
To all,

It looks like it is the fault of the shutdown script <said Sean with
sheepish look on my face>

We have to take a look a the scripts, found them on the web, to figure
out what they are doing to cause this.

Sorry for the clutter in your mailboxes.

--sean


waiting for postmaster to shut
down................................................................ failed
pg_ctl: postmaster does not shut down
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: ??: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 3:35.75: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres::
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill:
192.168.34.19: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: idle: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: ??: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:03.56: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres::
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill:
192.168.35.16: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: idle: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: in: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: transaction:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: std-: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:00.71: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres::
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: stats: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: buffer: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: process: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: std-: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:03.78: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres::
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: stats: no
such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: collector:
no such pid
/Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: process: no
such pid



Tom Lane wrote:

>"Ed L." <pgsql@bluepolka.net> writes:
>
>
>>On Monday February 23 2004 8:43, Sean Shanny wrote:
>>
>>
>>>*LOG:  received smart shutdown request *
>>>*FATAL:  the database system is shutting down
>>>FATAL:  the database system is shutting down
>>>LOG:  server process (PID 4691) was terminated by signal 9
>>>LOG:  terminating any other active server processes
>>>LOG:  statistics collector process (PID 361) was terminated by signal 9
>>>
>>>
>
>
>
>>Looks like it got a SIGTERM, which is what you might get if someone shutdown
>>OSX (osx pls gurus correct me).
>>
>>
>
>Uh, no, signal 9 is SIGKILL not SIGTERM.  I'm not aware of any automatic
>mechanism in OS X that would issue SIGKILL against a Postgres backend.
>Certainly Postgres itself would not.  Some Linux kernels issue SIGKILL
>to get out of out-of-memory situations, but I believe OS X to be better
>behaved than that.
>
>My private opinion is that Sean is mistaken and that the above trace
>shows someone manually (and not very competently) shutting down the
>database.  First they tried a SIGTERM against the postmaster (or
>equivalently "pg_ctl stop") and after getting tired of waiting for
>the clients to shut down, they did SIGKILL against the backends.
>It's too bad we have no timestamps in this log, as the intervals
>between the above-recorded entries would be very revealing.
>
>            regards, tom lane
>
>---------------------------(end of broadcast)---------------------------
>TIP 3: if posting/reading through Usenet, please send an appropriate
>      subscribe-nomail command to majordomo@postgresql.org so that your
>      message can get through to the mailing list cleanly
>
>
>

Re: Found this in the server log on MAC OSX

От
Tom Lane
Дата:
"Ed L." <pgsql@bluepolka.net> writes:
> Uh, no, I didn't say signal 9 is SIGTERM.  Isn't a "smart" shutdown request
> an indication of a SIGTERM?  I'm just speculating about what happened, but
> isn't that what you'd see during a system shutdown?  The kernel sending
> SIGTERMs?

Yes, the trace is sort of consistent with the idea of a system shutdown:
you'd see SIGTERMs issued, followed some time later by SIGKILL.
I thought Sean had said that the machine did not shut down during this
interval, and so mentally eliminated that theory --- but based on his
latest comment I guess that is what happened after all.

So that does leave me with a question: why didn't it work more cleanly?
Our signal responses are designed around the assumption that during
shutdown the kernel will send SIGTERM to *all* the Postgres processes.
Backends interpret that as an immediate shutdown and should exit quickly
enough to avoid getting SIGKILL'd later.  It looks like either the
postmaster was sent SIGTERM but the backends weren't, or the interval
between SIGTERM and SIGKILL was unreasonably short.  I don't think I
believe the latter; the last time I checked this on Darwin, it seemed to
be using the traditional 20-second grace period.

Another question: if that was a shutdown we were looking at, how did the
postmaster live long enough to record the final log lines?  It shoulda
gotten SIGKILL'd at the same time as its children.

In short, there's something pretty odd about the way these signals are
being passed around.  It looks something like a standard system shutdown
sequence, but not enough like it.

            regards, tom lane

Re: Found this in the server log on MAC OSX

От
Sean Shanny
Дата:
Tom,

Let me clarify.... I was meant shutdown in the context of issuing a stop
against postgres not shutting down the OS.  Sorry if I am confusing things.

The scripts we are using to issue start, stop etc for postgres seem to
be causing the issue.  I changed the config to use timestamps in the log
and the act of stopping and starting the server caused the same error to
occur. :-(

 From the scripts we are using:

StartService ()
{
        if [ "${POSTGRES:=-YES-}" = "-YES-" ]; then

            ConsoleMessage "Starting PostgreSQL database server"
            su - postgres -c '/usr/local/pgsql/bin/pg_ctl start -D
/usr/local/pgsql/data -l /usr/local/pgsql/data/logfile -o -i'

        fi
}

StopService()
{
        ConsoleMessage "Stopping PostgreSQL database services"
        /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data
        x=`/bin/ps axc | /usr/bin/grep postgres`
        if /bin/test "$x"
        then
                set $x
                kill -9 $x
        fi
}


Thanks.

--sean


Tom Lane wrote:

>"Ed L." <pgsql@bluepolka.net> writes:
>
>
>>Uh, no, I didn't say signal 9 is SIGTERM.  Isn't a "smart" shutdown request
>>an indication of a SIGTERM?  I'm just speculating about what happened, but
>>isn't that what you'd see during a system shutdown?  The kernel sending
>>SIGTERMs?
>>
>>
>
>Yes, the trace is sort of consistent with the idea of a system shutdown:
>you'd see SIGTERMs issued, followed some time later by SIGKILL.
>I thought Sean had said that the machine did not shut down during this
>interval, and so mentally eliminated that theory --- but based on his
>latest comment I guess that is what happened after all.
>
>So that does leave me with a question: why didn't it work more cleanly?
>Our signal responses are designed around the assumption that during
>shutdown the kernel will send SIGTERM to *all* the Postgres processes.
>Backends interpret that as an immediate shutdown and should exit quickly
>enough to avoid getting SIGKILL'd later.  It looks like either the
>postmaster was sent SIGTERM but the backends weren't, or the interval
>between SIGTERM and SIGKILL was unreasonably short.  I don't think I
>believe the latter; the last time I checked this on Darwin, it seemed to
>be using the traditional 20-second grace period.
>
>Another question: if that was a shutdown we were looking at, how did the
>postmaster live long enough to record the final log lines?  It shoulda
>gotten SIGKILL'd at the same time as its children.
>
>In short, there's something pretty odd about the way these signals are
>being passed around.  It looks something like a standard system shutdown
>sequence, but not enough like it.
>
>            regards, tom lane
>
>
>

Re: Found this in the server log on MAC OSX

От
Tom Lane
Дата:
Sean Shanny <shannyconsulting@earthlink.net> writes:
> StopService()
> {
>         ConsoleMessage "Stopping PostgreSQL database services"
>         /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data
>         x=`/bin/ps axc | /usr/bin/grep postgres`
>         if /bin/test "$x"
>         then
>                 set $x
>                 kill -9 $x
>         fi
> }

Gack.  Whoever wrote this should be shot.  Take out everything after the
"pg_ctl stop" line.  You might want to add "-m immediate" to the pg_ctl
command as a less dangerous way of encouraging a fast shutdown...

            regards, tom lane

Re: Found this in the server log on MAC OSX

От
David Garamond
Дата:
Tom Lane wrote:
>>StopService()
>>{
>>        ConsoleMessage "Stopping PostgreSQL database services"
>>        /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data
>>        x=`/bin/ps axc | /usr/bin/grep postgres`
>>        if /bin/test "$x"
>>        then
>>                set $x
>>                kill -9 $x
>>        fi
>>}
>
> Gack.  Whoever wrote this should be shot.  Take out everything after the
> "pg_ctl stop" line.  You might want to add "-m immediate" to the pg_ctl
> command as a less dangerous way of encouraging a fast shutdown...

BTW, I remember vaguely that the Cpanel 3 scripts (Cpanel is a
webhosting control panel product) used to do something like this for
every daemon... the horror...

--
dave