Thread: Mysterious Death of postmaster (-9)

Mysterious Death of postmaster (-9)

From
"Gregory S. Williamson"
Date:
Dear peoples,

We had an oddness today with one of our postgres servers (Dell 2 CPU box running linux) and postgres 7.4. The server was
under heavy load (50+ for a 1-minute spike; about 20 for the 15-minute average) with about 250 connections (we still
don't understand the heavy load itself).

Looking in the logs I see:
2004-11-13 13:30:28 LOG:  unexpected EOF on client connection
2004-11-13 13:30:40 LOG:  unexpected EOF on client connection
2004-11-13 13:38:28 LOG:  could not send data to client: Broken pipe
2004-11-13 13:42:15 LOG:  server process (PID 30272) was terminated by signal 9
2004-11-13 13:42:16 LOG:  terminating any other active server processes
2004-11-13 13:42:16 WARNING:  terminating connection because of crash of another server process

The EOFs are almost certainly Proxool closing connections from the client to the database.

The sysadmin who was on call today swears he didn't send a kill signal (or any signal at all) -- suddenly the load dropped
off and the server was down. It has restarted normally and shows no signs of being worse for the wear (this is really a
read-only db so data corruption chances are minimal, I think).

Just to rule out any internal chances, is there any way this shutdown could have been triggered from within postgres
itself? Can anyone construct any scenarios in which Linux, postgres or proxool could have done this without human
intervention? 

I have looked through manuals and some FAQs and newsgroup discussions and my gut feeling is that this can't be from
postgres, but I thought I'd ask on the chance that I am, as is often the case, Unclear On The Concept.

Thanks for any illumination,

Greg Williamson
DBA
GlobeXplorer LLC

PS: if this is not the right list, please let me know what might be an appropriate one. Gracias!

Re: Mysterious Death of postmaster (-9)

From
Stephan Szabo
Date:
On Sat, 13 Nov 2004, Gregory S. Williamson wrote:

> Looking in the logs I see:
> 2004-11-13 13:30:28 LOG:  unexpected EOF on client connection
> 2004-11-13 13:30:40 LOG:  unexpected EOF on client connection
> 2004-11-13 13:38:28 LOG:  could not send data to client: Broken pipe
> 2004-11-13 13:42:15 LOG:  server process (PID 30272) was terminated by signal 9
> 2004-11-13 13:42:16 LOG:  terminating any other active server processes
> 2004-11-13 13:42:16 WARNING:  terminating connection because of crash of another server process

> Just to rule out any internal chances, is there any way this shutdown
> could have been triggered from within postgres itself ? Can anyone
> construct any scenarios in which Linux, postgres or proxool could have
> done this without human intervention ?

Is it possible that you ran into the out of memory killer?  That's the
most likely thing beyond admin intervention I can think of.
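
As an illustration only (not something posted in the thread), a minimal Python sketch for pulling signal-related backend deaths out of a server log could look like the following. It assumes the log line format shown in the excerpt above; the function name find_signal_deaths and the sample list are hypothetical.

```python
import re

# Matches the postmaster's report of a backend killed by a signal,
# in the format shown in the log excerpt quoted in this thread.
SIGNAL_LINE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<sig>\d+)"
)


def find_signal_deaths(log_lines):
    """Return a (pid, signal_number) tuple for each backend reported killed by a signal."""
    deaths = []
    for line in log_lines:
        match = SIGNAL_LINE.search(line)
        if match:
            deaths.append((int(match.group("pid")), int(match.group("sig"))))
    return deaths


# Sample lines copied from the log excerpt in the original post.
sample = [
    "2004-11-13 13:38:28 LOG:  could not send data to client: Broken pipe",
    "2004-11-13 13:42:15 LOG:  server process (PID 30272) was terminated by signal 9",
    "2004-11-13 13:42:16 LOG:  terminating any other active server processes",
]

print(find_signal_deaths(sample))
```

Signal 9 is SIGKILL, which a process cannot catch or ignore, so the server log only shows that the backend died, not who sent the signal. If the OOM killer was responsible, the kernel log is usually the place to look for confirmation.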


Re: Mysterious Death of postmaster (-9)

From
Alvaro Herrera
Date:
On Sat, Nov 13, 2004 at 02:39:38PM -0800, Gregory S. Williamson wrote:

Gregory,

> We had an oddness today with one of our postgres servers (Dell 2 CPU box
> running linux) and postgres 7.4. The server was under heavy load (50+ for a
> 1-minute spike; about 20 for the 15-minute average) with about 250 connections
> (we still don't understand the heavy load itself).
>
> Looking in the logs I see:
> 2004-11-13 13:30:28 LOG:  unexpected EOF on client connection
> 2004-11-13 13:30:40 LOG:  unexpected EOF on client connection
> 2004-11-13 13:38:28 LOG:  could not send data to client: Broken pipe
> 2004-11-13 13:42:15 LOG:  server process (PID 30272) was terminated by signal 9

This looks an awful lot like the Linux Out-Of-Memory killer got you.
This happens when the Linux kernel overcommits memory.  There is something
about this in the documentation, and it has been discussed here in the
past.  Please see the archives (www.pgsql.ru; look for "OOM killer" and
"linux overcommit").

Luckily it didn't get your postmaster, as has happened to other
people ...
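
For readers who want to check the overcommit setting mentioned above, here is a small hedged sketch in Python (an editorial illustration, not part of the original reply). It assumes a Linux system that exposes /proc/sys/vm/overcommit_memory and uses the mode meanings of current kernels; older kernels may differ. The helper names read_overcommit_mode and describe_overcommit are hypothetical.

```python
from pathlib import Path

# Meaning of vm.overcommit_memory values on current Linux kernels.
OVERCOMMIT_MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting (no overcommit beyond the commit limit)",
}


def read_overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Return the overcommit mode as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe_overcommit(mode):
    """Map an overcommit mode number to a short description."""
    return OVERCOMMIT_MODES.get(mode, f"unknown mode: {mode}")


mode = read_overcommit_mode()
if mode is None:
    print("overcommit setting not readable on this system")
else:
    print(f"vm.overcommit_memory = {mode}: {describe_overcommit(mode)}")
```

With mode 2 the kernel refuses allocations it cannot back instead of killing a process later, which is why the PostgreSQL documentation discusses that setting for database servers. Whether it suits a given machine depends on what else runs there.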

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"XML!" Exclaimed C++.  "What are you doing here? You're not a programming
language."
"Tell that to the people who use me," said XML.

Re: Mysterious Death of postmaster (-9)

From
"Gregory S. Williamson"
Date:
Thanks Alvaro and Stephan -- this may in fact be what happened, as the monitor showed that at about that time memory
definitely was taxed and showed oddnesses.

I'll read up on this -- thanks very much for the (promising) clue!

Greg W.

