Re: Database server restarting

From Nigel J. Andrews
Subject Re: Database server restarting
Date
Msg-id Pine.LNX.4.21.0305060832030.10245-100000@ponder.fairway2k.co.uk
In reply to Re: Database server restarting  ("shoaib" <shoaibm@vmoksha.com>)
Responses Re: Database server restarting  ("shoaib" <shoaibm@vmoksha.com>)
List pgsql-general
On Tue, 6 May 2003, shoaib wrote:

> There are some cron jobs running at the same time...
> One server does SSH into our application server, and one cron job is
> reading the DB and writing some data into flat files. But by the time
> this problem is happening these jobs are not writing any data. Last
> night when the server went down the other server was trying to do SSH and
> probably it was running some cron job and a heavy DB process was
> running. I cannot run top because I cannot log in to the server even from
> the console.

Do you mean you have no login privileges on the machine, or that you are only
trying to log in once you see a problem? If the former then I can't see how
there's any way you can make progress with this. If the latter, forget that;
it isn't helping, since by then you can't get a look at the running processes.
What you should do is log in _now_, run 'top' and leave it running. It may be
that when the problem occurs the session running top will freeze and so show
the information from that time. However, it may also be that it doesn't stop
and when you come into the office n hours later you find it merrily ticking
away showing you the current information. Therefore, investigate ways to log
the information in case you aren't sitting there when the problem occurs.
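
As a rough sketch of the logging idea, assuming Python is available on the
machine: the script below appends a timestamped snapshot of a command's output
to a file at a fixed interval, forcing each write to disk so the last snapshot
has a chance of surviving a hang. The function names, log path and interval
are made up for illustration, and it assumes your top supports batch mode
(-b) and an iteration count (-n).

```python
import os
import subprocess
import time
from datetime import datetime


def snapshot(command):
    """Run `command` once and return its output under a timestamp header."""
    result = subprocess.run(command, capture_output=True, text=True)
    stamp = datetime.now().isoformat(timespec="seconds")
    return f"==== {stamp} ====\n{result.stdout}\n"


def log_snapshots(path, command, interval_seconds, count):
    """Append `count` snapshots to `path`, one every `interval_seconds`."""
    for i in range(count):
        with open(path, "a") as log:
            log.write(snapshot(command))
            # Push the data to disk now; if the machine locks up a moment
            # later, buffered output would otherwise be lost.
            log.flush()
            os.fsync(log.fileno())
        if i < count - 1:
            time.sleep(interval_seconds)
```

Something like log_snapshots("top-snapshots.log", ["top", "-b", "-n", "1"],
60, 1440) would then record one snapshot a minute for a day; after the next
hang, the tail of the file shows what was running just beforehand.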

Also take a look at procinfo, it may be helpful as well.

One thing that might be a problem is the number of open file descriptors; you
could be running into the system limit on those. That sort of thing can
sometimes make a system unstable.
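
One way to keep an eye on that, sketched under the assumption that your kernel
exposes /proc/sys/fs/file-nr with three fields (allocated handles, allocated
but unused handles, and the system-wide maximum), is to read that file and
work out how close you are to the limit. The function names here are invented
for the example, and the meaning of the middle field varies between kernel
versions, so treat the result as a rough indicator.

```python
def parse_file_nr(text):
    """Split the contents of file-nr into three integers."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return allocated, unused, maximum


def descriptor_usage(text):
    """Return the fraction of the system-wide limit currently in use."""
    allocated, unused, maximum = parse_file_nr(text)
    return (allocated - unused) / maximum


def read_file_nr(path="/proc/sys/fs/file-nr"):
    """Read the raw counters; the path is the usual Linux location."""
    with open(path) as f:
        return f.read()
```

Logging descriptor_usage(read_file_nr()) alongside the process list would show
whether the figure creeps towards 1.0 before the machine stops responding.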

I'd still be interested to know whether the hardware has been tested
properly. Are there any known problems with RH 7.3's kernel and your particular
hardware, such as the RAID device?

One interesting thing you say, though: the same thing happens on a second
server. That to me suggests either a kernel/hardware problem, such as the
RAID, or a bug in your own software. Perhaps an endless loop? Perhaps
endlessly trying to obtain a file descriptor? A process with heavy CPU usage
shouldn't bring the machine down, but it can make it look very
unresponsive.


>
> Regards
> shaoib
>
>
> -----Original Message-----
> From: Martijn van Oosterhout [mailto:kleptog@svana.org]
> Sent: Tuesday, May 06, 2003 2:40 PM
> To: shoaib
> Cc: gearond@cvc.net; 'Nigel J. Andrews'; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Database server restarting
>
> On Tue, May 06, 2003 at 02:28:57PM +0800, shoaib wrote:
> > When I say hangs it means I am not even able to log in at the server
> > console either.
> > No ssh, no login from remote machines.
>
> Well, that's not postgresql's fault. It can't hang a machine like that. You
> should look elsewhere for the exact cause. I'm assuming here that consoles
> that are still logged in don't respond either? Maybe leave a top running to
> capture the list of processes just before it dies? Any cronjobs about the
> time it dies?
>
> What other processes run at about that time?
>

--
Nigel J. Andrews

