RE: Startup Failure (stop failure ?)

From: Neil Toronto
Subject: RE: Startup Failure (stop failure ?)
Msg-id: 14A4DCD7F3CED3118749009027DCBFE49D71E9@smtp.stsrvcs.com
List: pgsql-admin

That usually happens when a process is in "uninterruptible sleep."  ps shows
its status as D, and no signal will kill it, not even kill -9.
Uninterruptible sleep generally happens when the process is blocked inside
the kernel while a hardware driver waits on I/O from some hardware device.
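
For anyone who wants to check for that D state from a script rather than by
eyeballing ps output, here is a minimal sketch. It assumes a Linux system with
/proc mounted, and the function names (parse_state, process_state,
is_uninterruptible) are made up for illustration; they are not part of
Postgres or any standard tool.

```python
def parse_state(stat_text: str) -> str:
    """Return the one-letter process state from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST ')' rather than on spaces.
    after_comm = stat_text.rsplit(")", 1)[1]
    # The first field after the command name is the state letter.
    return after_comm.split()[0]


def process_state(pid: int) -> str:
    """Read the state of a live process (Linux only, needs /proc)."""
    with open(f"/proc/{pid}/stat") as f:
        return parse_state(f.read())


def is_uninterruptible(pid: int) -> bool:
    """True if the process is in uninterruptible sleep (state D)."""
    return process_state(pid) == "D"
```

Calling is_uninterruptible() with the postmaster's pid would tell you whether
kill has any chance of working; if it returns True, signals will not be
delivered until the pending I/O finishes.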

It's uninterruptible because the I/O is happening in kernel space - some
kernel function that Postgres called on to do something, like read from the
disk - and if the process were killed, the kernel function would try to
return to it anyway.  That's very bad, and it's the reason you can't kill
that process.  (I suppose you could circumvent that by unloading the kernel,
but I'm not sure that's such a good idea, either. ;) )

It's basically just luck after that.  If the pending I/O times out or
completes, you're okay.  If not, you'll have to reboot.  Your process never
gets killed (even by init), so the files it had open are never closed.
That's why the filesystem doesn't unmount cleanly.

Anyway, if that kind of thing happens, it's likely that 1) the kernel module
that controls the I/O device that never responded is broken, 2) the process
that uses the functions in the kernel module is using them in a very wrong
way, or 3) the hardware device is broken somehow.

It's happened to me twice - once with a broken ethernet card, and once when
I was trying XFree86 4.0.

As to the original question: if bind() fails, it's likely that the process
is still running - even if its status is D.  When a process is truly killed,
any port that it's bound to is automatically unbound.  If it _wasn't_
running and bind() failed, I'd say something got a bit screwy in your
kernel, and wonder what in the HECK you did to get it into that state.
(It's never the kernel's fault, right?  Even if it's 2.3?)
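
To tell whether anything is actually listening on the socket named in the
error message, one option is simply to try connecting to it. The sketch below
is only an illustration under a few assumptions: the function name
unix_socket_status is hypothetical, and the three labels it returns are my own
shorthand. Note that a Unix-domain socket node stays in the filesystem after
its owner exits, so a leftover node with no listener shows up here as "stale".

```python
import socket


def unix_socket_status(path: str) -> str:
    """Classify a Unix-domain socket path as 'missing', 'listening' or 'stale'."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(path)
    except FileNotFoundError:
        # No node at that path at all.
        return "missing"
    except ConnectionRefusedError:
        # The node exists but no process is accepting connections on it.
        return "stale"
    finally:
        s.close()
    # The connect succeeded, so some process still owns the socket.
    return "listening"
```

For the case in this thread you would call it as
unix_socket_status("/tmp/.s.PGSQL.5432").  "listening" suggests a postmaster
is still alive (possibly stuck in state D); "stale" suggests the node is
left over and can be removed as the error message says.  Other failures, such
as a permissions error, are deliberately left to raise.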

Hope that helps.
Neil

-----Original Message-----
From: Michael Holopainen [mailto:michael@laserle.fi]
Sent: Monday, July 10, 2000 2:46 AM
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Startup Failure (stop failure ?)


A stupid question:
did you check whether there was a postmaster process running?


I had this problem where I was unable to kill the postmaster process
(with 'kill -9 ...')
1. I stopped postgres with '/etc/rc.d/postgres stop', which reported
that it was done ok. Then when I tried to start it again I got
'postmaster already running', and with 'ps aux | grep postmaster' I found
out that it was still running.
2. I tried 'kill -9' -> still running.
3. Then I tried init 2 -> init 1 -> init S = still running.
4. The reboot reported that the postgres directory was still in use when it
unmounted it.

!!!??????

It has not reoccurred. It's been a week now.


Joel Lansden wrote:
>
> Greetings all -
>
> when I attempt to start my Postgresql-7.0.2 server I receive the following
> error:
>
> FATAL: StreamServerPort: bind() failed: Operation not permitted
>          Is another postmaster already running on that port?
>          If not, remove socket node (/tmp/.s.PGSQL.5432) and retry.
> /usr/local/pgsql/bin/postmaster: cannot create UNIX stream port
>
> I am running Mandrake Linux 7.0 with kernel 2.3.50 on a dual Pentium 200
> MMX system, 256Mb RAM.
>
> Thanks in advance for any help!
>
> Joel

--
   --"Would you fly on airplane controlled by MS Windows ?"--
--------------------------------------------------------------------
| Michael Holopainen | Valuraudantie 25 | Tel: +358-(0)9-35093825  |
|                    | 00700 Helsinki   | Fax : +358-(0)9-35093850 |
| Laserle Oy         | Finland          | email: michael@laserle.fi|
--------------------------------------------------------------------
