Thread: Start up error

Start up error

From
"Hussain Jawad-FXRM43"
Date:
Hello Sir,

I have installed postgresql DB on my Linux server.

When I am trying to stop and start the DB, below error is reported in
postgresql start up log file.

FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 25372) running in data directory
"/var/lib/pgsql/data"?
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 25372) running in data directory
"/var/lib/pgsql/data"?

I have deleted the file postmaster.pid in the directory
/var/lib/pgsql/data for number of times and restarted the postgresql
service, but I am still not able to restart the server and the same
error is repeating again.

Could you please advise on this, as the applications are completely
inaccessible due to this error?

Thanks & Regards,
-Jawad.

Re: Start up error

From
tomas@tuxteam.de
Date:
On Sat, Nov 17, 2007 at 08:19:37AM +0800, Hussain Jawad-FXRM43 wrote:
> Hello Sir,
>
> I have installed postgresql DB on my Linux server.
>
> When I am trying to stop and start the DB, below error is reported in
> postgresql start up log file.
>
> FATAL:  lock file "postmaster.pid" already exists
> HINT:  Is another postmaster (PID 25372) running in data directory
> "/var/lib/pgsql/data"?
> FATAL:  lock file "postmaster.pid" already exists
> HINT:  Is another postmaster (PID 25372) running in data directory
> "/var/lib/pgsql/data"?

This doesn't look like a PostgreSQL bug. I'd suggest posting such
questions to other mailing lists. Have a look at

  <http://www.postgresql.org/community/lists/>

especially pgsql-general or pgsql-novice.

> I have deleted the file postmaster.pid in the directory
> /var/lib/pgsql/data for number of times and restarted the postgresql
> service, but I am still not able to restart the server and the same
> error is repeating again.

From your mail I gather that this postmaster.pid "appears" every time
you start the service. Things to check:

  (1) are you really sure the service is not running?
    (try e.g. ps wwaux | grep postmaster).

  (2a) if the answer to (1) is "service is not running", that would
      mean that the postmaster process starts, creates the PID file
      and dies unexpectedly. In this case: have you had a look at
      the relevant log files (typically in /var/log/postgresql --
      but it depends on the distribution)?

  (2b) if the answer to (1) is "is running", but your clients are not
      able to connect, perhaps the server is listening just on a Unix
      domain socket while the clients are trying to contact it via an
      Internet socket.

HTH.

Regards
-- tomás
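
Checks (1) and (2b) above can be scripted. What follows is a minimal Python sketch, not part of the original advice. `postmaster_lines` filters text captured from `ps` for postmaster entries, and `tcp_port_open` probes a TCP port. The function names, the 127.0.0.1 address, and port 5432 are illustrative assumptions. The sample lines are copied from the `ps -ef` listing posted later in this thread.

```python
import socket

# Two lines copied from the `ps -ef | grep post` listing posted later
# in this thread (the wrapped postmaster line is rejoined here).
SAMPLE_PS = """\
root       730 30758  0 12:22 pts/2    00:00:00 grep post
postgres 15927     1  0 11:58 ?        00:00:05 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
"""


def postmaster_lines(ps_output):
    """Return the lines of `ps` output that name a postmaster process.

    Lines that belong to a `grep` command are skipped, so that
    `ps ... | grep postmaster` does not report itself.
    """
    hits = []
    for line in ps_output.splitlines():
        if "postmaster" in line and "grep" not in line:
            hits.append(line.strip())
    return hits


def tcp_port_open(host, port, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    running = postmaster_lines(SAMPLE_PS)
    print("postmaster entries found:", len(running))
    # Assumption: the server is expected on port 5432 of the local host.
    print("TCP 127.0.0.1:5432 reachable:", tcp_port_open("127.0.0.1", 5432))
```

If `postmaster_lines` finds an entry but `tcp_port_open` returns False, that would be consistent with case (2b): the server may be listening only on a Unix domain socket.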

Re: Start up error

From
Tom Lane
Date:
"Hussain Jawad-FXRM43" <FXRM43@motorola.com> writes:
> When I am trying to stop and start the DB, below error is reported in
> postgresql start up log file.

Exactly what are you doing to start and stop the server?  If you are
using a startup script, whose is it?

> I have deleted the file postmaster.pid in the directory
> /var/lib/pgsql/data for number of times and restarted the postgresql
> service, but I am still not able to restart the server and the same
> error is repeating again.

The best theory that comes to mind is that your start procedure is
somehow starting multiple copies of the postmaster.  The first one
starts OK and then the second (and third?) ones fail with the
lockfile complaint --- as well they should.

I'm a bit afraid that your manual removals of the lockfile (which is
A Bad Idea as a rule) have left you with multiple versions of the
postmaster running sans lockfile.  This would be very bad because you
can easily end up with a corrupted database.  Have you looked around
with "ps" to verify that there really aren't any postgres-owned
processes left over after a "failed" start?

In particular, you should absolutely not see any lockfile complaints
unless the PID mentioned in the message is a live process.  What is
PID 25372 and what is it doing?

            regards, tom lane
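
The question "is the PID in the message a live process?" can be checked mechanically. The sketch below is an illustration in Python, assuming a POSIX system; the function names are hypothetical. It pulls the PID out of the HINT line quoted in the first message and probes it with signal 0, which tests for existence without delivering a signal.

```python
import os
import re

# Log text copied from the first message of this thread.
STARTUP_LOG = '''\
FATAL:  lock file "postmaster.pid" already exists
HINT:  Is another postmaster (PID 25372) running in data directory
"/var/lib/pgsql/data"?
'''

HINT_PID = re.compile(r"Is another postmaster \(PID (\d+)\) running")


def pid_from_hint(log_text):
    """Return the PID named in the lock-file HINT, or None if absent."""
    match = HINT_PID.search(log_text)
    return int(match.group(1)) if match else None


def pid_is_alive(pid):
    """POSIX only: signal 0 checks that the process exists, nothing more."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False          # no such process
    except PermissionError:
        return True           # exists, but owned by another user
    return True


if __name__ == "__main__":
    pid = pid_from_hint(STARTUP_LOG)
    print("PID named in HINT:", pid)
    if pid is not None:
        print("live process:", pid_is_alive(pid))
```

A False result for the PID named in the HINT would mean the complaint refers to a process that no longer exists, which is the situation Tom says should not occur.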

Re: Start up error

From
"Hussain Jawad-FXRM43"
Date:
Hi Tom,

Thank you for your time.

Below are the answers to your queries.

1) I am using /etc/rc.d/init.d/postgresql stop/start to stop and start
the postgresql service.

2) There are several postgresql processes running after a failed
restart. Below are the processes:

  ps -ef|grep post
root       730 30758  0 12:22 pts/2    00:00:00 grep post
postgres  2700 15927 73 12:00 ?        00:16:15 postgres: cscti csctools [local] DELETE
postgres  6706 15927 72 12:00 ?        00:15:42 postgres: cscti csctools [local] DELETE
postgres 14351 15927 69 12:02 ?        00:13:38 postgres: cscti csctools [local] DELETE
postgres 15927     1  0 11:58 ?        00:00:05 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres 15936 15927  0 11:58 ?        00:00:00 postgres: logger process
postgres 15967 15927  0 11:58 ?        00:00:00 postgres: writer process
postgres 15968 15927  0 11:58 ?        00:00:00 postgres: stats buffer process
postgres 15969 15968  0 11:58 ?        00:00:00 postgres: stats collector process
postgres 16118 15927 73 11:58 ?        00:17:23 postgres: cscti csctools [local] DELETE
postgres 23600 15927 65 12:08 ?        00:09:25 postgres: cscti csctools [local] DELETE

3) There is no process running with PID 25372.

I have been asked to reboot the server to fix this. What precautions
should be taken before the reboot so that the postgresql DB works
afterwards?

Will a reboot really fix the issue? Could you please advise.


Thanks & Regards,

-Jawad.

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Saturday, November 17, 2007 9:48 PM
To: Hussain Jawad-FXRM43
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] Start up error

"Hussain Jawad-FXRM43" <FXRM43@motorola.com> writes:
> When I am trying to stop and start the DB, below error is reported in
> postgresql start up log file.

Exactly what are you doing to start and stop the server?  If you are
using a startup script, whose is it?

> I have deleted the file postmaster.pid in the directory
> /var/lib/pgsql/data for number of times and restarted the postgresql
> service, but I am still not able to restart the server and the same
> error is repeating again.

The best theory that comes to mind is that your start procedure is
somehow starting multiple copies of the postmaster.  The first one
starts OK and then the second (and third?) ones fail with the lockfile
complaint --- as well they should.

I'm a bit afraid that your manual removals of the lockfile (which is A
Bad Idea as a rule) have left you with multiple versions of the
postmaster running sans lockfile.  This would be very bad because you
can easily end up with a corrupted database.  Have you looked around
with "ps" to verify that there really aren't any postgres-owned
processes left over after a "failed" start?

In particular, you should absolutely not see any lockfile complaints
unless the PID mentioned in the message is a live process.  What is PID
25372 and what is it doing?

            regards, tom lane

Re: Start up error

From
"Hussain Jawad-FXRM43"
Date:
Below is the status of postgresql after restart

/etc/rc.d/init.d/postgresql status
postmaster (pid 25114 23600 16118 15969 15968 15967 15936 15927 14351
6706 2700) is running..

Thanks&Regards,

-Jawad

-----Original Message-----
From: Hussain Jawad-FXRM43
Sent: Tuesday, November 20, 2007 1:04 AM
To: 'Tom Lane'
Cc: pgsql-bugs@postgresql.org
Subject: RE: [BUGS] Start up error

Hi Tom,

Thank you for your time.

Below are the answers to your queries.

1) I am using /etc/rc.d/init.d/postgresql stop/start to stop and start
the postgresql service.

2) There are several postgresql processes running after a failed
restart. Below are the processes:

  ps -ef|grep post
root       730 30758  0 12:22 pts/2    00:00:00 grep post
postgres  2700 15927 73 12:00 ?        00:16:15 postgres: cscti csctools [local] DELETE
postgres  6706 15927 72 12:00 ?        00:15:42 postgres: cscti csctools [local] DELETE
postgres 14351 15927 69 12:02 ?        00:13:38 postgres: cscti csctools [local] DELETE
postgres 15927     1  0 11:58 ?        00:00:05 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres 15936 15927  0 11:58 ?        00:00:00 postgres: logger process
postgres 15967 15927  0 11:58 ?        00:00:00 postgres: writer process
postgres 15968 15927  0 11:58 ?        00:00:00 postgres: stats buffer process
postgres 15969 15968  0 11:58 ?        00:00:00 postgres: stats collector process
postgres 16118 15927 73 11:58 ?        00:17:23 postgres: cscti csctools [local] DELETE
postgres 23600 15927 65 12:08 ?        00:09:25 postgres: cscti csctools [local] DELETE

3) There is no process running with PID 25372.

I have been asked to reboot the server to fix this. What precautions
should be taken before the reboot so that the postgresql DB works
afterwards?

Will a reboot really fix the issue? Could you please advise.


Thanks & Regards,

-Jawad.

Re: Start up error

From
"Hussain Jawad-FXRM43"
Date:
Hi Tom,

Further to my previous mail, below is the postgresql startup status.

/etc/rc.d/init.d/postgresql status
postmaster (pid 25114 23600 16118 15969 15968 15967 15936 15927 14351
6706 2700) is running...


Thanks&Regards,
-Jawad

Re: Start up error

From
Tom Lane
Date:
"Hussain Jawad-FXRM43" <FXRM43@motorola.com> writes:
> 2) There are several postgresql processes running after a failed
> restart. Below are the processes:

>   ps -ef|grep post
> root       730 30758  0 12:22 pts/2    00:00:00 grep post
> postgres  2700 15927 73 12:00 ?        00:16:15 postgres: cscti csctools [local] DELETE
> postgres  6706 15927 72 12:00 ?        00:15:42 postgres: cscti csctools [local] DELETE
> postgres 14351 15927 69 12:02 ?        00:13:38 postgres: cscti csctools [local] DELETE
> postgres 15927     1  0 11:58 ?        00:00:05 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
> postgres 15936 15927  0 11:58 ?        00:00:00 postgres: logger process
> postgres 15967 15927  0 11:58 ?        00:00:00 postgres: writer process
> postgres 15968 15927  0 11:58 ?        00:00:00 postgres: stats buffer process
> postgres 15969 15968  0 11:58 ?        00:00:00 postgres: stats collector process
> postgres 16118 15927 73 11:58 ?        00:17:23 postgres: cscti csctools [local] DELETE
> postgres 23600 15927 65 12:08 ?        00:09:25 postgres: cscti csctools [local] DELETE

That hardly looks like a "failed restart".  I can't help wondering if
you were looking at the wrong log file.

            regards, tom lane
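
One way to read the quoted listing is to rebuild the parent/child relationships from the PID and PPID columns. The Python sketch below is an illustration only, with hypothetical function names, and assumes the `ps -ef` column layout shown in the thread (UID, PID, PPID, C, STIME, TTY, TIME, CMD). It counts the postmaster processes and checks which top-level ancestor each postgres-owned process traces back to.

```python
# `ps -ef` lines copied from the thread, with wrapped lines rejoined.
PS_EF = """\
root       730 30758  0 12:22 pts/2    00:00:00 grep post
postgres  2700 15927 73 12:00 ?        00:16:15 postgres: cscti csctools [local] DELETE
postgres  6706 15927 72 12:00 ?        00:15:42 postgres: cscti csctools [local] DELETE
postgres 14351 15927 69 12:02 ?        00:13:38 postgres: cscti csctools [local] DELETE
postgres 15927     1  0 11:58 ?        00:00:05 /usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data
postgres 15936 15927  0 11:58 ?        00:00:00 postgres: logger process
postgres 15967 15927  0 11:58 ?        00:00:00 postgres: writer process
postgres 15968 15927  0 11:58 ?        00:00:00 postgres: stats buffer process
postgres 15969 15968  0 11:58 ?        00:00:00 postgres: stats collector process
postgres 16118 15927 73 11:58 ?        00:17:23 postgres: cscti csctools [local] DELETE
postgres 23600 15927 65 12:08 ?        00:09:25 postgres: cscti csctools [local] DELETE
"""


def parse_ps_ef(text, user="postgres"):
    """Map PID -> (PPID, command) for the lines owned by `user`."""
    procs = {}
    for line in text.splitlines():
        fields = line.split(None, 7)   # the 8th field is the full command
        if len(fields) < 8 or fields[0] != user:
            continue
        procs[int(fields[1])] = (int(fields[2]), fields[7])
    return procs


def postmasters(procs):
    """PIDs whose executable name contains 'postmaster'."""
    return [pid for pid, (_, cmd) in procs.items()
            if "postmaster" in cmd.split()[0]]


def top_ancestor(pid, procs):
    """Follow PPID links until the parent is outside the listing."""
    while procs[pid][0] in procs:
        pid = procs[pid][0]
    return pid


if __name__ == "__main__":
    procs = parse_ps_ef(PS_EF)
    print("postmaster PIDs:", postmasters(procs))
    print("top-level ancestors:",
          sorted({top_ancestor(pid, procs) for pid in procs}))
```

For the listing in this thread the sketch finds a single postmaster, PID 15927, and every other postgres-owned process traces back to it. That is consistent with one running server rather than several competing postmasters.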