Thread: it refuses to go down...

it refuses to go down...

From
"Tena Sakai"
Date:

Hi Everybody,

I ran out of space over the weekend: /usr/local/ filled up.
I got a new partition and am about halfway done copying the
data from the filled-up partition.  BTW, I am using postgres
8.3.0 on Dell hardware / Red Hat Advanced Server.

I did:
  $ pg_ctl status
  pg_ctl: server is running (PID: 11148)
  /usr/local/pgsql/bin/postgres

and proceeded with:

  $ pg_ctl stop -m fast
  waiting for server to shut down............................................................... failed
  pg_ctl: server does not shut down

My plan was to shut down, make sure the env variables are
set correctly, and restart.  But it refuses to go down, as
shown above.  I then tried to use psql:
  $ psql canon

and it tells me:
  psql: FATAL:  the database system is shutting down

Can I kill the process 11148 and do as I planned accordingly?

I would appreciate any pointers.

Regards,

Tena Sakai
tsakai@gallo.ucsf.edu

Re: it refuses to go down...

From
Tom Lane
Date:
"Tena Sakai" <tsakai@gallo.ucsf.edu> writes:
>   $ pg_ctl stop -m fast
>   waiting for server to shut down............................................................... failed
>   pg_ctl: server does not shut down

Have you looked into the postmaster log to see *why* it's not shutting
down?

> Can I kill the process 11148 and do as I planned accordingly?

It's not exactly recommended when you don't know what the problem is.

Killing only the postmaster and not its child processes is especially
not recommended --- there are some defenses in place against that,
but nothing that can't be broken by a sufficiently determined DBA.

            regards, tom lane

Re: it refuses to go down...

From
salman
Date:
That's what she said.

Someone had to do it.

That's what she said, again.

Sorry.

-salman

Re: it refuses to go down...

From
"Scott Marlowe"
Date:
On Mon, Mar 24, 2008 at 3:48 PM, Tena Sakai <tsakai@gallo.ucsf.edu> wrote:
>
> Hi Everybody,
>
>  I ran out of space over the weekend.  /usr/local/ filled up.
>  I got a new partition and about 1/2 done with copying the
>  data from the filled-up partition.  BTW, I am using postgres
>  8.3.0 on dell hardware/ redhat advanced server.
>
>  I did:
>    $ pg_ctl status
>    pg_ctl: server is running (PID: 11148)
>    /usr/local/pgsql/bin/postgres
>
>  and proceeded with:
>
>    $ pg_ctl stop -m fast
>    waiting for server to shut down............................................................... failed
>    pg_ctl: server does not shut down


Did you try pg_ctl -m immediate stop?

Re: it refuses to go down...

From
"Tena Sakai"
Date:

Hi Tom,

> Have you looked into the postmaster log to see
> *why* it's not shutting down?

I did tail on the log and got...

[2008-03-24 14:31:39.331 PDT] < 11148  2008-02-26 18:15:00 PST >LOG:  received smart shutdown request
[2008-03-24 14:31:39.331 PDT] < 29578  2008-03-24 06:38:33 PDT >LOG:  autovacuum launcher shutting down
[2008-03-24 14:32:58.740 PDT] < 11148  2008-02-26 18:15:00 PST >LOG:  received fast shutdown request
[2008-03-24 14:46:03.123 PDT] <postgres 8619 127.0.0.1 2008-03-24 14:46:03 PDT /usr/local/pgsql/bin/postgres>FATAL:  the database system is shutting down
[2008-03-24 15:52:43.520 PDT] <michaelstevens 11367 [local] 2008-03-24 15:52:43 PDT /usr/local/pgsql/bin/postgres>FATAL:  the database system is shutting down
[2008-03-24 15:52:43.563 PDT] <michaelstevens 11368 [local] 2008-03-24 15:52:43 PDT /usr/local/pgsql/bin/postgres>FATAL:  the database system is shutting down
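
A minimal sketch for narrowing a log like the one above to just the shutdown sequence. The function name `shutdown_events` is hypothetical, and the patterns are based only on the message texts quoted in this thread, so other server versions may word them differently.

```shell
#!/bin/sh
# Print only shutdown-related entries from a postmaster log read on stdin.
# Patterns are modeled on the log lines quoted in this thread:
#   "received smart shutdown request", "received fast shutdown request",
#   "... shutting down"
shutdown_events() {
    grep -E 'received [a-z]+ shutdown request|shutting down'
}
```

Usage would be along the lines of `shutdown_events < postmaster.log`, where the log file name is an assumption; use whatever file your server actually logs to.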

I am about to try:
pg_ctl stop -m immediate

Regards,

Tena Sakai


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Mon 3/24/2008 3:11 PM
To: Tena Sakai
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] it refuses to go down...

Re: it refuses to go down...

From
"Tena Sakai"
Date:

Hi Scott,

> did you try pg_ctl -m immediate stop   ???

I just did, and it worked.

  $ pg_ctl stop -m immediate
  waiting for server to shut down.... done
  server stopped

Many thanks.

Regards,

Tena Sakai
tsakai@gallo.ucsf.edu


-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: Mon 3/24/2008 3:37 PM
To: Tena Sakai
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] it refuses to go down...


Re: it refuses to go down...

From
"Peter Koczan"
Date:
>  > did you try pg_ctl -m immediate stop   ???
>
>  I just did, and it worked.
>
>    $ pg_ctl stop -m immediate
>    waiting for server to shut down.... done
>    server stopped

I'd be careful about shutting down using "immediate" mode. It aborts
the server processes without a clean shutdown, so the database has to
go through crash recovery on the next start.
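
One way to keep "immediate" as a last resort is to try "fast" first and escalate only if it times out. The following is a sketch under stated assumptions, not a recommended procedure: `stop_escalating`, `PG_CTL` and `STOP_TIMEOUT` are names made up for this example, and it assumes `pg_ctl stop` returns a non-zero status when the server has not gone down within the `-t` timeout. Checking the postmaster log before escalating is still the safer habit.

```shell
#!/bin/sh
# Try progressively more forceful shutdown modes, stopping at the first
# one that succeeds. PG_CTL and STOP_TIMEOUT are hypothetical knobs.
PG_CTL=${PG_CTL:-pg_ctl}
STOP_TIMEOUT=${STOP_TIMEOUT:-60}

stop_escalating() {
    for mode in fast immediate; do
        echo "trying: $PG_CTL stop -m $mode" >&2
        # pg_ctl waits for shutdown; -t limits that wait in seconds.
        if "$PG_CTL" stop -m "$mode" -t "$STOP_TIMEOUT"; then
            echo "$mode"    # report which mode finally worked
            return 0
        fi
    done
    return 1
}
```

Calling `stop_escalating` prints the mode that worked on stdout, so a wrapper script can log whether crash recovery should be expected on the next start.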

Your problem could be that one or two connections are in a weird state
and even "fast" stopping can't kill them. Next time you have to
restart the server, you should check on the status of connections and
see if any are in a weird state. I had to deal with this recently
where the status was "notify interrupt" and I couldn't even stop fast.
I had to change some application code, but not much.

Just run "ps ax | grep post" (or whatever options you like to give ps
to show all processes) to pick out the postgres processes. A connection
entry will look like this:

[pid] ?        Ss      0:00 postgres: [user] [database] [client] [status]
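
As a rough illustration of reading that last field, here is a small filter that prints the PID and status of each client connection. It is only a sketch: `backend_states` is a made-up name, and it assumes the `ps ax` column layout and process-title format shown above, where the client field is either `host(port)` or `[local]`.

```shell
#!/bin/sh
# Read `ps ax` output on stdin and print "PID<TAB>status" for each client
# backend. Assumes the layout: PID TTY STAT TIME postgres: user db client status
backend_states() {
    awk '$5 == "postgres:" && ($8 ~ /[(]/ || $8 == "[local]") {
        # Everything after the client field is the status; it may contain spaces.
        status = $9
        for (i = 10; i <= NF; i++) status = status " " $i
        print $1 "\t" status
    }'
}
```

Typical use would be `ps ax | backend_states`. Background processes such as the writer or stats collector are skipped because they have no client field.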

Peter

Re: it refuses to go down...

From
"Tena Sakai"
Date:

Hello Peter,

Thanks for your post.  I appreciate your concern.

> I had to deal with this recently where the
> status was "notify interrupt"

Is this a response from the "pg_ctl status" command?

At the moment, I see nothing alarming via ps:

  $ ps ax | grep post
  11807 ?        Ss     0:00 sshd: postgres [priv]
  11809 ?        S      0:00 sshd: postgres@pts/5
  12241 ?        S      0:00 /pgsql/pgsql/bin/postgres
  12243 ?        Ss     2:33 postgres: writer process  
  12244 ?        Ss     0:00 postgres: wal writer process  
  12245 ?        Ss     0:00 postgres: autovacuum launcher process  
  12246 ?        Ss     0:50 postgres: stats collector process  
  13814 ?        SNl  426:30 java -Xms128m -Xmx4600m -jar gadb.jar -loadonly jdbc:db2://172.16.1.109:50001/sci2p mitch winkwink jdbc:postgresql://vixen/canon?prepareThreshold=1 gadb northpole 10
  13828 ?        Ss   234:43 postgres: gadb canon 127.0.0.1(44824) idle in transaction
  13830 ?        Ss   235:51 postgres: gadb canon 127.0.0.1(44826) idle in transaction
  13832 ?        Ss     0:16 postgres: gadb canon 127.0.0.1(44828) idle in transaction
  13833 ?        Ss   235:03 postgres: gadb canon 127.0.0.1(44829) idle in transaction
  13835 ?        Ss   234:54 postgres: gadb canon 127.0.0.1(44831) idle
  13837 ?        Ss   235:48 postgres: gadb canon 127.0.0.1(44833) idle
  13839 ?        Ss   235:12 postgres: gadb canon 127.0.0.1(44835) idle
  13841 ?        Rs   234:32 postgres: gadb canon 127.0.0.1(44837) SELECT
  13843 ?        Ss   234:03 postgres: gadb canon 127.0.0.1(44839) idle in transaction
  13845 ?        Rs   234:20 postgres: gadb canon 127.0.0.1(44841) SELECT
  13847 ?        Ss   235:29 postgres: gadb canon 127.0.0.1(44843) idle in transaction
  14526 ?        Ss    63:15 postgres: ysu canon 127.0.0.1(44853) idle
  16563 ?        Ss     2:06 postgres: autovacuum worker process   canon
  21201 pts/8    S+     0:00 grep post

Regards,

Tena Sakai
tsakai@gallo.ucsf.edu


-----Original Message-----
From: Peter Koczan [mailto:pjkoczan@gmail.com]
Sent: Tue 3/25/2008 10:59 AM
To: Tena Sakai
Cc: Scott Marlowe; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] it refuses to go down...


Re: it refuses to go down...

From
"Peter Koczan"
Date:
>  > I had to deal with this recently where the
>  > status was "notify interrupt"
>
>  Is this a response from "pg_ctl status" command?

I'm referring to the last field of "ps ax". For instance, this line...

>    13841 ?        Rs   234:32 postgres: gadb canon 127.0.0.1(44837) SELECT

would have "notify interrupt" instead of "SELECT". That's the status
I'm referring to.

Peter

Re: it refuses to go down...

From
"Tena Sakai"
Date:

Many thanks, Peter.

Regards,

Tena Sakai
tsakai@gallo.ucsf.edu



-----Original Message-----
From: Peter Koczan [mailto:pjkoczan@gmail.com]
Sent: Thu 3/27/2008 2:59 PM
To: Tena Sakai
Cc: Scott Marlowe; pgsql-admin@postgresql.org
Subject: Re: [ADMIN] it refuses to go down...
