Discussion: Re: sick DB - ??
As a followup - the line from top:

1641 postgres 105 0 2684K 1384K CPU1 0 8:26 99.02% 99.02% postgres

As you can see, it's barely taking up any RAM - the process is going nuts
right off the bat..

On Wed, 18 Jul 2001, Pete Leonard wrote:

>
> Postgres 7.1.2, FreeBSD 3.4
>
> Box got sick, had to bounce it. Postgres wasn't brought down in a
> graceful fashion..
>
> restart didn't bring the DB back properly, so as the postgres user, did
> the following:
>
> /usr/local/pgsql/bin/postmaster -d5 start
>
> it dumps the initial environment variables, and then returns nothing. CPU
> is pegged at 100%. No reporting, no information as to what's happening.
>
> Solutions? Is the DB corrupted badly? Where do I go from here?
>
> thanks,
>
> --pete
>
Followup ^2 -
The reason this happened was that for whatever reason (we're still
investigating), /tmp was writeable only by root.
I only noticed this when using initdb to create a new data directory.
postmaster offered no suggestion that there was a problem here, even when
running at -d5.
chmod 777 /tmp fixed everything.
My best guess (I don't know how postmaster operates internally, and I
didn't run any system-level diagnostic tools to check) is that when
postmaster fails to open a pipe or temp file, it changes the filename and
tries again ad infinitum rather than checking the error properly. Perhaps
printing an error code (especially at debug level 5) would help?
thanks,
--pete
On Wed, 18 Jul 2001, Pete Leonard wrote:
>
> As a followup - the line from top:
>
> 1641 postgres 105 0 2684K 1384K CPU1 0 8:26 99.02% 99.02%
> postgres
>
> As you can see, it's barely taking up any RAM - the process is going nuts
> right off the bat..
>
> On Wed, 18 Jul 2001, Pete Leonard wrote:
>
> >
> > Postgres 7.1.2, FreeBSD 3.4
> >
> > Box got sick, had to bounce it. Postgres wasn't brought down in a
> > graceful fashion..
> >
> > restart didn't bring the DB back properly, so as the postgres user, did
> > the following:
> >
> > /usr/local/pgsql/bin/postmaster -d5 start
> >
> > it dumps the initial environment variables, and then returns nothing. CPU
> > is pegged at 100%. No reporting, no information as to what's happening.
> >
> > Solutions? Is the DB corrupted badly? Where do I go from here?
> >
> > thanks,
> >
> > --pete
Pete Leonard <pete@hero.com> writes:
>> restart didn't bring the DB back properly, so as the postgres user, did
>> the following:
>> /usr/local/pgsql/bin/postmaster -d5 start
>> it dumps the initial environment variables, and then returns nothing. CPU
>> is pegged at 100%. No reporting, no information as to what's happening.
This is kind of a random guess, but we recently noticed that 7.1 has a
bug whereby the postmaster can go into an infinite loop at startup if
the $PGDATA directory is not writable. Check permissions. It might
also be a good idea to remove the old postmaster.pid file by hand.
regards, tom lane
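
The two checks suggested above (directory permissions and a leftover postmaster.pid) could be scripted roughly as follows. This is only a sketch: the helper names are made up for illustration, and it assumes that the first line of postmaster.pid holds the postmaster's PID. The PID file should only be removed by hand once it is certain that no postmaster is still running.

```python
import os


def is_writable_dir(path):
    """Return True if `path` is a directory the current user can create files in."""
    return os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)


def pid_is_running(pid):
    """Best-effort liveness check using signal 0 (POSIX only)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def stale_pid_file(pid_path):
    """Return True if `pid_path` exists but does not name a live process.

    Assumption: the first line of the file is the owning PID.
    """
    try:
        with open(pid_path) as f:
            first_line = f.readline().strip()
    except FileNotFoundError:
        return False  # no PID file at all, so nothing is stale
    try:
        pid = int(first_line)
    except ValueError:
        return True  # unparsable contents: treat as stale
    if pid <= 0:
        return True  # not a usable PID for this simple check
    return not pid_is_running(pid)


if __name__ == "__main__":
    pgdata = os.environ.get("PGDATA", "")
    for directory in (pgdata, "/tmp"):
        print(directory or "(PGDATA not set)", "writable:", is_writable_dir(directory))
    if pgdata:
        pid_file = os.path.join(pgdata, "postmaster.pid")
        print(pid_file, "stale:", stale_pid_file(pid_file))
```

Note that `os.access` reports every directory as writable when run as root, so the check is only meaningful when run as the postgres user.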
On Wed, Jul 18, 2001 at 09:36:38AM -0700, Pete Leonard wrote:
> chmod 777 /tmp fixed everything.
That should be 1777.
mrc
--
Mike Castle dalgoda@ix.netcom.com www.netcom.com/~dalgoda/
We are all of us living in the shadow of Manhattan. -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc
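
For context on the 777 versus 1777 point: the leading 1 is the sticky bit, which on a world-writable directory stops users from deleting or renaming files they do not own. A small sketch for inspecting a directory's mode is below; `describe_mode` is a hypothetical helper name, not part of any tool mentioned in this thread.

```python
import os
import stat

EXPECTED_TMP_MODE = 0o1777  # rwx for everyone, plus the sticky bit


def describe_mode(path):
    """Return the permission bits of `path` and two flags derived from them."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return {
        "octal": format(mode, "04o"),
        "world_writable": bool(mode & stat.S_IWOTH),
        "sticky": bool(mode & stat.S_ISVTX),
    }


if __name__ == "__main__":
    info = describe_mode("/tmp")
    print("/tmp mode:", info["octal"])
    if info["world_writable"] and not info["sticky"]:
        print("world-writable without the sticky bit; expected",
              format(EXPECTED_TMP_MODE, "04o"))
```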
Pete Leonard <pete@hero.com> writes:
> The reason this happened was that for whatever reason (we're still
> investigating), /tmp was writeable only by root.
Ah. Hadn't thought about it before, but the infinite-loop-on-
nonwritable-$PGDATA bug would also trigger for nonwritable /tmp.
(The bug was actually in CreateLockFile, which is used both to
create a lockfile in $PGDATA and one in /tmp. Sigh.)
This is fixed in current sources. If we were going to do a 7.1.3
then I'd backpatch the fix into the REL7_1 branch, but at this point
I suspect there won't be a 7.1.3 --- we'll probably go into 7.2 beta
in another five or six weeks, so there's not much point.
regards, tom lane
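
To illustrate the class of bug discussed above, here is a minimal sketch of a lock-file creation loop that cannot spin forever. It is not PostgreSQL's CreateLockFile and makes no claim about how the actual fix was done; the function name, the `max_tries` parameter, and the error messages are all invented for illustration. The idea is to retry only in the one case where retrying can help (the existing file vanished between two calls) and to report every other failure, such as a non-writable directory, right away.

```python
import os


def create_lock_file(path, max_tries=100):
    """Create `path` exclusively and write our PID into it.

    Raises RuntimeError on any failure that retrying cannot fix.
    """
    for _ in range(max_tries):
        try:
            fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            # A lock file is already there: look at who claims to own it.
            try:
                with open(path) as f:
                    owner = f.readline().strip()
            except FileNotFoundError:
                continue  # it vanished between the two calls, so retry
            raise RuntimeError(
                f"lock file {path} already exists (owner: {owner!r})")
        except OSError as e:
            # Permission denied, missing directory, read-only filesystem, ...
            # Retrying will not help, so report the error instead of looping.
            raise RuntimeError(
                f"cannot create lock file {path}: {os.strerror(e.errno)}") from e
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")
        return path
    raise RuntimeError(f"gave up creating {path} after {max_tries} tries")
```

The bounded `for` loop is a second line of defence: even if an unexpected case slips through the error handling, the caller gets an error message rather than a process pegged at 100% CPU.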