Discussion: stuck spinlock

stuck spinlock

From
Jeff Waugh
Date:
I have a database that was busy when its server was rebooted.

Following is the output of postmaster -d4. It runs for about 15 minutes
and then core dumps.

Any advice how I can get this database started again?

It is 7.1.3 on FreeBSD 4.2.

Thanks,
-Jeff


bin/postmaster: PostmasterMain: initial environ dump:
-----------------------------------------
    USER=postgres
    MAIL=/var/mail/postgres
    HOME=/usr/local/pgsql
    TERM=xterm
    BLOCKSIZE=K
    PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/games:/usr/local/sbin:/usr/local/bin:/usr/X11R6/bin:/usr/local/pgsql/bin
    SHELL=/bin/sh
    FTP_PASSIVE_MODE=YES
-----------------------------------------
invoking IpcMemoryCreate(size=9232384)
FindExec: found "/usr/local/pgsql/bin/postmaster" using argv[0]
DEBUG:  database system shutdown was interrupted at 2001-11-22 04:24:53 EST
DEBUG:  CheckPoint record at (13, 2652568836)
DEBUG:  Redo record at (13, 2652568836); Undo record at (0, 0); Shutdown TRUE
DEBUG:  NextTransactionId: 56354001; NextOid: 205527560
DEBUG:  database system was not properly shut down; automatic recovery in progress...
DEBUG:  ReadRecord: record with zero len at (13, 2652568900)
DEBUG:  redo is not required

FATAL: s_lock(0x30059219) at bufmgr.c:2048, stuck spinlock. Aborting.

FATAL: s_lock(0x30059219) at bufmgr.c:2048, stuck spinlock. Aborting.
bin/postmaster: reaping dead processes...
bin/postmaster: Startup proc 3436 exited with status 134 - abort


Re: stuck spinlock

From
Tom Lane
Date:
Jeff Waugh <jaw@ic.net> writes:
> I have a database that was busy when its server was rebooted.
> Following is the output of postmaster -d4. It runs for about 15 minutes
> and then core dumps.

Do you want to let someone get in there and try to debug it?  I suspect
you are seeing some sort of bug in the WAL recovery code, but there's no
way to find and fix it with only this much info.  Somebody would need to
go in with a debugger.

> Any advice how I can get this database started again?

If "getting up pronto" is more important than anything else, you could
try resetting the xlog (see contrib/pg_resetxlog), and then manually
scratching around to look for consistency problems.  But please, before
you do that, take a backup (e.g. a tarfile dump) of your entire $PGDATA
tree, so that we have the evidence available for debugging
investigations.

            regards, tom lane
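
The backup step suggested above (a tarfile dump of the entire $PGDATA tree, taken before pg_resetxlog is tried) could be sketched roughly as follows. This is only an illustration, not something from the thread: the function name backup_pgdata and the archive name pgdata-backup.tar.gz are made up for the example, and it assumes the postmaster is not running while the copy is made.

```python
import os
import tarfile


def backup_pgdata(pgdata: str, archive_path: str) -> int:
    """Archive the whole data directory; return the number of entries stored.

    Hypothetical helper for illustration; not part of PostgreSQL.
    """
    pgdata = os.path.abspath(pgdata)
    if not os.path.isdir(pgdata):
        raise NotADirectoryError(pgdata)

    # Refuse to write the archive inside the tree that is being archived.
    archive_abs = os.path.abspath(archive_path)
    if os.path.commonpath([pgdata, archive_abs]) == pgdata:
        raise ValueError("archive_path must be outside the data directory")

    # Store everything under a single top-level directory name.
    with tarfile.open(archive_abs, "w:gz") as tar:
        tar.add(pgdata, arcname=os.path.basename(pgdata))

    # Re-open the archive to confirm it is readable and count its entries.
    with tarfile.open(archive_abs, "r:gz") as tar:
        return len(tar.getnames())


if __name__ == "__main__":
    # Assumes PGDATA is set and the current directory is outside of it.
    count = backup_pgdata(os.environ["PGDATA"], "pgdata-backup.tar.gz")
    print(f"archived {count} entries")
```

Keeping the archive outside the data directory means the preserved copy stays untouched by whatever pg_resetxlog and later repair attempts do to the live tree.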