Discussion: 9.1.1 crash

9.1.1 crash

From
Mike Blackwell
Date:
The following are the relevant log entries from a recent crash of v9.1.1 running on an older RHEL Linux box.  This is the first crash we've experienced in a lot of years of running Pg.  Any assistance in how to determine what might have caused this is welcome.

--

2012-02-10 13:55:59 CST [15949]: [37-1] @ LOG:  00000: server process (PID 32670) was terminated by signal 11: Segmentation fault
2012-02-10 13:55:59 CST [15949]: [38-1] @ LOCATION:  LogChildExit, postmaster.c:2881
2012-02-10 13:55:59 CST [15949]: [39-1] @ LOG:  00000: terminating any other active server processes
2012-02-10 13:55:59 CST [15949]: [40-1] @ LOCATION:  HandleChildCrash, postmaster.c:2695
2012-02-10 13:55:59 CST [15949]: [41-1] @ LOG:  00000: all server processes terminated; reinitializing
2012-02-10 13:55:59 CST [15949]: [42-1] @ LOCATION:  PostmasterStateMachine, postmaster.c:3116
2012-02-10 13:56:00 CST [3303]: [1-1] @ LOG:  00000: database system was interrupted; last known up at 2012-02-10 13:54:18 CST
2012-02-10 13:56:00 CST [3303]: [2-1] @ LOCATION:  StartupXLOG, xlog.c:6046
2012-02-10 13:56:00 CST [3303]: [3-1] @ LOG:  00000: database system was not properly shut down; automatic recovery in progress
2012-02-10 13:56:00 CST [3303]: [4-1] @ LOCATION:  StartupXLOG, xlog.c:6299
2012-02-10 13:56:00 CST [3303]: [5-1] @ LOG:  00000: consistent recovery state reached at F/FC9C7588
2012-02-10 13:56:00 CST [3303]: [6-1] @ LOCATION:  CheckRecoveryConsistency, xlog.c:6958
2012-02-10 13:56:00 CST [3303]: [7-1] @ LOG:  00000: redo starts at F/FC9A5BA8
2012-02-10 13:56:00 CST [3303]: [8-1] @ LOCATION:  StartupXLOG, xlog.c:6506
2012-02-10 13:56:00 CST [3303]: [9-1] @ LOG:  00000: record with zero length at F/FCC716F0
2012-02-10 13:56:00 CST [3303]: [10-1] @ LOCATION:  ReadRecord, xlog.c:3829
2012-02-10 13:56:00 CST [3303]: [11-1] @ LOG:  00000: redo done at F/FCC716B4
2012-02-10 13:56:00 CST [3303]: [12-1] @ LOCATION:  StartupXLOG, xlog.c:6621
2012-02-10 13:56:00 CST [3303]: [13-1] @ LOG:  00000: last completed transaction was at log time 2012-02-10 13:55:59.452228-06
2012-02-10 13:56:00 CST [3303]: [14-1] @ LOCATION:  StartupXLOG, xlog.c:6626
2012-02-10 13:56:02 CST [3319]: [1-1] @ LOG:  00000: autovacuum launcher started
2012-02-10 13:56:02 CST [3319]: [2-1] @ LOCATION:  AutoVacLauncherMain, autovacuum.c:404
2012-02-10 13:56:02 CST [15949]: [43-1] @ LOG:  00000: database system is ready to accept connections
2012-02-10 13:56:02 CST [15949]: [44-1] @ LOCATION:  reaper, postmaster.c:2435

Re: 9.1.1 crash

From
"Albe Laurenz"
Date:
Mike Blackwell wrote:
> The following are the relevant log entries from a recent crash of v9.1.1 running on an older RHEL
> Linux box.  This is the first crash we've experienced in a lot of years of running Pg.  Any assistance
> in how to determine what might have caused this is welcome.
> 
> 2012-02-10 13:55:59 CST [15949]: [37-1] @ LOG:  00000: server process (PID 32670) was terminated by
> signal 11: Segmentation fault
[...]

It is difficult to find out anything after the crash if the problem
cannot be reproduced.

If you happen to have changed the core file ulimit setting away from the
default zero, you should have a core file in the data directory which
can be used to create a backtrace which shows you where the server
crashed. And even that only really helps with a debug build.
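
As a rough illustration of the core file limit mentioned above, the following Python sketch reports whether the current process would be allowed to write a core file. It only inspects the limit of the process that runs it, so it is meaningful only when started from the same environment (user, shell, init script) that launches the server; the function names are made up for this example.

```python
import resource


def core_dumps_enabled(soft_limit: int) -> bool:
    """Return True if a soft RLIMIT_CORE value allows a core file to be written."""
    # A soft limit of 0 suppresses core files; RLIM_INFINITY means no size cap.
    return soft_limit == resource.RLIM_INFINITY or soft_limit > 0


def describe_core_limit() -> str:
    """Describe the core file limit of the current process (hypothetical helper)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    if not core_dumps_enabled(soft):
        return "core files disabled (soft limit is 0)"
    if soft == resource.RLIM_INFINITY:
        return "core files enabled (no size limit)"
    return f"core files enabled (soft limit {soft} bytes)"


if __name__ == "__main__":
    print(describe_core_limit())
```

If a core file does exist, a backtrace is typically obtained by opening the postgres executable together with the core file in gdb and issuing `bt`; the exact paths depend on the installation.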

Other than that, you could make sure that hard disk and memory have
no problem (you write that it is an older box). You can try to find
out what the server was doing at the time and if you can reproduce it.
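
One way to look for what the crashed backend was doing is to pull every log line written by its PID (32670 in the log above). The sketch below assumes each line starts with the same prefix as the posted excerpt; the function names are invented for this example, and it finds something only if the backend logged anything before it died (for example with `log_statement` enabled).

```python
import re
from typing import Iterable, List, Tuple

# Matches the prefix seen in the posted excerpt, e.g.
# "2012-02-10 13:55:59 CST [15949]: [37-1] @ LOG:  00000: ..."
# Assumption: all lines of interest use this same prefix.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \S+) "
    r"\[(?P<pid>\d+)\]: \[\d+-\d+\] \S* "
    r"(?P<level>[A-Z]+):\s+(?P<msg>.*)$"
)
CRASH_RE = re.compile(r"server process \(PID (\d+)\) was terminated by signal (\d+)")


def crashed_backends(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (pid, signal) for every crash report the postmaster logged."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if not m:
            continue  # continuation line or a different prefix
        crash = CRASH_RE.search(m.group("msg"))
        if crash:
            found.append((int(crash.group(1)), int(crash.group(2))))
    return found


def entries_for_pid(lines: Iterable[str], pid: int) -> List[str]:
    """Return the log lines that were written by the given backend PID."""
    result = []
    for line in lines:
        stripped = line.rstrip("\n")
        m = LINE_RE.match(stripped)
        if m and int(m.group("pid")) == pid:
            result.append(stripped)
    return result
```

Feeding the server log to `crashed_backends` yields the PIDs to look up, and `entries_for_pid` then lists whatever those backends wrote before the crash.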

Crashes are also often caused by nonstandard C functions that have
been loaded into the database.

Yours,
Laurenz Albe