production server down

From: Joe Conway
Subject: production server down
Msg-id: 41BFAB7C.5040108@joeconway.com
List: pgsql-hackers

I've got a down production server (will not restart) with the following 
tail to its log file:

2004-12-13 15:05:52 LOG:  recycled transaction log file "000001650000004C"
2004-12-13 15:26:01 LOG:  recycled transaction log file "000001650000004D"
2004-12-13 16:39:55 LOG:  database system was shut down at 2004-11-02 17:05:33 PST
2004-12-13 16:39:55 LOG:  checkpoint record is at 0/9B0B8C
2004-12-13 16:39:55 LOG:  redo record is at 0/9B0B8C; undo record is at 0/0; shutdown TRUE
2004-12-13 16:39:55 LOG:  next transaction ID: 536; next OID: 17142
2004-12-13 16:39:55 LOG:  database system is ready
2004-12-14 15:36:20 FATAL:  IDENT authentication failed for user "colprod"
2004-12-14 15:36:58 FATAL:  IDENT authentication failed for user "colprod"
2004-12-14 15:39:26 LOG:  received smart shutdown request
2004-12-14 15:39:26 LOG:  shutting down
2004-12-14 15:39:28 PANIC:  could not open file "/replica/pgdata/pg_xlog/0000000000000000" (log file 0, segment 0): No such file or directory
2004-12-14 15:39:28 LOG:  shutdown process (PID 23202) was terminated by signal 6
2004-12-14 15:39:39 LOG:  database system shutdown was interrupted at 2004-12-14 15:39:26 PST
2004-12-14 15:39:39 LOG:  could not open file "/replica/pgdata/pg_xlog/0000000000000000" (log file 0, segment 0): No such file or directory
2004-12-14 15:39:39 LOG:  invalid primary checkpoint record
2004-12-14 15:39:39 LOG:  could not open file "/replica/pgdata/pg_xlog/0000000000000000" (log file 0, segment 0): No such file or directory
2004-12-14 15:39:39 LOG:  invalid secondary checkpoint record
2004-12-14 15:39:39 PANIC:  could not locate a valid checkpoint record
2004-12-14 15:39:39 LOG:  startup process (PID 23298) was terminated by signal 6
2004-12-14 15:39:39 LOG:  aborting startup due to startup process failure
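
For reference, here is a minimal sketch of how the file name in the PANIC lines relates to the checkpoint location reported at startup. It assumes the 7.4 defaults as I understand them: 16 MB segments, and a segment file name made of 8 hex digits of log file number followed by 8 hex digits of segment number. The function names are purely illustrative and are not part of PostgreSQL.

```python
from typing import Tuple

# Assumption: default 16 MB transaction log segments (not confirmed for this build).
XLOG_SEG_SIZE = 16 * 1024 * 1024


def segment_file_for(location: str) -> str:
    """Map a 'logid/offset' location such as '0/9B0B8C' to a segment file name.

    Assumes the 7.4-style name: 8 hex digits of log file number followed by
    8 hex digits of segment number.
    """
    log_id_hex, offset_hex = location.split("/")
    log_id = int(log_id_hex, 16)
    segment = int(offset_hex, 16) // XLOG_SEG_SIZE
    return f"{log_id:08X}{segment:08X}"


def decode_segment_file(name: str) -> Tuple[int, int]:
    """Split a 16-hex-digit segment file name into (log file, segment)."""
    return int(name[:8], 16), int(name[8:], 16)


if __name__ == "__main__":
    # Checkpoint location from the 16:39:55 startup messages.
    print(segment_file_for("0/9B0B8C"))
    # One of the files recycled before the restart.
    print(decode_segment_file("000001650000004D"))
```

Under those assumptions, checkpoint location 0/9B0B8C falls in log file 0, segment 0, which is exactly the file the PANIC reports as missing, while the segments recycled before the restart belong to log file 0x165.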


This is a SuSE 9, 8-way Xeon IBM x445, with an NFS-mounted Network 
Appliance for database storage, running postgresql-7.4.5-36.4.

The server experienced a hang (as yet unexplained) yesterday and was 
restarted at 2004-12-13 16:38:49 according to syslog. I'm told by the 
network admin that there was a problem with the network card on restart, 
so the NFS mount most probably disappeared and then reappeared 
underneath a quiescent postgresql at some point between 2004-12-13 
16:39:55 and 2004-12-14 15:36:20 (but much closer to the former than the 
latter).

Any help would be much appreciated. Is our only option pg_resetxlog?

Thanks,

Joe





