db crash, streaming rep slave will not start

From: CS DBA
Subject: db crash, streaming rep slave will not start
Msg-id: 51F84A14.4000801@consistentstate.com
List: pgsql-admin
Hi All;

A client's master database crashed. They tried to start up the streaming replication slave, and it refuses to start.

See the log details below... thanks in advance for any help





Master log:

2013-07-30 16:23:01 MDT PANIC: corrupted page pointers: lower = 0, upper = 0, special = 0

2013-07-30 16:23:02 MDT LOG: server process (PID 17539) was terminated by signal 6: Abort trap

2013-07-30 16:23:02 MDT LOG: terminating any other active server processes


2013-07-30 16:23:02 MDT [local]WARNING: terminating connection because of crash of another server process

2013-07-30 16:23:02 MDT [local]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2013-07-30 16:23:02 MDT [local]HINT: In a moment you should be able to reconnect to the database and repeat your command.

2013-07-30 16:23:02 MDT [local]WARNING: terminating connection because of crash of another server process

2013-07-30 16:23:02 MDT [local]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2013-07-30 16:23:02 MDT [local]HINT: In a moment you should be able to reconnect to the database and repeat your command.

2013-07-30 16:23:02 MDT [local]WARNING: terminating connection because of crash of another server process

2013-07-30 16:23:02 MDT [local]DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.


2013-07-30 16:23:04 MDT [local]FATAL: the database system is in recovery mode

2013-07-30 16:23:04 MDT LOG: archiver process (PID 1826) exited with exit code 1

2013-07-30 16:23:04 MDT 192.168.131.2FATAL: the database system is in recovery mode

2013-07-30 16:23:04 MDT LOG: all server processes terminated; reinitializing

2013-07-30 16:23:04 MDT LOG: database system was interrupted; last known up at 2013-07-30 16:21:33 MDT

2013-07-30 16:23:04 MDT LOG: database system was not properly shut down; automatic recovery in progress

2013-07-30 16:23:04 MDT LOG: consistent recovery state reached at 1179D/8B7E7EF8

2013-07-30 16:23:04 MDT LOG: redo starts at 1179A/C1001EA8


2013-07-30 16:26:48 MDT LOG: record with zero length at 1179D/AC2591A8

2013-07-30 16:26:48 MDT LOG: redo done at 1179D/AC259168

2013-07-30 16:26:48 MDT LOG: last completed transaction was at log time 2013-07-30 16:23:02.11493-06

2013-07-30 16:26:48 MDT WARNING: page 476 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 493 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1023 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 708 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1075 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 590 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 832 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1742 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 238 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 334 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1131 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 434 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 772 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 259 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 498 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 948 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 1743 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 96 of relation base/603188/199093492 did not exist

2013-07-30 16:26:48 MDT WARNING: page 559 of relation base/603188/199093492 did not exist


2013-07-30 16:26:48 MDT PANIC: WAL contains references to invalid pages

2013-07-30 16:26:48 MDT LOG: startup process (PID 17546) was terminated by signal 6: Abort trap

2013-07-30 16:26:48 MDT LOG: aborting startup due to startup process failure
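
For anyone triaging a log like the master log above, a minimal Python sketch for grouping the "did not exist" warnings by relation file may help show how many relations are affected. The function name and the sample lines are illustrative only (the sample is copied from the log above); nothing here is part of PostgreSQL itself.

```python
import re
from collections import defaultdict

# Matches recovery warnings such as:
#   WARNING: page 476 of relation base/603188/199093492 did not exist
INVALID_PAGE = re.compile(r"page (\d+) of relation (\S+) did not exist")


def invalid_pages_by_relation(log_lines):
    """Return {relation path: sorted page numbers} for invalid-page warnings.

    Hypothetical helper for illustration; lines that do not match are ignored.
    """
    pages = defaultdict(set)
    for line in log_lines:
        match = INVALID_PAGE.search(line)
        if match:
            pages[match.group(2)].add(int(match.group(1)))
    return {rel: sorted(nums) for rel, nums in pages.items()}


# A few lines copied from the master log in this message.
sample = [
    "2013-07-30 16:26:48 MDT WARNING: page 476 of relation base/603188/199093492 did not exist",
    "2013-07-30 16:26:48 MDT WARNING: page 493 of relation base/603188/199093492 did not exist",
    "2013-07-30 16:26:48 MDT WARNING: page 1023 of relation base/603188/199093492 did not exist",
    "2013-07-30 16:26:48 MDT PANIC: WAL contains references to invalid pages",
]

summary = invalid_pages_by_relation(sample)
for relation, page_numbers in summary.items():
    print(relation, len(page_numbers), page_numbers)
```

In a path of the form base/603188/199093492, the first number is the database OID and the second is the relfilenode, so on a running server the relation can usually be identified by looking up that relfilenode in pg_class of the corresponding database.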




Slave log:

2013-07-30 16:41:48 MDT FATAL: could not connect to the primary server: could not connect to server: Operation timed out

Is the server running on host "192.168.131.1" and accepting

TCP/IP connections on port 5432?


2013-07-30 16:42:59 MDT LOG: trigger file found: /pgdata/data/failover

2013-07-30 16:42:59 MDT FATAL: terminating walreceiver process due to administrator command

2013-07-30 16:42:59 MDT LOG: redo done at 1179D/AC11DF00

2013-07-30 16:42:59 MDT LOG: last completed transaction was at log time 2013-07-30 16:23:01.986951-06

2013-07-30 16:42:59 MDT LOG: selected new timeline ID: 2

2013-07-30 16:42:59 MDT LOG: archive recovery complete

2013-07-30 16:42:59 MDT WARNING: page 98 of relation base/603188/4268050827 did not exist

2013-07-30 16:42:59 MDT WARNING: page 476 of relation base/603188/199093492 did not exist

2013-07-30 16:42:59 MDT WARNING: page 571 of relation base/603188/2775093183 did not exist

2013-07-30 16:42:59 MDT WARNING: page 202 of relation base/603188/4268050827 did not exist

2013-07-30 16:42:59 MDT WARNING: page 202 of relation base/603188/3435025974 did not exist

2013-07-30 16:42:59 MDT WARNING: page 493 of relation base/603188/199093492 did not exist

2013-07-30 16:42:59 MDT WARNING: page 1023 of relation base/603188/199093492 did not exist

2013-07-30 16:42:59 MDT WARNING: page 163 of relation base/603188/3476677873 did not exist

2013-07-30 16:42:59 MDT WARNING: page 15 of relation base/603188/3435025974 did not exist


2013-07-30 16:42:59 MDT PANIC: WAL contains references to invalid pages

2013-07-30 16:43:00 MDT LOG: startup process (PID 41056) was terminated by signal 6: Abort trap

2013-07-30 16:43:00 MDT LOG: terminating any other active server processes

2013-07-30 16:43:00 MDT 10.254.254.23WARNING: terminating connection because of crash of another server process

2013-07-30 16:43:00 MDT 10.254.254.23DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
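
To put the two "redo done at" positions in perspective, here is a small Python sketch that converts the WAL locations from the logs into integers and subtracts them. The helper names are hypothetical. It treats a location as one 64-bit number, which on older server versions is only safe when both locations share the same high half, as they do here (1179D).

```python
def parse_lsn(text):
    """Convert an 'X/Y' WAL location (two hex halves) into a single integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lsn_gap_bytes(ahead, behind):
    """Bytes of WAL between two locations (positive if 'ahead' is later)."""
    return parse_lsn(ahead) - parse_lsn(behind)


# Values copied from the "redo done at" lines in the two logs.
master_redo_done = "1179D/AC259168"
slave_redo_done = "1179D/AC11DF00"

gap = lsn_gap_bytes(master_redo_done, slave_redo_done)
print(f"slave replayed {gap} bytes less WAL than the master")
```

With these values the gap is 1290856 bytes (roughly 1.2 MiB), which fits the last-completed-transaction times in the two logs (16:23:02.11493 on the master, 16:23:01.986951 on the slave).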








