Tracking down log segment corruption

From	Charles Duffy
Subject	Tracking down log segment corruption
Date	21 December 2008, 23:40:14
Msg-id	gimm8t$lh1$1@ger.gmane.org
Responses	Re: Tracking down log segment corruption
List	pgsql-general

Howdy, all.

I have a log-shipping replication environment (based on PostgreSQL
8.3.4) using pg_lesslog+LZOP to compress archived segments (kept
around long-term for possible use in PITR). The slave recently fell out
of synchronization: it restores a series of segments and then fails
with SIGABRT (and does the same after each automated restart):

[2-1] LOG:  starting archive recovery
[3-1] LOG:  restore_command = '/opt/extropy/postgres/bin/restore-segment
/exports/pgwal/segments.recoveryq %f %p %r'
[4-1] LOG:  restored log file "00000001000000140000000B" from archive
[5-1] LOG:  automatic recovery in progress
[6-1] LOG:  restored log file "000000010000001400000009" from archive
[7-1] LOG:  redo starts at 14/9127270
[8-1] LOG:  restored log file "00000001000000140000000A" from archive
[9-1] LOG:  restored log file "00000001000000140000000B" from archive
[10-1] LOG:  restored log file "00000001000000140000000C" from archive
[11-1] LOG:  restored log file "00000001000000140000000D" from archive
[12-1] LOG:  restored log file "00000001000000140000000E" from archive
[13-1] LOG:  restored log file "00000001000000140000000F" from archive
[14-1] LOG:  restored log file "000000010000001400000010" from archive
[15-1] WARNING:  specified item offset is too large
[15-2] CONTEXT:  xlog redo insert_upper: rel 1663/16384/17763; tid 2960/89
[16-1] PANIC:  btree_insert_redo: failed to add item
[16-2] CONTEXT:  xlog redo insert_upper: rel 1663/16384/17763; tid 2960/89
[1-1] LOG:  startup process (PID 17310) was terminated by signal 6: Aborted
[2-1] LOG:  aborting startup due to startup process failure
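
For reference when reading the log above, here is a minimal sketch of how a redo location such as 14/9127270 maps to a segment file name. It assumes the pre-9.3 naming scheme (timeline, log id and segment number, each as 8 hex digits) and the default 16 MB segment size; segment_for_lsn is a hypothetical helper, not part of PostgreSQL.

```python
# Assumption: default build-time WAL segment size of 16 MB.
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def segment_for_lsn(timeline: int, lsn: str) -> str:
    """Map an 'xlogid/xrecoff' location to a WAL segment file name.

    Assumes the TTTTTTTTLLLLLLLLSSSSSSSS naming used by 8.x servers.
    """
    log_id_hex, offset_hex = lsn.split("/")
    log_id = int(log_id_hex, 16)
    offset = int(offset_hex, 16)
    segment = offset // WAL_SEGMENT_SIZE  # which 16 MB slice of this log id
    return f"{timeline:08X}{log_id:08X}{segment:08X}"


if __name__ == "__main__":
    # "redo starts at 14/9127270" on timeline 1
    print(segment_for_lsn(1, "14/9127270"))
```

Under those assumptions the redo start point falls in 000000010000001400000009, which matches the first segment restored after "automatic recovery in progress".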

Replacing only 000000010000001400000010 with a pristine (never processed
with pg_compresslog) copy made no difference, but doing so for all of
the involved segments permitted the slave to pick up where it left off
and continue replaying.

It seems clear, then, that pg_lesslog was responsible in some way for
this corruption, and that there is some germane difference between the
original and the round-tripped (pg_compresslog | pg_decompresslog)
version of at least one of the involved segments. I've tried using xlogdump
[http://xlogviewer.projects.postgresql.org/] on both versions to look
for differing output, but it segfaults even with the "good" log segments.
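
Lacking a working xlogdump, one crude fallback is to compare the pristine and round-tripped copies of a segment page by page and note where they differ. The sketch below does only that. The 8192-byte page size is an assumption (the default XLOG_BLCKSZ), differing_pages is a hypothetical helper, and the file paths are placeholders. As I understand it, pg_lesslog rewrites records on purpose, so some differences are expected; this narrows down where to look rather than proving corruption.

```python
import sys

# Assumption: default WAL page size (XLOG_BLCKSZ) of 8 kB.
PAGE_SIZE = 8192


def differing_pages(path_a, path_b, page_size=PAGE_SIZE):
    """Return the byte offsets of pages whose contents differ.

    A page missing from the shorter file counts as differing.
    """
    offsets = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        offset = 0
        while True:
            page_a = fa.read(page_size)
            page_b = fb.read(page_size)
            if not page_a and not page_b:
                break  # both files exhausted
            if page_a != page_b:
                offsets.append(offset)
            offset += page_size
    return offsets


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python compare_segments.py <pristine-segment> <round-tripped-segment>
    for off in differing_pages(sys.argv[1], sys.argv[2]):
        print(f"page at offset 0x{off:08X} differs")
```

Running it over each of the segments 09 through 10 would at least show whether the differences cluster in a few pages or are spread throughout.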

Does anyone have suggestions as to how I should go about tracking down
the root cause of this issue?

Thanks!
