Re: Tracking down log segment corruption

From: Tom Lane
Subject: Re: Tracking down log segment corruption
Msg-id: 17352.1272834971@sss.pgh.pa.us
In reply to: Re: Tracking down log segment corruption  (Gordon Shannon <gordo169@gmail.com>)
Responses: Re: Tracking down log segment corruption  (Gordon Shannon <gordo169@gmail.com>)
List: pgsql-general

Gordon Shannon <gordo169@gmail.com> writes:
> [ corruption on a standby slave after an ALTER SET TABLESPACE operation ]

Found it, I think.  ATExecSetTableSpace transfers the copied data to the
slave by means of XLOG_HEAP_NEWPAGE WAL records.  The replay function
for this (heap_xlog_newpage) is failing to pay any attention to the
forkNum field of the WAL record.  This means it will happily write FSM
and visibility-map pages into the main fork of the relation.  So if the
index had any such pages on the master, it would immediately become
corrupted on the slave.  Now indexes don't have a visibility-map fork,
but they could have FSM pages.  And an FSM page would have the right
header information to look like an empty index page.  So dropping an
index FSM page into the main fork of the index would produce the
observed symptom.
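
To make the failure mode concrete, here is a toy model in C. It is only a
sketch: it is not the PostgreSQL source, and every name in it (ToyRelation,
ToyNewPageRecord, toy_replay_buggy, toy_replay_fixed, toy_main_page_survives)
is hypothetical. It assumes a relation is nothing more than one small array of
pages per fork, and shows how a replay routine that ignores the fork number
carried in the record overwrites a main-fork page with an FSM page that has
the same block number.

```c
/* Toy model only: these types and functions are hypothetical and are
 * not taken from the PostgreSQL source tree. */
#include <assert.h>
#include <string.h>

#define TOY_PAGE_SIZE  16
#define TOY_MAX_BLOCKS 4

typedef enum { TOY_MAIN_FORK = 0, TOY_FSM_FORK = 1, TOY_NUM_FORKS } ToyFork;

/* A "relation" is just one array of pages per fork. */
typedef struct {
    char pages[TOY_NUM_FORKS][TOY_MAX_BLOCKS][TOY_PAGE_SIZE];
} ToyRelation;

/* A simplified "new page" log record: full page image plus its address. */
typedef struct {
    ToyFork  fork;                 /* fork the page was copied from */
    unsigned blkno;                /* block number within that fork */
    char     page[TOY_PAGE_SIZE];  /* page contents */
} ToyNewPageRecord;

/* Buggy replay: ignores rec->fork and always writes into the main fork. */
void toy_replay_buggy(ToyRelation *rel, const ToyNewPageRecord *rec)
{
    assert(rec->blkno < TOY_MAX_BLOCKS);
    memcpy(rel->pages[TOY_MAIN_FORK][rec->blkno], rec->page, TOY_PAGE_SIZE);
}

/* Replay that honors the fork number stored in the record. */
void toy_replay_fixed(ToyRelation *rel, const ToyNewPageRecord *rec)
{
    assert(rec->blkno < TOY_MAX_BLOCKS);
    assert(rec->fork < TOY_NUM_FORKS);
    memcpy(rel->pages[rec->fork][rec->blkno], rec->page, TOY_PAGE_SIZE);
}

/* Replays a main-fork page and then an FSM page with the same block number.
 * Returns 1 if main-fork block 0 still holds the index data afterwards. */
int toy_main_page_survives(void (*replay)(ToyRelation *,
                                          const ToyNewPageRecord *))
{
    ToyRelation rel;
    ToyNewPageRecord main_rec = { TOY_MAIN_FORK, 0, "index data" };
    ToyNewPageRecord fsm_rec  = { TOY_FSM_FORK,  0, "fsm data" };

    memset(&rel, 0, sizeof rel);
    replay(&rel, &main_rec);
    replay(&rel, &fsm_rec);
    return strcmp(rel.pages[TOY_MAIN_FORK][0], "index data") == 0;
}
```

Calling toy_main_page_survives(toy_replay_buggy) reports that the main-fork
page was lost, while toy_main_page_survives(toy_replay_fixed) reports that it
was kept, because the FSM page lands in its own fork.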

I'm not 100% sure that this is what bit you, but it's clearly a bug and
AFAICS it could produce the observed symptoms.

This is a seriously, seriously nasty data corruption bug.  The only bit
of good news is that ALTER SET TABLESPACE seems to be the only operation
that can emit XLOG_HEAP_NEWPAGE records with forkNum different from
MAIN_FORKNUM, so that's the only operation that's at risk.  But if you
do do that, not only are standby slaves going to get clobbered, but the
master could get corrupted too if you were unlucky enough to have a
crash and replay from WAL shortly after completing the ALTER.  And it's
not only indexes that are at risk --- tables could get clobbered the
same way.

My crystal ball says there will be update releases in the very near
future.

            regards, tom lane
