Re: FSM corruption leading to errors

From Pavan Deolasee
Subject Re: FSM corruption leading to errors
Date
Msg-id CABOikdO=Tryjc9CiKBdbXP3KjcRGZgNY9mT=AKLFszUCRpEgQw@mail.gmail.com
In response to Re: FSM corruption leading to errors  (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses Re: FSM corruption leading to errors  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers


On Wed, Oct 19, 2016 at 2:37 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:


Actually, this is still not 100% safe. Flushing the WAL before modifying the FSM page is not enough. We also need to WAL-log a full-page image of the FSM page, otherwise we are still vulnerable to the torn page problem.

I came up with the attached. This is fortunately much simpler than my previous attempt. I replaced the MarkBufferDirtyHint() calls with MarkBufferDirty(), to fix the original issue, plus WAL-logging a full-page image to fix the torn page issue.
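
To illustrate the torn page problem described above, here is a minimal toy model in Python. It is not PostgreSQL code and not the attached patch: the 512-byte sector size, the `torn_write` and `recover` helpers, and the dictionary-based WAL record are all assumptions made up for this sketch. It only shows why a page that was partially written at crash time can be repaired when a full-page image was WAL-logged, and cannot be when the change was treated as an unlogged hint.

```python
PAGE_SIZE = 8192    # PostgreSQL's default block size
SECTOR_SIZE = 512   # assumed atomic write unit of the storage (illustrative)


def torn_write(disk_page: bytes, new_page: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first `sectors_written` sectors hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return new_page[:cut] + disk_page[cut:]


def recover(disk_page: bytes, wal: list) -> bytes:
    """Hypothetical replay: a full-page image (FPI) overwrites the whole page."""
    for record in wal:
        if record["type"] == "FPI":
            disk_page = record["image"]
    return disk_page


old_page = bytes([1]) * PAGE_SIZE
new_page = bytes([2]) * PAGE_SIZE

# The crash interrupts the write of the modified page after three sectors.
torn = torn_write(old_page, new_page, sectors_written=3)

# Hint-style update: nothing was WAL-logged, so replay cannot repair the page.
recovered_without_fpi = recover(torn, wal=[])

# Logged update: the full-page image was in the WAL before the page was written.
recovered_with_fpi = recover(torn, wal=[{"type": "FPI", "image": new_page}])

print(recovered_without_fpi in (old_page, new_page))  # False: mixed old/new content
print(recovered_with_fpi == new_page)                 # True: page restored from FPI
```

In this model the unlogged page ends up as a mix of old and new sectors, which matches neither version, while the logged page is restored wholesale from the image.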


Looks good to me.
 
BTW, any thoughts on a race condition on the primary? The comments at
MarkBufferDirtyHint() seem to suggest that a race condition is possible
which might leave the buffer without the DIRTY flag, but I'm not sure if
that can only happen when the page is locked in shared mode.

I think the race condition can only happen when the page is locked in shared mode. In any case, with this proposed fix, we'll use MarkBufferDirty() rather than MarkBufferDirtyHint(), so it's moot.


Yes, the fix will cover that problem (if it exists). I was curious because there have been several reports of similar errors in the past, and some of them did not involve a standby. Those reports mostly remained unresolved, and I wondered whether this race explains them. But yes, my conclusion was that the race is not possible with the page locked in EXCLUSIVE mode. So maybe there is another problem somewhere, or a crash recovery may have left the FSM in an inconsistent state.

Anyway, we seem good to go with the patch.

Thanks,
Pavan
--
 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
