Re: [BUG?] lag of minRecoveryPont in archive recovery

From Fujii Masao
Subject Re: [BUG?] lag of minRecoveryPont in archive recovery
Date
Msg-id CAHGQGwEi6Q6h65nVzBngaCR=Vse03-io2_7VFrC9dWKnNoa15w@mail.gmail.com
In reply to Re: [BUG?] lag of minRecoveryPont in archive recovery  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: [BUG?] lag of minRecoveryPont in archive recovery
List pgsql-hackers
On Tue, Dec 11, 2012 at 1:33 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
> On 10.12.2012 13:50, Heikki Linnakangas wrote:
>>
>> So I'd say we should update minRecoveryPoint first, then
>> truncate/delete. But we should still keep the XLogFlush() at the end of
>> xact_redo_commit_internal(), for the case where files/directories are
>> created. Patch attached.

Sounds reasonable.

> Committed and backpatched that. Attached is a script I used to reproduce
> this problem, going back to 8.4.

Thanks!

Unfortunately I could reproduce the problem even after that commit.
Attached is the script I used to reproduce the problem.

The cause is that CheckRecoveryConsistency() is called before rm_redo(),
as Horiguchi-san suggested upthread. Imagine the case where
minRecoveryPoint is set to the location of the XLOG_SMGR_TRUNCATE
record. When the server is restarted with that minRecoveryPoint,
the following happens, ending in a PANIC:

1. The XLOG_SMGR_TRUNCATE record is read.
2. CheckRecoveryConsistency() is called, and the database is marked as
    consistent since we've reached minRecoveryPoint (i.e., the location
    of XLOG_SMGR_TRUNCATE).
3. The XLOG_SMGR_TRUNCATE record is replayed, and an invalid page is
    found.
4. Since the database has already been marked as consistent, the invalid
    page leads to PANIC.
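
To make the ordering issue in steps 1-4 concrete, here is a toy model in
Python. It is only a sketch, not PostgreSQL code: Record, Panic and replay()
are hypothetical names, each record has a single integer "lsn", and it is
assumed that replaying the truncate record always hits an invalid page. An
invalid page is tolerated before consistency is reached and treated as fatal
afterwards.

```python
from dataclasses import dataclass


class Panic(Exception):
    """Stands in for a PANIC raised during WAL replay (toy model only)."""


@dataclass
class Record:
    lsn: int    # simplified: one location per record
    kind: str   # "TRUNCATE" or "OTHER"


def replay(records, min_recovery_point, check_before_redo):
    """Replay records in order; return whether consistency was reached.

    check_before_redo=True mimics the order described above: the
    consistency check runs before the record is replayed.
    """
    consistent = False

    def check_consistency(lsn):
        nonlocal consistent
        if not consistent and lsn >= min_recovery_point:
            consistent = True

    def redo(rec):
        # Assumption: replaying TRUNCATE always finds an invalid page.
        # Before consistency this is tolerated; afterwards it is fatal.
        if rec.kind == "TRUNCATE" and consistent:
            raise Panic(f"invalid page while replaying lsn {rec.lsn}")

    for rec in records:
        if check_before_redo:
            check_consistency(rec.lsn)
            redo(rec)
        else:
            redo(rec)
            check_consistency(rec.lsn)
    return consistent


if __name__ == "__main__":
    wal = [Record(100, "OTHER"), Record(200, "TRUNCATE"), Record(300, "OTHER")]
    try:
        replay(wal, min_recovery_point=200, check_before_redo=True)
    except Panic as exc:
        print("PANIC:", exc)
```

With minRecoveryPoint equal to the truncate record's location, the
check-before-redo order marks the database consistent in step 2 and then
fails in step 3. Passing check_before_redo=False is included only for
contrast with the failing order; it is not meant as a proposed fix.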

Regards,

--
Fujii Masao
