Re: corrupt pages detected by enabling checksums

From Jeff Janes
Subject Re: corrupt pages detected by enabling checksums
Date
Msg-id CAMkU=1zX8vL8_HmJPa61XBp5uTQwEBaKoz93O1zM98x4g4rKTw@mail.gmail.com
In reply to Re: corrupt pages detected by enabling checksums  (Greg Stark <stark@mit.edu>)
Responses Re: corrupt pages detected by enabling checksums
Re: corrupt pages detected by enabling checksums
List pgsql-hackers
On Fri, May 10, 2013 at 9:54 AM, Greg Stark <stark@mit.edu> wrote:
> On Fri, May 10, 2013 at 5:31 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
> > In the case where one block is missing, how can it even reach to next record
> > to check "prev" pointer.
> > I think it can be possible when one of the record is corrupt and following
> > are okay which I think is the
> > case in which it can proceed with warning as suggested by Simon.
>
> A single WAL record can be over 24kB. The checksum covers the entire
> WAL record and if it reports corruption it can be because a chunk in
> the middle wasn't flushed to disk before the system crashed. The
> beginning of the WAL record with the length and checksum and the
> entire following record with its prev pointer might have been flushed
> but the missing block in the middle of this record means it can't be
> replayed. This would be a normal situation in case of a system crash.
>
> If you replayed the following record but not this record you would
> have an inconsistent database.
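
To make the scenario above concrete, here is a toy sketch of a record that spans several blocks and loses one block in the middle. The record layout (`total length, prev offset, CRC-32, payload`), the 8192-byte flush unit, and the names `make_record`, `read_record`, and `NO_PREV` are assumptions made up for this illustration; they are not PostgreSQL's actual WAL format or API.

```python
import struct
import zlib

BLOCK = 8192          # assumed flush granularity for this sketch
HEADER = 16           # 4-byte length + 8-byte prev offset + 4-byte CRC
NO_PREV = 2**64 - 1   # hypothetical sentinel: "no previous record"

def make_record(prev, payload):
    """Build a toy record: header (length, prev, CRC of payload) + payload."""
    crc = zlib.crc32(payload)
    return struct.pack("<IQI", HEADER + len(payload), prev, crc) + payload

def read_record(buf, off):
    """Return (prev, checksum_ok, next_offset) for the record at `off`."""
    total, prev, crc = struct.unpack_from("<IQI", buf, off)
    payload = bytes(buf[off + HEADER : off + total])
    return prev, zlib.crc32(payload) == crc, off + total

# A large record (spans blocks 0, 1 and 2) followed by a small one.
first = make_record(NO_PREV, b"A" * 24000)
second = make_record(0, b"B" * 100)   # prev points at offset 0, the first record
wal = bytearray(first + second)

# Simulate the crash: block 1 never reached disk and reads back as zeros.
wal[BLOCK : 2 * BLOCK] = bytes(BLOCK)

first_prev, first_ok, second_off = read_record(wal, 0)
second_prev, second_ok, _ = read_record(wal, second_off)

print("first record checksum ok: ", first_ok)
print("second record checksum ok:", second_ok, "prev =", second_prev)
```

In this toy log the first record's header survives, so its length still leads to the second record, which verifies cleanly and carries a prev pointer back to the damaged record. That is exactly the state in which replaying the second record alone would leave the database inconsistent.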

I don't think we would ever want to *skip* the record and play the next one.  But if the next record looks valid, we might not want to automatically open the database in a possibly inconsistent state and, in the process, overwrite the only existing copy of the WAL records that would be needed to make it consistent.  Instead, could we present the DBA with an explicit choice: open the database; try to reconstruct the corrupted record via forensic inspection so that it can be played through (I have no idea how likely such an attempt is to succeed); or copy the database for later inspection and then open it?
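
The policy proposed above can be written down as a small decision function. This is only a sketch of the idea as described in this message; the names `Action`, `DBA_CHOICES`, and `next_action` are hypothetical and do not correspond to anything in the PostgreSQL source.

```python
from enum import Enum

class Action(Enum):
    REPLAY = "replay this record and continue"
    OPEN = "treat this as the end of WAL and open the database"
    ASK_DBA = "stop and wait for an explicit DBA decision"

# The three options the DBA would be offered when recovery stops.
DBA_CHOICES = (
    "open the database as-is",
    "try to reconstruct the corrupted record, then replay through it",
    "copy the database for later inspection, then open it",
)

def next_action(record_ok: bool, following_record_looks_valid: bool) -> Action:
    """Decide what recovery does at the current record (never skips a record)."""
    if record_ok:
        return Action.REPLAY
    if following_record_looks_valid:
        # Valid-looking WAL after a bad record: opening now would overwrite
        # the only copy of records needed to reach a consistent state.
        return Action.ASK_DBA
    # Bad record with nothing valid after it: the ordinary end of WAL.
    return Action.OPEN

if __name__ == "__main__":
    for ok, nxt in [(True, False), (False, True), (False, False)]:
        print(ok, nxt, "->", next_action(ok, nxt).value)
```

Note that there is deliberately no "skip" outcome: a bad record either ends replay or stops recovery for a human decision.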

But based on your description, perhaps refusing to automatically restart and forcing an explicit decision would happen a lot more often, during normal crashes with no corruption, than I was thinking it would.

Of course the paranoid DBA could turn off restart_after_crash and do a manual investigation on every crash, but in that case the database would refuse to restart even in the case where it is perfectly clear that all the following WAL belongs to the recycled file and not the current file.  They would also have to disable any startup scripts in init.d, to make sure a rebooting server doesn't run recovery automatically and destroy evidence that way.


Cheers,

Jeff
