Re: Detecting corrupted pages earlier

From: Greg Copeland
Subject: Re: Detecting corrupted pages earlier
Msg-id: 1045598835.3290.2.camel@mouse.copelandconsulting.net
In reply to: Re: Detecting corrupted pages earlier  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

On Mon, 2003-02-17 at 22:04, Tom Lane wrote:
> Curt Sampson <cjs@cynic.net> writes:
> > On Mon, 17 Feb 2003, Tom Lane wrote:
> >> Postgres has a bad habit of becoming very confused if the page header of
> >> a page on disk has become corrupted.
> 
> > What typically causes this corruption?
> 
> Well, I'd like to know that too.  I have seen some cases that were
> identified as hardware problems (disk wrote data to wrong sector, RAM
> dropped some bits, etc).  I'm not convinced that that's the whole story,
> but I have nothing to chew on that could lead to identifying a software
> bug.
> 
> > If it's any kind of a serious problem, maybe it would be worth keeping
> > a CRC of the header at the end of the page somewhere.
> 
> See past discussions about keeping CRCs of page contents.  Ultimately
> I think it's a significant expenditure of CPU for very marginal returns
> --- the layers underneath us are supposed to keep their own CRCs or
> other cross-checks, and a very substantial chunk of the problem seems
> to be bad RAM, against which occasional software CRC checks aren't 
> especially useful.

This is exactly why "magic numbers" or simple algorithmic bit patterns
are commonly used.  If the "magic number" or bit pattern stored in a page
doesn't match the value expected for its page number, you know something
is wrong.  Storage cost tends to be slight and CPU overhead low.
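
As a rough illustration of the idea, here is a minimal sketch in Python.
It is not how Postgres lays out its pages: the 4-byte header field, the
MAGIC_SEED constant, and the function names are all assumptions made up
for the example.  The pattern is derived from the page number, so a page
that lands in the wrong slot or has a wiped header fails the check.

```python
import struct

PAGE_SIZE = 8192                 # assumed page size for the example
MAGIC_SEED = 0x5A5AA5A5          # hypothetical constant, not a Postgres value
MAGIC_FIELD = struct.Struct("<I")  # hypothetical 4-byte field at page start


def expected_magic(page_no: int) -> int:
    """Bit pattern a page should carry, derived from its page number."""
    return (page_no ^ MAGIC_SEED) & 0xFFFFFFFF


def stamp_page(page_no: int, payload: bytes = b"") -> bytes:
    """Build a page image whose first 4 bytes hold the expected pattern."""
    body_size = PAGE_SIZE - MAGIC_FIELD.size
    if len(payload) > body_size:
        raise ValueError("payload does not fit in one page")
    header = MAGIC_FIELD.pack(expected_magic(page_no))
    return header + payload.ljust(body_size, b"\0")


def page_looks_sane(page: bytes, page_no: int) -> bool:
    """Cheap check on read: one unpack and one integer comparison."""
    if len(page) != PAGE_SIZE:
        return False
    (stored,) = MAGIC_FIELD.unpack_from(page)
    return stored == expected_magic(page_no)
```

A page stamped for block 7 passes when read back as block 7 and fails
when read back as block 8, which is the "disk wrote data to wrong sector"
case mentioned above.  Unlike a CRC, this says nothing about damage to
the rest of the page.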

I agree with you that a CRC seems overkill for little return.

Regards,

-- 
Greg Copeland <greg@copelandconsulting.net>
Copeland Computer Consulting


