Re: Enable data checksums by default

From Andres Freund
Subject Re: Enable data checksums by default
Msg-id 20190322164117.g2wmstoy6hbkyzfp@alap3.anarazel.de
In response to Re: Enable data checksums by default  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Enable data checksums by default
List pgsql-hackers
Hi,

On 2019-03-22 17:32:10 +0100, Tomas Vondra wrote:
> On 3/22/19 5:10 PM, Andres Freund wrote:
> > IDK, being able to verify in some form that backups aren't corrupted on
> > an IO level is mighty nice. That often allows the issue to be detected
> > while one still has older backups around.
> > 
> 
> Yeah, I agree that's a valuable capability. I think the question is how
> effective it actually is considering how much the storage changed over
> the past few years (which necessarily affects the type of failures
> people have to deal with).

I'm not sure I understand. How do the changes around storage
meaningfully affect the need to have some trust in backups, or the
benefit of earlier detection?


> It's not clear to me what checksums can do about zeroed pages (and/or
> truncated files) though.

Well, there's nothing fundamental about needing added pages to be
zeroes. We could extend relations with pages that are initialized with
actual valid checksums, instead of the current
        /* new buffers are zero-filled */
        MemSet((char *) bufBlock, 0, BLCKSZ);
        /* don't set checksum for all-zero page */
        smgrextend(smgr, forkNum, blockNum, (char *) bufBlock, false);

The problem is that it's hard to do so safely without adding a lot of
additional WAL logging. A lot of filesystems journal metadata changes
(like the size of the file), but not contents. So after a crash the
tail end of a file might appear zeroed out, even if we never wrote
zeroes. That's obviously solvable by WAL logging, but that's not cheap.

It might still be a good idea to just write a page with an initialized
header / checksum at that point, as that ought to still detect a number
of problems we can't detect right now.
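
As a rough illustration of the difference (a standalone sketch, not
PostgreSQL code): every name below (ToyPageHeader, toy_checksum,
toy_page_init, toy_page_verify) is hypothetical, the checksum is plain
32-bit FNV-1a rather than PostgreSQL's page checksum algorithm, and the
header layout is invented. It only shows that a verifier which has to
accept all-zero pages cannot tell a never-initialized page from a zeroed
tail, while one that expects an initialized header / checksum can.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLCKSZ 8192             /* illustrative page size */
#define TOY_PAGE_INITIALIZED 1u     /* hypothetical "header was written" flag */

/* Hypothetical, simplified header; not PostgreSQL's PageHeaderData. */
typedef struct ToyPageHeader
{
    uint32_t checksum;  /* computed with these bytes treated as zero */
    uint32_t flags;     /* nonzero once the page has been initialized */
} ToyPageHeader;

/* FNV-1a over the page, seeded with the block number (toy algorithm). */
static uint32_t
toy_checksum(const unsigned char *page, uint32_t blkno)
{
    uint32_t h = 2166136261u ^ blkno;

    for (size_t i = 0; i < TOY_BLCKSZ; i++)
    {
        /* the checksum field occupies the first four bytes; skip it */
        unsigned char b = (i < sizeof(uint32_t)) ? 0 : page[i];

        h = (h ^ b) * 16777619u;
    }
    return h;
}

static bool
toy_page_is_all_zero(const unsigned char *page)
{
    for (size_t i = 0; i < TOY_BLCKSZ; i++)
        if (page[i] != 0)
            return false;
    return true;
}

/* Extend with an initialized header and checksum instead of zeroes. */
static void
toy_page_init(unsigned char *page, uint32_t blkno)
{
    ToyPageHeader hdr = {0, TOY_PAGE_INITIALIZED};

    assert(page != NULL);
    memset(page, 0, TOY_BLCKSZ);
    memcpy(page, &hdr, sizeof(hdr));
    hdr.checksum = toy_checksum(page, blkno);
    memcpy(page, &hdr, sizeof(hdr));
}

/*
 * accept_all_zero = true models a verifier that must tolerate zero-filled
 * new pages; false models one that expects every page to carry a header.
 */
static bool
toy_page_verify(const unsigned char *page, uint32_t blkno,
                bool accept_all_zero)
{
    ToyPageHeader hdr;

    if (toy_page_is_all_zero(page))
        return accept_all_zero;
    memcpy(&hdr, page, sizeof(hdr));
    return hdr.checksum == toy_checksum(page, blkno);
}
```

With this sketch, a page written by toy_page_init() and later replaced
by zeroes passes toy_page_verify(..., true) but fails
toy_page_verify(..., false). The crash case described above is exactly
why the stricter variant isn't safe without more WAL logging: the file
can legitimately end in zeroes that nobody wrote.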

Greetings,

Andres Freund

