Re: Enabling Checksums

From Heikki Linnakangas
Subject Re: Enabling Checksums
Date
Msg-id 5139A377.1040905@vmware.com
In response to Re: Enabling Checksums  (Bruce Momjian <bruce@momjian.us>)
Responses Re: Enabling Checksums  (Greg Smith <greg@2ndQuadrant.com>)
List pgsql-hackers
On 08.03.2013 05:31, Bruce Momjian wrote:
> Also, don't all modern storage drives have built-in checksums, and
> report problems to the system administrator?  Does smartctl help report
> storage corruption?
>
> Let me take a guess at answering this --- we have several layers in a
> database server:
>
>     1 storage
>     2 storage controller
>     3 file system
>     4 RAM
>     5 CPU
>
> My guess is that storage checksums only cover layer 1, while our patch
> covers layers 1-3, and probably not 4-5 because we only compute the
> checksum on write.

There is a thing called "Data Integrity Field" and/or "Data Integrity 
Extensions", which allows storing a checksum with each disk sector and 
verifying the checksum in each layer. The basic idea is that instead of 
512 byte sectors, the drive is formatted to use 520 byte sectors, with 
the extra 8 bytes used for the checksum and some other metadata. That 
gets around the problem we have in PostgreSQL, and that filesystems 
have, which is that you need to store the checksum somewhere along with 
the data.
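
To make the 520-byte sector idea concrete, here is a rough sketch in Python. The split of the extra 8 bytes (a 2-byte guard CRC, a 2-byte application tag, a 4-byte reference tag holding the low bits of the block address) and the CRC polynomial 0x8BB7 are my understanding of the T10 DIF format, not something stated above, so treat them as assumptions. The function names are made up for illustration.

```python
import struct

SECTOR_DATA = 512   # payload bytes per sector
DIF_SIZE = 8        # extra protection bytes per sector

def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16, polynomial 0x8BB7, initial value 0, no reflection
    (assumed to match the T10 DIF guard tag)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def pack_sector(data: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Append guard tag, application tag and reference tag to 512 data bytes."""
    if len(data) != SECTOR_DATA:
        raise ValueError("expected exactly 512 bytes of data")
    guard = crc16_t10dif(data)
    # Big-endian: 2-byte guard, 2-byte app tag, 4-byte reference tag.
    return data + struct.pack(">HHI", guard, app_tag, lba & 0xFFFFFFFF)

def unpack_sector(sector: bytes):
    """Split a 520-byte sector into (data, guard, app_tag, ref_tag)."""
    if len(sector) != SECTOR_DATA + DIF_SIZE:
        raise ValueError("expected exactly 520 bytes")
    guard, app_tag, ref_tag = struct.unpack(">HHI", sector[SECTOR_DATA:])
    return sector[:SECTOR_DATA], guard, app_tag, ref_tag

def verify_sector(sector: bytes, expected_lba: int) -> bool:
    """True if the guard CRC matches the data and the reference tag matches the LBA."""
    data, guard, _app_tag, ref_tag = unpack_sector(sector)
    return guard == crc16_t10dif(data) and ref_tag == (expected_lba & 0xFFFFFFFF)
```

The point is only that the checksum travels next to the data in the same sector, so nothing inside the 512 data bytes has to be reserved for it.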

When a write I/O request is made in the OS, the OS calculates the 
checksum and passes it through the controller to the drive. The drive 
verifies the checksum, and aborts the I/O request if it doesn't match. 
On a read, the checksum is read from the drive along with the actual 
data, passed through the controller, and the OS verifies it. This covers 
layers 1-2 or 1-3.
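
A toy simulation of that write and read path, in Python. Everything here is hypothetical: `FakeDrive`, `transfer`, `os_write` and `os_read` are names invented for the sketch, and `binascii.crc_hqx` (a CRC-CCITT) is only a stand-in for whatever guard checksum the real hardware uses.

```python
import binascii

class IntegrityError(Exception):
    """Raised when data and its guard checksum disagree."""

def checksum(data: bytes) -> int:
    # Stand-in 16-bit CRC; not the polynomial real DIF hardware uses.
    return binascii.crc_hqx(data, 0)

class FakeDrive:
    """Hypothetical drive that verifies the guard before storing a sector."""
    def __init__(self):
        self.sectors = {}

    def write(self, lba: int, data: bytes, guard: int) -> None:
        if checksum(data) != guard:
            raise IntegrityError("drive: guard mismatch, write aborted")
        self.sectors[lba] = (data, guard)

    def read(self, lba: int):
        return self.sectors[lba]

def transfer(data: bytes, guard: int, corrupt: bool = False):
    """Models the controller hop; optionally flips one bit in the data."""
    if corrupt:
        data = bytes([data[0] ^ 0x01]) + data[1:]
    return data, guard

def os_write(drive: FakeDrive, lba: int, data: bytes,
             corrupt_in_transit: bool = False) -> None:
    guard = checksum(data)                      # OS computes the checksum
    data, guard = transfer(data, guard, corrupt_in_transit)
    drive.write(lba, data, guard)               # drive verifies or aborts

def os_read(drive: FakeDrive, lba: int,
            corrupt_in_transit: bool = False) -> bytes:
    data, guard = drive.read(lba)               # checksum comes back with the data
    data, guard = transfer(data, guard, corrupt_in_transit)
    if checksum(data) != guard:                 # OS verifies on the way in
        raise IntegrityError("os: guard mismatch on read")
    return data
```

A bit flipped between the OS and the drive is caught by the drive on write and by the OS on read. Corruption that happens in RAM before the OS computes the checksum is not caught, which is why this stops at layer 3.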

Now, this requires all the components to have support for that. I'm not 
an expert on these things, but I'd guess that that's a tall order today. 
I don't know which hardware vendors and kernel versions support that. 
But things usually keep improving, and hopefully in a few years, you can 
easily buy a hardware stack that supports DIF all the way through.

In theory, the OS could also expose the DIF field to the application, so 
that you get end-to-end protection from the application to the disk. 
This means that the application somehow gets access to those extra bytes 
in each sector, and you have to calculate and verify the checksum in the 
application. There are no standard APIs for that yet, though.

See https://www.kernel.org/doc/Documentation/block/data-integrity.txt.

- Heikki


