Re: CRCs (was: beta testing version)
From | ncm@zembu.com (Nathan Myers) |
---|---|
Subject | Re: CRCs (was: beta testing version) |
Date | |
Msg-id | 20001207122541.A30335@store.zembu.com |
In response to | Re: CRCs (was: beta testing version) (Bruce Guenter <bruceg@em.ca>) |
Responses | Re: CRCs (was: beta testing version) (Bruce Guenter <bruceg@em.ca>) |
List | pgsql-hackers |
On Wed, Dec 06, 2000 at 06:53:37PM -0600, Bruce Guenter wrote:
> On Wed, Dec 06, 2000 at 11:08:00AM -0800, Nathan Myers wrote:
> > On Wed, Dec 06, 2000 at 11:49:10AM -0600, Bruce Guenter wrote:
> > >
> > > I don't know how pgsql does it, but the only safe way I know of
> > > is to include an "end" marker after each record.
> >
> > An "end" marker is not sufficient, unless all writes are done in
> > one-sector units with an fsync between, and the drive buffering
> > is turned off.
>
> That's why an end marker must follow all valid records. When you write
> records, you don't touch the marker, and add an end marker to the end of
> the records you've written. After writing and syncing the records, you
> rewrite the end marker to indicate that the data following it is valid,
> and sync again. There is no state in that sequence in which
> partially-written data could be confused as real data, assuming either
> your drives aren't doing write-back caching or you have a UPS, and fsync
> doesn't return until the drives return success.

That requires an extra out-of-sequence write.

> > > Any other way I've seen discussed (here and elsewhere) either
> > > - Assume that a CRC is a guarantee.
> >
> > We are already assuming a CRC is a guarantee.
> >
> > The drive computes a CRC for each sector, and if the CRC is OK the
> > drive is happy. CRC errors within the drive are quite frequent, and
> > the drive re-reads when a bad CRC comes up.
>
> The kind of data failures that a CRC is guaranteed to catch (N-bit
> errors) are almost precisely those that a mis-read on a hardware sector
> would cause.

They catch a single mis-read, but not necessarily the quite likely
double mis-read.

> > > ... A CRC would be a good addition to
> > > help ensure the data wasn't broken by flakey drive firmware, but
> > > doesn't guarantee consistency.
> >
> > No, a CRC would be a good addition to compensate for sector write
> > reordering, which is done both by the OS and by the drive, even for
> > "atomic" writes.
>
> But it doesn't guarantee consistency, even in that case. There is a
> possibility (however small) that the random data that was located in
> the sectors before the write will match the CRC.

Generally, there are no guarantees, only reasonable expectations. A
64-bit CRC would give sufficient confidence without the out-of-sequence
write, and also detect corruption from any source including power outage.

(I'd also like to see CRCs on all the table blocks as well; is there
a place to put them?)

Nathan Myers
ncm@zembu.com
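
The CRC-per-record idea argued for in this message can be illustrated with a minimal Python sketch. It is not PostgreSQL's log format: the record layout (4-byte length, payload, 8-byte CRC), the helper names `crc64`, `encode_record` and `read_valid_records`, and the all-ones starting value are assumptions made for illustration. The polynomial is the one from ECMA-182. The simulated failure is the reordering case discussed in the thread: the end of a record, including its CRC, reached disk, but an earlier part did not and still holds stale bytes.

```python
import struct

POLY = 0x42F0E1EBA9EA3693  # CRC-64 polynomial from ECMA-182
MASK = 0xFFFFFFFFFFFFFFFF


def crc64(data: bytes) -> int:
    """Bitwise, MSB-first CRC-64; slow, but easy to check by eye."""
    crc = MASK  # assumed all-ones start, so a run of zero bytes never gives CRC 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ POLY) & MASK
            else:
                crc = (crc << 1) & MASK
    return crc


def encode_record(payload: bytes) -> bytes:
    """Frame a payload as: 4-byte length, payload, 8-byte CRC over both."""
    header = struct.pack(">I", len(payload))
    return header + payload + struct.pack(">Q", crc64(header + payload))


def read_valid_records(log: bytes) -> list:
    """Return payloads up to the first record that is truncated or fails its CRC."""
    records = []
    pos = 0
    while pos + 4 <= len(log):
        (length,) = struct.unpack_from(">I", log, pos)
        end = pos + 4 + length + 8
        if end > len(log):
            break  # record runs past the end of the log
        body = log[pos:pos + 4 + length]
        (stored,) = struct.unpack_from(">Q", log, pos + 4 + length)
        if crc64(body) != stored:
            break  # torn or corrupted record: stop replay here
        records.append(log[pos + 4:pos + 4 + length])
        pos = end
    return records


# Simulate reordered sector writes: the tail of the second record (with its
# CRC) is on disk, but the first 8 payload bytes still hold stale zeros.
good = encode_record(b"insert row 1")
torn = bytearray(encode_record(b"insert row 2"))
torn[4:12] = b"\x00" * 8
recovered = read_valid_records(good + bytes(torn))
print(recovered)
```

No second marker write is needed in this scheme: a record is accepted only if its own CRC matches, so replay simply stops at the first record that fails the check. As the message says, this gives a reasonable expectation rather than a guarantee, since stale bytes could in principle happen to match.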