Re: CRCs (was: beta testing version)

From: ncm@zembu.com (Nathan Myers)
Subject: Re: CRCs (was: beta testing version)
Msg-id: 20001207122541.A30335@store.zembu.com
In reply to: Re: CRCs (was: beta testing version)  (Bruce Guenter <bruceg@em.ca>)
Responses: Re: CRCs (was: beta testing version)  (Bruce Guenter <bruceg@em.ca>)
List: pgsql-hackers

On Wed, Dec 06, 2000 at 06:53:37PM -0600, Bruce Guenter wrote:
> On Wed, Dec 06, 2000 at 11:08:00AM -0800, Nathan Myers wrote:
> > On Wed, Dec 06, 2000 at 11:49:10AM -0600, Bruce Guenter wrote:
> > > 
> > > I don't know how pgsql does it, but the only safe way I know of
> > > is to include an "end" marker after each record.
> > 
> > An "end" marker is not sufficient, unless all writes are done in
> > one-sector units with an fsync between, and the drive buffering 
> > is turned off.
> 
> That's why an end marker must follow all valid records.  When you write
> records, you don't touch the marker, and add an end marker to the end of
> the records you've written.  After writing and syncing the records, you
> rewrite the end marker to indicate that the data following it is valid,
> and sync again.  There is no state in that sequence in which partially-
> written data could be confused as real data, assuming either your drives
> aren't doing write-back caching or you have a UPS, and fsync doesn't
> return until the drives return success.

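That requires an extra out-of-sequence write: after the new records are
synced, the old end marker has to be rewritten in place and synced again.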
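
To make the cost concrete, here is a minimal sketch of the end-marker
sequence described in the quoted text. The one-byte END/VALID tags, the
length-prefixed record layout, and the function names are all hypothetical,
chosen only for illustration; this is not how pgsql writes its log. It also
assumes, as the quoted text does, that fsync does not return until the data
is really on the platter.

```python
import os
import struct

END, VALID = b"\x00", b"\x01"   # hypothetical one-byte markers
LENGTH = struct.Struct("<I")    # hypothetical 4-byte record length prefix


def create_log(path):
    """Create a log that holds nothing but an end marker."""
    with open(path, "wb") as f:
        f.write(END)
        f.flush()
        os.fsync(f.fileno())


def _find_end_marker(f):
    """Walk the valid records and return the offset of the end marker."""
    f.seek(0)
    while f.read(1) == VALID:
        (length,) = LENGTH.unpack(f.read(LENGTH.size))
        f.seek(length, os.SEEK_CUR)
    return f.tell() - 1


def append_record(path, payload):
    with open(path, "r+b") as f:
        marker_pos = _find_end_marker(f)
        # Step 1: write the record and a fresh end marker AFTER the old
        # marker, leaving the old marker untouched, then sync.
        f.seek(marker_pos + 1)
        f.write(LENGTH.pack(len(payload)) + payload + END)
        f.flush()
        os.fsync(f.fileno())
        # Step 2: the out-of-sequence write. Go back, flip the old end
        # marker to VALID, and sync a second time.
        f.seek(marker_pos)
        f.write(VALID)
        f.flush()
        os.fsync(f.fileno())


def read_records(path):
    """Return every record that precedes the end marker."""
    records = []
    with open(path, "rb") as f:
        while f.read(1) == VALID:
            (length,) = LENGTH.unpack(f.read(LENGTH.size))
            records.append(f.read(length))
    return records
```

Each append costs two syncs and a seek backwards. A crash between step 1
and step 2 leaves the old end marker in place, so the half-finished record
is simply ignored and later overwritten.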

> > > Any other way I've seen discussed (here and elsewhere) either
> > > - Assume that a CRC is a guarantee.  
> > 
> > We are already assuming a CRC is a guarantee.  
> >
> > The drive computes a CRC for each sector, and if the CRC is OK the 
> > drive is happy.  CRC errors within the drive are quite frequent, and 
> > the drive re-reads when a bad CRC comes up.
> 
> The kind of data failures that a CRC is guaranteed to catch (N-bit
> errors) are almost precisely those that a mis-read on a hardware sector
> would cause.

They catch a single mis-read, but not necessarily the quite likely
double mis-read.

> > >   ... A CRC would be a good addition to
> > >   help ensure the data wasn't broken by flakey drive firmware, but
> > >   doesn't guarantee consistency.
> > 
> > No, a CRC would be a good addition to compensate for sector write
> > reordering, which is done both by the OS and by the drive, even for 
> > "atomic" writes.
> 
> But it doesn't guarantee consistency, even in that case.  There is a
> possibility (however small) that the random data that was located in 
> the sectors before the write will match the CRC.

Generally, there are no guarantees, only reasonable expectations.  A 
64-bit CRC would give sufficient confidence without the out-of-sequence
write, and also detect corruption from any source including power outage.
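
As a rough illustration of the alternative, the sketch below appends each
record with a 64-bit CRC and, on recovery, stops at the first record whose
CRC does not match. Everything here is an assumption made for the example:
the CRC-64/ECMA-182 polynomial (the thread names none), the header layout,
the function names, and the in-memory bytearray standing in for the log
file. If stale sector contents are treated as uniformly random, the chance
of them matching a 64-bit CRC is about 1 in 2^64.

```python
import struct

POLY = 0x42F0E1EBA9EA3693       # CRC-64/ECMA-182 polynomial (assumed choice)
MASK = 0xFFFFFFFFFFFFFFFF
HEADER = struct.Struct("<IQ")   # hypothetical header: payload length, CRC-64


def crc64(data):
    """Bitwise CRC-64, MSB first, zero initial value. Slow but simple."""
    crc = 0
    for byte in data:
        crc ^= byte << 56
        for _ in range(8):
            if crc & (1 << 63):
                crc = ((crc << 1) ^ POLY) & MASK
            else:
                crc = (crc << 1) & MASK
    return crc


def append_record(log, payload):
    """Append header and payload in one sequential write; no seek back."""
    log += HEADER.pack(len(payload), crc64(payload)) + payload


def read_valid_records(log):
    """Return records up to the first missing, torn, or corrupt one."""
    records, pos = [], 0
    while pos + HEADER.size <= len(log):
        length, stored = HEADER.unpack_from(log, pos)
        if length == 0:
            break  # zero-filled space would otherwise pass a zero CRC
        start = pos + HEADER.size
        payload = bytes(log[start:start + length])
        if len(payload) != length or crc64(payload) != stored:
            break  # partially written or stale data: treat as end of log
        records.append(payload)
        pos = start + length
    return records
```

The writer only ever appends, so there is one sync per batch and no
backward seek. The same check flags a record damaged after it was written,
which the end marker alone cannot do.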

(I'd also like to see CRCs on all the table blocks; is there a place
to put them?)

Nathan Myers
ncm@zembu.com


