Greg Smith wrote:
> Scott Carey wrote:
>> For your database DATA disks, leaving the write cache on is 100%
>> acceptable,
>> even with power loss, and without a RAID controller. And even in
>> high write
>> environments.
>>
>> That is what the XLOG is for, isn't it? That is where this behavior is
>> critical. But that has completely different performance requirements
>> and
>> need not bee on the same volume, array, or drive.
>>
> At checkpoint time, writes to the main data files are done that are
> followed by fsync calls to make sure those blocks have been written to
> disk. Those writes have exactly the same consistency requirements as
> the more frequent pg_xlog writes. If the drive ACKs the write, but
> it's not on physical disk yet, it's possible for the checkpoint to
> finish and the underlying pg_xlog segments needed to recover from a
> crash at that point to be deleted. The end of the checkpoint can wipe
> out many WAL segments, presuming they're not needed anymore because
> the data blocks they were intended to fix during recovery are now
> guaranteed to be on disk.
Guys, read that again.
IF THE DISK OR DRIVER ACK'S A FSYNC CALL THE WAL ENTRY IS LIKELY GONE,
AND YOU ARE SCREWED IF THE DATA IS NOT REALLY ON THE DISK.
-- Karl