Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> As I've said before, I think we need to find a way to stop using sync()
>> altogether --- we have to move to fsync or O_SYNC and variants. sync
>> has simply got the wrong API.
> If sync fails (kernel to disk write fails) we have a hardware failure,
> and we don't pretend to recover from that,
Not necessarily --- it could be out-of-disk-space, on at least some
filesystems. More to the point, the important thing is not to commit a
checkpoint record to WAL indicating that everything is good, when
everything is not good. As long as we don't checkpoint we have some
hope of recovering automatically via WAL replay.
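
As a rough illustration of that rule (a sketch only, not PostgreSQL's actual
checkpoint code; flush_data_files and try_checkpoint are made-up names, and
the one-line "CHECKPOINT" record is a stand-in for a real WAL record), the
checkpoint record is written only after every data file has been flushed
with a call that can report failure:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: fsync each data file descriptor in turn.
 * Returns 0 only if every fsync() succeeded, -1 on the first failure. */
static int flush_data_files(const int *fds, int nfds)
{
    assert(nfds >= 0);
    for (int i = 0; i < nfds; i++) {
        if (fsync(fds[i]) != 0) {
            perror("fsync");    /* e.g. ENOSPC or EIO */
            return -1;
        }
    }
    return 0;
}

/* Hypothetical checkpoint routine.
 * Returns 1 if the checkpoint record was written and flushed,
 *         0 if the checkpoint was skipped because a data flush failed,
 *        -1 if writing or flushing the log itself failed. */
static int try_checkpoint(const int *data_fds, int nfds, int wal_fd)
{
    static const char record[] = "CHECKPOINT\n";   /* placeholder record */
    const size_t len = sizeof record - 1;

    if (flush_data_files(data_fds, nfds) != 0)
        return 0;               /* no checkpoint: WAL replay is still possible */
    if (write(wal_fd, record, len) != (ssize_t) len)
        return -1;
    if (fsync(wal_fd) != 0)
        return -1;
    return 1;
}
```

With sync() there is nothing to put in the place of the flush_data_files
test, which is the whole problem.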
> One idea I floated around was to
> open/write/fsync/close a temporary file after sync in the hope that it
> would happen after the sync completes because the fsync would be at the
> end of the disk flush queue.
"In the hope"? We already have a guess-and-hope approach to this, and
it will never be any better as long as we use sync(), because sync() is
fundamentally the wrong operation. It doesn't tell you when the I/O is
done, and it doesn't tell you whether the I/O was done successfully, and
there is no possibility of working around that fundamental lack of
information except to stop using it.
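
To make the API difference concrete, here is a minimal sketch assuming a
POSIX system. write_durably is a hypothetical name used only for
illustration, not an existing function. The point is that fsync() hands back
a status the caller can act on, while sync() is declared as returning void:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* For comparison, the two POSIX prototypes:
 *   void sync(void);    -- no return value: cannot report completion or errors
 *   int  fsync(int fd); -- returns 0 on success, -1 with errno set on failure
 */

/* Hypothetical helper: write len bytes to path and report success only if
 * both the write and the fsync succeeded. Returns 0 or -1 (errno set). */
static int write_durably(const char *path, const void *buf, size_t len)
{
    assert(path != NULL && (buf != NULL || len == 0));

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            int saved = errno;
            close(fd);
            errno = saved;
            return -1;
        }
        p += n;
        len -= (size_t) n;
    }

    if (fsync(fd) != 0) {       /* the failure sync() could never report */
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }
    return close(fd);
}
```

Opening the file with O_SYNC would be the other variant: each write() then
returns only after the data has been handed to storage, and an error shows
up as a failed write() rather than a failed fsync().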
regards, tom lane