> > Even with true fdatasync it's not obviously good for performance - it takes
> > too long to write 16Mb files and fills the OS buffer cache with trash-:(
> >>
> >> True. But at least the write is (hopefully) being done at a
> >> non-performance-critical time.
>
> > So you have non critical time every five minutes ?
> > Those platforms that don't have fdatasync won't profit anyway.
>
> Yes they will; you're forgetting the cost of updating
> filesystem overhead.
I did have that in mind, but I thought that in effect the OS would
optimize sparse file allocation somehow.
Doing some tests, however, showed that while your variant is really good
and saves 12 seconds, the performance is *very* poor for either variant.
A short test shows that opening the file with O_SYNC, and thus avoiding
fsync(), would cut the effective time needed to sync write the xlog by
more than half.
Of course we would need to buffer >= 1 xlog page before write (or commit)
to gain the full advantage.
prewrite 0  + write and fsync:    60.4 sec
sparse file + write with O_SYNC:  37.5 sec
no prewrite + write with O_SYNC:  36.8 sec
prewrite 0  + write with O_SYNC:  24.0 sec
These times include the prewrite where applicable, and were measured on AIX with jfs.
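
For anyone reading without the attachment, here is a rough sketch of the
kind of comparison described above. It is not the attached test program:
the function names, the 8k page size and the page-at-a-time write loop
are all assumptions made for illustration, and the sparse file variant
is left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define XLOG_PAGE 8192  /* assumed page size, not taken from the test program */

/* Write npages pages filled with `fill`; optionally fsync() after each write. */
int write_pages(int fd, size_t npages, int fill, int fsync_each)
{
    char page[XLOG_PAGE];

    memset(page, fill, sizeof page);
    for (size_t i = 0; i < npages; i++) {
        if (write(fd, page, sizeof page) != (ssize_t) sizeof page)
            return -1;
        if (fsync_each && fsync(fd) != 0)
            return -1;
    }
    return 0;
}

/* "prewrite 0": fill the file with zeros so all blocks are allocated up front. */
int prewrite_zeros(const char *path, size_t npages)
{
    assert(path != NULL);
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    int rc = write_pages(fd, npages, 0, 0);
    if (rc == 0 && fsync(fd) != 0)
        rc = -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* The part to time: write through an O_SYNC descriptor, or write() + fsync(). */
int sync_write(const char *path, size_t npages, int use_osync)
{
    assert(path != NULL);
    int flags = O_WRONLY | O_CREAT | (use_osync ? O_SYNC : 0);
    int fd = open(path, flags, 0600);
    if (fd < 0)
        return -1;
    int rc = write_pages(fd, npages, 'x', !use_osync);
    if (close(fd) != 0)
        rc = -1;
    return rc;
}

/* File size in bytes, or -1 on error. */
long file_size(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0 ? (long) st.st_size : -1;
}
```

Timing sync_write(path, 2048, 1) against sync_write(path, 2048, 0), with
and without a preceding prewrite_zeros(path, 2048), covers three of the
four cases listed; 2048 pages of 8k make one 16Mb file.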
Test program attached. I may be overlooking something, though.
Andreas