Ron Mayer wrote:
> Bruce Momjian wrote:
> > Greg Smith wrote:
> >> A good test program that is a bit better at introducing and detecting
> >> the write cache issue is described at
> >> http://brad.livejournal.com/2116715.html
> >
> > Wow, I had not seen that tool before. I have added a link to it from
> > our documentation, and also added a mention of our src/tools/fsync test
> > tool to our docs.
>
> One challenge with many of these test programs is that some
> filesystems (ext3 is one) will flush drive caches on fsync()
> *sometimes*, but not always. If your test program happens to do
> a sequence of commands that makes an fsync() actually flush a
> disk's caches, it might mislead you if your actual application
> has a different series of system calls.
>
> For example, ext3 fsync() will issue write barrier commands
> if the inode was modified; but not if the inode wasn't.
>
> See test program here:
> http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg272253.html
> and read two paragraphs further to see how touching
> the inode makes ext3 fsync behave differently.

I thought our only problem was testing the I/O subsystem --- I never
suspected the file system might lie too. That email indicates that a
large percentage of our install base is running on unreliable file
systems --- why have I not heard about this before? Do the write
barriers allow data loss but prevent data inconsistency? It sounds like
they are effectively running with synchronous_commit = off.
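
To make the concern concrete, here is a rough sketch of how one might
compare the two cases Ron describes. It is not the test program from the
linked kernel thread; the function names, block size, and iteration
count are my own placeholders. It assumes that appending to a file
modifies the inode while rewriting an existing block in place mostly
does not, which may not hold on every filesystem or mount option.

```python
import os
import tempfile
import time

BLOCK = b"x" * 8192  # one 8 kB block; an arbitrary size chosen for this sketch


def time_fsyncs(path, count, grow):
    """Return per-call fsync() latencies in seconds.

    grow=False rewrites the same block in place (file size stays fixed);
    grow=True appends a new block each time, so the file size changes.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Write the first block up front so in-place rewrites never extend the file.
        os.pwrite(fd, BLOCK, 0)
        os.fsync(fd)
        latencies = []
        for i in range(count):
            offset = (i + 1) * len(BLOCK) if grow else 0
            os.pwrite(fd, BLOCK, offset)
            start = time.perf_counter()
            os.fsync(fd)
            latencies.append(time.perf_counter() - start)
        return latencies
    finally:
        os.close(fd)


def median(values):
    """Return the middle element of the sorted values."""
    ordered = sorted(values)
    return ordered[len(ordered) // 2]


if __name__ == "__main__":
    # Create the probe file under the current directory, so run this
    # from the filesystem you want to examine.
    with tempfile.TemporaryDirectory(dir=".") as tmp:
        for grow in (False, True):
            lat = time_fsyncs(os.path.join(tmp, "probe"), 100, grow)
            label = "append" if grow else "overwrite"
            print(f"{label:9s} median fsync: {median(lat) * 1000:.3f} ms")
```

If the overwrite case comes back much faster than the append case on the
same drive, that would be consistent with the cache flush being skipped
when the inode is untouched, though timing alone cannot prove it; a
pull-the-plug test is still the real check.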
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +