Discussion: Re: [pgsql-hackers-win32] win32 performance - fsync question
> > > > * Win32, with fsync, write-cache disabled: no data corruption
> > > > * Win32, with fsync, write-cache enabled: no data corruption
> > > > * Win32, with osync, write cache disabled: no data corruption
> > > > * Win32, with osync, write cache enabled: no data corruption.
> > > > Once I got:
> > > > 2005-02-24 12:19:54 LOG: could not open file "C:/Program
> > > > Files/PostgreSQL/8.0/data/pg_xlog/000000010000000000000010"
> > > > (log file 0, segment 16): No such file or directory
> > > > but the data in the database was consistent.
> > >
> > > It disturbs me that you couldn't produce data corruption in the
> > > cases where it theoretically should occur. Seems like this is an
> > > indication that your test was insufficiently severe, or that there
> > > is something going on we don't understand.
> >
> > The Windows driver knows about the write cache, and at least fsync()
> > pushes through the write cache even if it's there. This seems to
> > indicate that O_SYNC at least partially does this as well. This is
> > why there is no performance difference at all on fsync() with write
> > cache on or off.
> >
> > I don't know if this is true for all IDE disks. Could be that my disk
> > is particularly well-behaved.
>
> This indicated to me that open_sync did not require any
> additional changes than our current fsync.

fsync and open_sync both write through the write cache in the operating
system. Only fsync=off turns this off.

fsync also writes through the hardware write cache. o_sync does not.
This is what causes the large slowdown with write cache enabled,
*including* most battery backed write cache systems (pretty much making
the write-cache a waste of money). This may be a good thing on IDE
systems (for admins that don't know how to remove the little check in
the box for "enable write caching on the disk" that MS provides, which
*explicitly* warns that you may lose data if you enabled it), but it's a
very bad thing for anything higher end.

fsync also syncs the directory metadata. o_sync only cares about the
files contents. (This is what causes the large slowdown with write cache
*disabled*, because it requires multiple writes on multiple disk
locations for each fsync).

Basically, fsync hurts people who configure their box correctly, or who
use things like SCSI disks. o_sync hurts people who configure their
machine in an unsafe way.

//Magnus
Magnus Hagander wrote:
> > This indicated to me that open_sync did not require any
> > additional changes than our current fsync.
>
> fsync and open_sync both write through the write cache in the operating
> system. Only fsync=off turns this off.
>
> fsync also writes through the hardware write cache. o_sync does not.
> This is what causes the large slowdown with write cache enabled,
> *including* most battery backed write cache systems (pretty much making
> the write-cache a waste of money). This may be a good thing on IDE
> systems (for admins that don't know how to remove the little check in
> the box for "enable write caching on the disk" that MS provides, which
> *explicitly* warns that you may lose data if you enabled it), but it's a
> very bad thing for anything higher end.
I found the checkbox on XP looking at "Properties" for the drive, then
choosing "Hardware", the drive, "Properties", and "Policies".
> fsync also syncs the directory metadata. o_sync only cares about the
> files contents. (This is what causes the large slowdown with write cache
> *disabled*, because it requires multiple writes on multiple disk
> locations for each fsync).
>
> Basically, fsync hurts people who configure their box correctly, or who
> use things like SCSI disks. o_sync hurts people who configure their
> machine in an unsafe way.
So, it seems that Win32 open_sync is exactly the same as our
"wal_sync_method = open_datasync" on Unix (it needs to be renamed), and
"wal_sync_method = fsync" on Win32 is something we don't have that
writes through the disk write cache even if it is enabled.
I have developed the following patch which renames our wal_sync_method
Win32 support from open_sync to open_datasync:
ftp://candle.pha.pa.us/pub/postgresql/mypatches
One issue with this patch is that if applied it would make open_datasync
the default sync method on Win32 because we prefer open_datasync over
all other sync methods. If we don't want to do that, I think we should
still do the rename for accuracy and add a !WIN32 test to prevent
open_datasync from being the default.
However, I do prefer this patch and let Win32 have the same write cache
issues as Unix, for consistency.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
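As a rough illustration of the two styles of WAL flushing compared in this thread, here is a minimal POSIX C sketch. It is not PostgreSQL source and not Win32 code: the function names write_then_fsync and write_with_dsync are hypothetical, and whether either call actually pushes data through a drive's hardware write cache depends on the OS, driver, and disk, which is the very point under debate in the messages above.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DSYNC
#define O_DSYNC O_SYNC          /* assumption: fall back where O_DSYNC is missing */
#endif

/*
 * Style A ("wal_sync_method = fsync" in spirit): ordinary buffered write,
 * followed by an explicit fsync() call to request a flush.
 * Returns 0 on success, -1 on any failure.
 */
int
write_then_fsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}

/*
 * Style B ("wal_sync_method = open_datasync" in spirit): the file is opened
 * with O_DSYNC, so each write() returns only once the OS reports the file
 * data as written. No separate flush call is made.
 * Returns 0 on success, -1 on any failure.
 */
int
write_with_dsync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_DSYNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, buf, len) != (ssize_t) len)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Both functions give the same result on disk when nothing fails. The difference lies only in when and how the flush is requested, which is why the naming of the corresponding wal_sync_method values matters.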
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> However, I do prefer this patch and let Win32 have the same write cache
> issues as Unix, for consistency.
I agree that the open flag is more nearly O_DSYNC than O_SYNC.
ISTM Windows' idea of fsync is quite different from Unix's and therefore
we should name the wal_sync_method that invokes it something different
than fsync. "write_through" or some such? We already have precedent
that not all wal_sync_method values are available on all platforms.
I'm not taking a position on which the default should be ...
regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > However, I do prefer this patch and let Win32 have the same write cache
> > issues as Unix, for consistency.
>
> I agree that the open flag is more nearly O_DSYNC than O_SYNC.
>
> ISTM Windows' idea of fsync is quite different from Unix's and therefore
> we should name the wal_sync_method that invokes it something different
> than fsync. "write_through" or some such? We already have precedent
> that not all wal_sync_method values are available on all platforms.
>
> I'm not taking a position on which the default should be ...
Yes, I am thinking that too. I hesitated because it adds yet another
sync method, and we have to document it works only on Win32, but I see
no better solution.
I am going to let the Win32 users mostly vote on what the default should
be.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> we should name the wal_sync_method that invokes it something different
>> than fsync. "write_through" or some such? We already have precedent
>> that not all wal_sync_method values are available on all platforms.
> Yes, I am thinking that too. I hesitated because it adds yet another
> sync method, and we have to document it works only on Win32, but I see
> no better solution.
It occurs to me that it'd probably be a good idea if the error message
for an unsupported wal_sync_method value explicitly listed the allowed
values for the platform. If there's no objection I'll try to make
that happen. (I'm not sure if it's trivial or not: I think the GUC
framework is a bit restrictive about custom error messages from GUC
assign hooks...)
regards, tom lane
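The error-message idea in the last message could look roughly like the following C sketch. It is a standalone illustration only: check_sync_method is a hypothetical helper, the message wording is invented, and the code does not use the real GUC assign-hook interface, which may impose restrictions this sketch ignores. The caller passes in the list of values allowed on the platform.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical validator: returns 1 if "value" is one of the n entries in
 * "allowed". Otherwise returns 0 and writes into errbuf a message that
 * lists every allowed value, truncated if errbuf is too small.
 */
int
check_sync_method(const char *value, const char *const *allowed, size_t n,
                  char *errbuf, size_t errlen)
{
    size_t i;
    size_t used;
    int    rc;

    for (i = 0; i < n; i++)
        if (strcmp(value, allowed[i]) == 0)
            return 1;

    rc = snprintf(errbuf, errlen,
                  "unrecognized wal_sync_method \"%s\"; "
                  "allowed values on this platform:", value);
    used = (rc < 0) ? errlen : (size_t) rc;

    /* Append each allowed value, comma-separated, while space remains. */
    for (i = 0; i < n && used < errlen; i++)
    {
        rc = snprintf(errbuf + used, errlen - used, "%s %s",
                      (i > 0) ? "," : "", allowed[i]);
        if (rc < 0)
            break;
        used += (size_t) rc;
    }
    return 0;
}
```

Called with the value "bogus" and an allowed list of "fsync" and "open_datasync", the buffer ends up holding: unrecognized wal_sync_method "bogus"; allowed values on this platform: fsync, open_datasync. In a real build the allowed list would presumably be assembled from the same platform tests that decide which methods are compiled in.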