Re: Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options

From: Bruce Momjian
Subject: Re: Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
Msg-id: 200409131438.i8DEc8r04384@candle.pha.pa.us
In reply to: Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options (mudfoot@rawbw.com)
List: pgsql-performance

Have you seen /src/tools/fsync?

---------------------------------------------------------------------------

mudfoot@rawbw.com wrote:
> Hi, I'd like to help with the topic in the Subject: line.  It seems to be a
> TODO item.  I've reviewed some threads discussing the matter, so I hope I've
> acquired enough history concerning it.  I've taken an initial swipe at
> figuring out how to optimize sync'ing methods.  It's based largely on
> recommendations I've read on previous threads about fsync/O_SYNC and so on.
> After reviewing, if anybody has recommendations on how to proceed then I'd
> love to hear them.
>
> Attached is a little program that basically does a bunch of sequential writes
> to a file.  All of the sync'ing methods supported by PostgreSQL WAL can be
> used.  Results are printed in microseconds.  Size and quantity of writes are
> configurable.  The documentation is in the code (how to configure, build, run,
> etc.).  I realize that this program doesn't reflect all of the possible
> activities of a production database system, but I hope it's a step in the
> right direction for this task.  I've used it to see differences in behavior
> between the various sync'ing methods on various platforms.
>
> Here's what I've found running the benchmark on some systems to which
> I have access.  The differences in behavior between platforms are quite vast.
>
> Summary first...
>
> <halfjoke>
> PostgreSQL should be run on an old Apple Macintosh attached to
> its own Hitachi disk array with 2GB cache or so.  Use any sync method
> except for fsync().
> </halfjoke>
>
> Anyway, there is *a lot* of variance in file syncing behavior across
> different hardware and O/S platforms.  It's probably not safe
> to conclude much.  That said, here are some findings so far based on
> tests I've run:
>
> 1.  under no circumstances do fsync() or fdatasync() seem to perform
> better than opening files with O_SYNC or O_DSYNC
> 2.  where there are differences, opening files with O_SYNC or O_DSYNC
> tends to be considerably faster.
> 3.  fsync() seems to be the slowest where there are differences.  And
> O_DSYNC seems to be the fastest where results differ.
> 4.  the safest thing to assert at this point is that
> Solaris systems ought to use the O_DSYNC method for WAL.
>
> -----------
>
> Test system(s)
>
> Athlon Linux:
> AMD Athlon XP2000, 512MB RAM, single (5400 or 7200?) RPM 20GB IDE disk,
> reiserfs filesystem (3 something I think)
> SuSE Linux kernel 2.4.21-99
>
> Mac Linux:
> I don't know the specific model.  400MHz G3, 512MB, single IDE disk,
> ext2 filesystem
> Debian GNU/Linux 2.4.16-powerpc
>
> HP Intel Linux:
> ProLiant HPDL380G3, 2 x 3GHz Xeon, 2GB RAM, SmartArray 5i 64MB cache,
> 2 x 15,000RPM 36GB U320 SCSI drives mirrored.  I'm not sure if
> writes are cached or not.  There's no battery backup.
> ext3 filesystem.
> Red Hat Enterprise Linux 3.0 kernel based on 2.4.21
>
> Dell Intel OpenBSD:
> PowerEdge ?, single 1GHz PIII, 128MB RAM, single 7200RPM 80GB IDE disk,
> ffs filesystem
> OpenBSD 3.2 GENERIC kernel
>
> SUN Ultra2:
> Ultra2, 2 x 296MHz UltraSPARC II, 2GB RAM, 2 x 10,000RPM 18GB U160
> SCSI drives mirrored with Solstice DiskSuite.  UFS filesystem.
> Solaris 8.
>
> SUN E4500 + HDS Thunder 9570v:
> E4500, 8 x 400MHz UltraSPARC II, 3GB RAM,
> HDS Thunder 9570v, 2GB mirrored battery-backed cache, RAID5 with a
> bunch of 146GB 10,000RPM FC drives.  LUN is on single 2GB FC fabric
> connection.
> Veritas filesystem (VxFS)
> Solaris 8.
>
> Test methodology:
>
> All test runs were done with CHUNKSIZE 8 * 1024, CHUNKS 2 * 1024,
> FILESIZE_MULTIPLIER 2, and SLEEP 5.  So a total of 16MB was sequentially
> written for each benchmark.
>
> Results are in microseconds.
>
> PLATFORM:       Athlon Linux
> buffered:       48220
> fsync:          74854397
> fdatasync:      75061357
> open_sync:      73869239
> open_datasync:  74748145
> Notes:  System mostly idle.  Even during tests, top showed about 95%
> idle.  Something's not right on this box.  All sync methods similarly
> horrible on this system.
>
> PLATFORM:       Mac Linux
> buffered:       58912
> fsync:          1539079
> fdatasync:      769058
> open_sync:      767094
> open_datasync:  763074
> Notes: system mostly idle.  fsync seems worst.  Otherwise, they seem
> pretty equivalent.  This is the fastest system tested.
>
> PLATFORM:       HP Intel Linux
> buffered:       33026
> fsync:          29330067
> fdatasync:      28673880
> open_sync:      8783417
> open_datasync:  8747971
> Notes: system idle.  O_SYNC and O_DSYNC methods seem to be a lot
> better on this platform than fsync & fdatasync.
>
> PLATFORM:       Dell Intel OpenBSD
> buffered:       511890
> fsync:          1769190
> fdatasync:      --------
> open_sync:      1748764
> open_datasync:  1747433
> Notes: system idle.  I couldn't locate fdatasync() on this box, so I
> couldn't test it.  All sync methods seem equivalent and are very fast --
> though they still trail the old Mac.
>
> PLATFORM:       SUN Ultra2
> buffered:       1814824
> fsync:          73954800
> fdatasync:      52594532
> open_sync:      34405585
> open_datasync:  13883758
> Notes:  system mostly idle, with occasional spikes from 1-10% utilization.
> There looks to be a substantial difference between each sync method, with
> O_DSYNC the best and fsync() the worst.  There is also a substantial
> difference between the open* and f* methods.
>
> PLATFORM:       SUN E4500 + HDS Thunder 9570v
> buffered:       233947
> fsync:          57802065
> fdatasync:      56631013
> open_sync:      2362207
> open_datasync:  1976057
> Notes:  host about 30% idle, but the array tested on was completely idle.
> Something looks seriously not right about fsync and fdatasync -- write
> cache seems to have no effect on them.  As for write cache, that
> probably explains the 2 seconds or so for the open_sync and
> open_datasync methods.
>
> --------------
>
> Thanks for reading...I look forward to feedback, and hope to be helpful in
> this effort!
>
> Mark
>

[ Attachment, skipping... ]
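
The attachment is not reproduced in this archive. For readers who want a
feel for what such a test measures, here is a minimal sketch in C. It is
not the attached program: the names time_writes and throughput_mb_per_s
are hypothetical, and syncing after every write in the fsync/fdatasync
cases is an assumption, since the message does not say how often the
original program syncs. It assumes a POSIX system that provides
fdatasync() and O_DSYNC (the OpenBSD box above did not have fdatasync).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* The five cases reported in the results above. */
enum sync_method { BUFFERED, FSYNC, FDATASYNC, OPEN_SYNC, OPEN_DATASYNC };

/*
 * Write `chunks` blocks of `chunk_size` bytes sequentially to `path`.
 * Hypothetical helper: returns elapsed microseconds, or -1 on error.
 * Assumption: fsync()/fdatasync() is issued after every write.
 */
long long time_writes(const char *path, enum sync_method m,
                      size_t chunk_size, int chunks)
{
    assert(chunk_size > 0 && chunks > 0);

    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    if (m == OPEN_SYNC)     flags |= O_SYNC;   /* sync data and metadata per write */
    if (m == OPEN_DATASYNC) flags |= O_DSYNC;  /* sync data per write */

    char *buf = malloc(chunk_size);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', chunk_size);

    int fd = open(path, flags, 0600);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    struct timespec t0, t1;
    int ok = 1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; ok && i < chunks; i++) {
        if (write(fd, buf, chunk_size) != (ssize_t) chunk_size)
            ok = 0;
        else if (m == FSYNC && fsync(fd) != 0)
            ok = 0;
        else if (m == FDATASYNC && fdatasync(fd) != 0)
            ok = 0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);
    if (!ok)
        return -1;
    return (t1.tv_sec - t0.tv_sec) * 1000000LL
         + (t1.tv_nsec - t0.tv_nsec) / 1000;
}

/* Convert a byte count and an elapsed time in microseconds to MB/s. */
double throughput_mb_per_s(long long bytes, long long us)
{
    if (us <= 0)
        return 0.0;
    return (bytes / (1024.0 * 1024.0)) / (us / 1e6);
}
```

Calling time_writes(path, OPEN_DATASYNC, 8 * 1024, 2 * 1024) would write
the same 16MB as the runs quoted above. Passing that byte count and a
reported time to throughput_mb_per_s puts the numbers in more familiar
units: the Mac's 763074 microseconds for open_datasync works out to
roughly 21 MB/s.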


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073