Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options

From mudfoot@rawbw.com
Subject Determine optimal fdatasync/fsync, O_SYNC/O_DSYNC options
Date
Msg-id 1095055866.414539fadb90d@webmail.rawbw.com
List pgsql-performance
Hi, I'd like to help with the topic in the Subject: line.  It seems to be a
TODO item.  I've reviewed some threads discussing the matter, so I hope I've
acquired enough history concerning it.  I've taken an initial swipe at
figuring out how to optimize sync'ing methods.  It's based largely on
recommendations I've read on previous threads about fsync/O_SYNC and so on.
After reviewing, if anybody has recommendations on how to proceed then I'd
love to hear them.

Attached is a little program that basically does a bunch of sequential writes
to a file.  All of the sync'ing methods supported by PostgreSQL WAL can be
used.  Results are printed in microseconds.  Size and quantity of writes are
configurable.  The documentation is in the code (how to configure, build, run,
etc.).  I realize that this program doesn't reflect all of the possible
activities of a production database system, but I hope it's a step in the
right direction for this task.  I've used it to see differences in behavior
between the various sync'ing methods on various platforms.
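
For readers without the attachment, here is a minimal sketch of the kind of timing loop described above.  It is not the attached program: the names timed_writes and sync_method are hypothetical, and only three of the five methods are shown (fdatasync and O_DSYNC would follow the same pattern as fsync and O_SYNC).

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names; not taken from the attached program. */
enum sync_method { BUFFERED, USE_FSYNC, USE_OPEN_SYNC };

/* Write `chunks` blocks of `chunk_size` bytes sequentially to `path`,
 * syncing according to `method`.  Returns the elapsed wall-clock time
 * in microseconds, or -1 on error. */
long long timed_writes(const char *path, enum sync_method method,
                       size_t chunk_size, size_t chunks)
{
    assert(chunk_size > 0);

    int flags = O_WRONLY | O_CREAT | O_TRUNC;
    if (method == USE_OPEN_SYNC)
        flags |= O_SYNC;            /* every write() returns only after sync */

    char *buf = malloc(chunk_size);
    if (buf == NULL)
        return -1;
    memset(buf, 'x', chunk_size);

    int fd = open(path, flags, 0600);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    struct timeval start, stop;
    int failed = 0;

    gettimeofday(&start, NULL);
    for (size_t i = 0; i < chunks; i++) {
        if (write(fd, buf, chunk_size) != (ssize_t)chunk_size) {
            failed = 1;
            break;
        }
        /* Explicit sync after each write for the fsync method. */
        if (method == USE_FSYNC && fsync(fd) != 0) {
            failed = 1;
            break;
        }
    }
    gettimeofday(&stop, NULL);

    close(fd);
    free(buf);
    if (failed)
        return -1;

    return (stop.tv_sec - start.tv_sec) * 1000000LL
         + (stop.tv_usec - start.tv_usec);
}
```

A call such as timed_writes("testfile", USE_FSYNC, 8 * 1024, 2 * 1024) would correspond to one 16MB run.  Syncing after every single write is an assumption of this sketch; the actual program may batch writes differently.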

Here's what I've found running the benchmark on some systems to which
I have access.  The differences in behavior between platforms are quite vast.

Summary first...

<halfjoke>
PostgreSQL should be run on an old Apple Macintosh attached to
its own Hitachi disk array with 2GB cache or so.  Use any sync method
except for fsync().
</halfjoke>

Anyway, there is *a lot* of variance in file syncing behavior across
different hardware and O/S platforms.  It's probably not safe
to conclude much.  That said, here are some findings so far based on
tests I've run:

1.  under no circumstances do fsync() or fdatasync() seem to perform
better than opening files with O_SYNC or O_DSYNC
2.  where there are differences, opening files with O_SYNC or O_DSYNC
tends to be considerably faster.
3.  fsync() seems to be the slowest where there are differences.  And
O_DSYNC seems to be the fastest where results differ.
4.  the safest thing to assert at this point is that
Solaris systems ought to use the O_DSYNC method for WAL.

-----------

Test system(s)

Athlon Linux:
AMD Athlon XP2000, 512MB RAM, single (5400 or 7200?) RPM 20GB IDE disk,
reiserfs filesystem (version 3.x, I think)
SuSE Linux kernel 2.4.21-99

Mac Linux:
I don't know the specific model.  400MHz G3, 512MB, single IDE disk,
ext2 filesystem
Debian GNU/Linux 2.4.16-powerpc

HP Intel Linux:
ProLiant DL380 G3, 2 x 3GHz Xeon, 2GB RAM, SmartArray 5i 64MB cache,
2 x 15,000RPM 36GB U320 SCSI drives mirrored.  I'm not sure if
writes are cached or not.  There's no battery backup.
ext3 filesystem.
Redhat Enterprise Linux 3.0 kernel based on 2.4.21

Dell Intel OpenBSD:
Poweredge ?, single 1GHz PIII, 128MB RAM, single 7200RPM 80GB IDE disk,
ffs filesystem
OpenBSD 3.2 GENERIC kernel

SUN Ultra2:
Ultra2, 2 x 296MHz UltraSPARC II, 2GB RAM, 2 x 10,000RPM 18GB U160
SCSI drives mirrored with Solstice DiskSuite.  UFS filesystem.
Solaris 8.

SUN E4500 + HDS Thunder 9570v:
E4500, 8 x 400MHz UltraSPARC II, 3GB RAM,
HDS Thunder 9570v, 2GB mirrored battery-backed cache, RAID5 with a
bunch of 146GB 10,000RPM FC drives.  LUN is on single 2GB FC fabric
connection.
Veritas filesystem (VxFS)
Solaris 8.

Test methodology:

All test runs were done with CHUNKSIZE 8 * 1024, CHUNKS 2 * 1024,
FILESIZE_MULTIPLIER 2, and SLEEP 5.  So a total of 16MB was sequentially
written for each benchmark.

Results are in microseconds.
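
To relate the totals below to individual writes, the arithmetic can be sketched as follows.  The helper names total_bytes and usec_per_write are hypothetical and exist only for illustration; the mean they produce assumes the elapsed time is spread evenly over all writes.

```c
#include <assert.h>
#include <stddef.h>

/* Total bytes written in one run: CHUNKSIZE * CHUNKS. */
size_t total_bytes(size_t chunk_size, size_t chunks)
{
    return chunk_size * chunks;
}

/* Mean microseconds per write, given the total elapsed microseconds. */
double usec_per_write(long long total_usec, size_t chunks)
{
    assert(chunks > 0);
    return (double)total_usec / (double)chunks;
}
```

With CHUNKSIZE 8 * 1024 and CHUNKS 2 * 1024, total_bytes gives 16MB.  As an example, the HP Intel Linux open_sync total of 8783417 microseconds over 2048 writes works out to roughly 4.3 milliseconds per 8KB write.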

PLATFORM:       Athlon Linux
buffered:       48220
fsync:          74854397
fdatasync:      75061357
open_sync:      73869239
open_datasync:  74748145
Notes:  System mostly idle.  Even during tests, top showed about 95%
idle.  Something's not right on this box.  All sync methods similarly
horrible on this system.

PLATFORM:       Mac Linux
buffered:       58912
fsync:          1539079
fdatasync:      769058
open_sync:      767094
open_datasync:  763074
Notes: system mostly idle.  fsync seems worst.  Otherwise, they seem
pretty equivalent.  This is the fastest system tested.

PLATFORM:       HP Intel Linux
buffered:       33026
fsync:          29330067
fdatasync:      28673880
open_sync:      8783417
open_datasync:  8747971
Notes: system idle.  O_SYNC and O_DSYNC methods seem to be a lot
better on this platform than fsync & fdatasync.

PLATFORM:       Dell Intel OpenBSD
buffered:       511890
fsync:          1769190
fdatasync:      --------
open_sync:      1748764
open_datasync:  1747433
Notes: system idle.  I couldn't locate fdatasync() on this box, so I
couldn't test it.  All sync methods seem equivalent and are very fast --
though they still trail the old Mac.
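
One way a benchmark could cope with a platform that lacks fdatasync() is a compile-time fallback.  The sketch below assumes the platform reports the POSIX Synchronized I/O option through _POSIX_SYNCHRONIZED_IO in <unistd.h>; the wrapper name flush_data is hypothetical, and this is not how the attached program or PostgreSQL necessarily handles it.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <unistd.h>

/* Hypothetical wrapper: flush file data, preferring fdatasync() when the
 * platform advertises the POSIX Synchronized I/O option. */
int flush_data(int fd)
{
#if defined(_POSIX_SYNCHRONIZED_IO) && _POSIX_SYNCHRONIZED_IO > 0
    return fdatasync(fd);   /* data only; may skip some metadata updates */
#else
    return fsync(fd);       /* fallback: flushes data and metadata */
#endif
}
```

On a system that takes the fallback branch, the fsync and fdatasync rows of a benchmark would measure the same call, so the fdatasync result should still be reported as unavailable.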

PLATFORM:       SUN Ultra2
buffered:       1814824
fsync:          73954800
fdatasync:      52594532
open_sync:      34405585
open_datasync:  13883758
Notes:  system mostly idle, with occasional spikes from 1-10% utilization.
There appears to be a substantial difference between each sync method, with
O_DSYNC the best and fsync() the worst.  There is substantial
difference between the open* and f* methods.

PLATFORM:       SUN E4500 + HDS Thunder 9570v
buffered:       233947
fsync:          57802065
fdatasync:      56631013
open_sync:      2362207
open_datasync:  1976057
Notes:  host about 30% idle, but the array tested on was completely idle.
Something looks seriously not right about fsync and fdatasync -- write
cache seems to have no effect on them.  As for write cache, that
probably explains the 2 seconds or so for the open_sync and
open_datasync methods.

--------------

Thanks for reading...I look forward to feedback, and hope to be helpful in
this effort!

Mark

