At 03:26 PM 7/8/2002, Tom Lane wrote:
>Doug Fields <dfields-pg-general@pexicom.com> writes:
> > (gdb) where
> > #0 0x4028299d in fdatasync () from /lib/libc.so.6
> > #1 0x081049ae in pg_fdatasync (fd=53) at fd.c:233
> > #2 0x08088276 in issue_xlog_fsync () at xlog.c:3367
>
>Hmm. That leads to a different line of thought about where the problem
>is. Does the problem go away if you turn off fsync, or select a
>wal_sync_method other than fdatasync? (See postgresql.conf; not sure
>if you need a postmaster restart or just SIGHUP to change these.)
Interestingly enough, I was just testing that (the fsync off part).
FSYNC=OFF results:
I just turned fsync=off and completely restarted the postmaster (always
figure that's the safest thing to do).
What a difference. My checkpoints are just as long - they still take quite
a bit of time to process and do a lot of "blocks out" in vmstat - but
queries no longer block waiting for them to finish.
Now, of course, the question becomes which WAL_SYNC_METHOD works best
with FSYNC=ON. As you note, fdatasync is the default. Let me
play some more...
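For reference, a sketch of the two postgresql.conf lines being varied in
these tests. The lowercase spellings and the on/off boolean form are
assumptions based on the stock sample file, and which sync methods are
accepted depends on how the backend was built:

```
# postgresql.conf (sketch; change one setting at a time, then restart)
fsync = on                     # off for the FSYNC=OFF test above
wal_sync_method = fdatasync    # other candidates: fsync, open_sync, open_datasync
```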
FSYNC=ON results:
WAL_SYNC_METHOD settings:
FDATASYNC - problem exists (the original problem)
OPEN_SYNC - problem persists (same problem; things pile up behind a checkpoint)
OPEN_DATASYNC - refuses to start, with runtime error:
    FATAL 1: invalid value for option 'WAL_SYNC_METHOD': 'OPEN_DATASYNC'
FSYNCA - refuses to start, with runtime error:
    FATAL 1: invalid value for option 'WAL_SYNC_METHOD': 'FSYNCA'
These last two are obviously not compiled into the backend by default on
Debian. Of note, the fdatasync() man page says: "Currently (Linux 2.2)
fdatasync is equivalent to fsync."
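
One way to check whether the man page note holds on a given box is to time
the two calls directly, outside PostgreSQL. The following is only a rough
sketch, not how the backend does it: time_sync is a hypothetical helper, the
8192-byte block is an assumption chosen to match PostgreSQL's default block
size, and the temp file should live on the same filesystem as the WAL for
the numbers to mean anything.

```python
import os
import tempfile
import time


def time_sync(sync_fn, n_writes=50, block_size=8192, directory=None):
    """Append n_writes blocks to a temp file, calling sync_fn(fd) after each.

    Returns the elapsed wall-clock time in seconds. time_sync is a
    hypothetical helper for illustration, not part of PostgreSQL.
    """
    block = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(n_writes):
            os.write(fd, block)
            sync_fn(fd)  # force the write to disk before continuing
        return time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    # os.fdatasync is not available on every platform, so look it up safely.
    for name in ("fsync", "fdatasync"):
        fn = getattr(os, name, None)
        if fn is None:
            print(name, "not available on this platform")
            continue
        print(name, "took", round(time_sync(fn), 4), "seconds")
```

If the two timings come out about the same, that would be consistent with
fdatasync simply mapping to fsync on this kernel. Pass directory= to point
the test at the disk that holds pg_xlog; a temp directory on tmpfs will make
both calls look free.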
Additional thoughts?
Thanks,
Doug