Thread: pg_test_fsync: "Invalid argument" in the middle of a test

pg_test_fsync: "Invalid argument" in the middle of a test

From
Marti Raudsepp
Date:
Hi list,

I'm in the middle of setting up a new machine and there's something
odd in the pg_test_fsync output. Does anyone have ideas why the open_sync
tests would fail in the middle?

         4 *  4kB open_sync writes         89.322 ops/sec   11195 usecs/op
         8 *  2kB open_sync writes      write failed: Invalid argument

Happens every time I run it. strace reveals that the first 2kB write fails:

open("./pg_test_fsync.out", O_RDWR|O_SYNC|O_DIRECT) = 5
alarm(5)                                = 0
write(5, "[...]", 2048) = -1 EINVAL (Invalid argument)

This is on Ubuntu 13.10 (kernel 3.11) with XFS (mounted with noatime,
no other customizations). Using the LSI SAS 2008 RAID controller
branded as Fujitsu D2607 (latest firmware) and megaraid_sas driver.
There are no warnings or anything in dmesg or other logs. This does
not occur on other Ubuntu 13.10 installations which have different
storage stacks.

The timings also look too fast, given that the storage is four 15k
drives in RAID10 with no battery and no cache. The write failure does not
occur on ext4 in the same setup, but the timings are still too fast.

Regards,
Marti

----
Full pg_test_fsync output:

5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                    1575.148 ops/sec     635 usecs/op
        fdatasync                        1460.741 ops/sec     685 usecs/op
        fsync                            1362.300 ops/sec     734 usecs/op
        fsync_writethrough                            n/a
        open_sync                        1528.402 ops/sec     654 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
        open_datasync                     106.022 ops/sec    9432 usecs/op
        fdatasync                        1300.160 ops/sec     769 usecs/op
        fsync                            1353.178 ops/sec     739 usecs/op
        fsync_writethrough                            n/a
        open_sync                         108.378 ops/sec    9227 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
         1 * 16kB open_sync write        1405.532 ops/sec     711 usecs/op
         2 *  8kB open_sync writes        108.439 ops/sec    9222 usecs/op
         4 *  4kB open_sync writes         89.322 ops/sec   11195 usecs/op
         8 *  2kB open_sync writes      write failed: Invalid argument


Re: pg_test_fsync: "Invalid argument" in the middle of a test

From
Marti Raudsepp
Date:
On Tue, Feb 11, 2014 at 12:20 AM, Marti Raudsepp <marti@juffo.org> wrote:
> This is on Ubuntu 13.10 (kernel 3.11) with XFS (mounted with noatime,
> no other customizations).

I managed to track this down; XFS doesn't allow using O_DIRECT for
writes smaller than the filesystem's sector size (probably same on
other FSes). The XFS filesystem created by the Ubuntu installer uses
4kB sectors, for some weird reason:

# xfs_info /dev/sda1
meta-data=/dev/disk/by-uuid/987c0579-bd67-4f80-bbc6-50f975ee4c1d isize=256    agcount=16, agsize=4341104 blks
         =                       sectsz=4096  attr=2
[...]

Yet the storage stack knows they're 512-byte sectors:
# cat /sys/block/sda/queue/logical_block_size
512
# cat /sys/block/sda/queue/physical_block_size
512

A freshly created filesystem also properly uses 512B sectors:
# mkfs.xfs /dev/sda5
meta-data=/dev/sda5              isize=256    agcount=4, agsize=489856 blks
         =                       sectsz=512   attr=2, projid32bit=0
[...]

I will be submitting a patch for pg_test_fsync so it can survive write
failures in this situation.

----

I could still use some help with this part... Does anyone have
experience in setting up megaraid_sas for reliable fsyncs?

>         open_datasync                    1575.148 ops/sec     635 usecs/op
>         fdatasync                        1460.741 ops/sec     685 usecs/op
>         fsync                            1362.300 ops/sec     734 usecs/op
>         fsync_writethrough                            n/a
>         open_sync                        1528.402 ops/sec     654 usecs/op

Regards,
Marti


Re: pg_test_fsync: "Invalid argument" in the middle of a test

From
Bruce Momjian
Date:
On Tue, Feb 11, 2014 at 01:28:01AM +0200, Marti Raudsepp wrote:
> On Tue, Feb 11, 2014 at 12:20 AM, Marti Raudsepp <marti@juffo.org> wrote:
> > This is on Ubuntu 13.10 (kernel 3.11) with XFS (mounted with noatime,
> > no other customizations).
>
> I managed to track this down; XFS doesn't allow using O_DIRECT for
> writes smaller than the filesystem's sector size (probably same on
> other FSes). The XFS filesystem created by the Ubuntu installer uses
> 4kB sectors, for some weird reason:

I have added the attached, applied C comment about Direct I/O write
failures and mismatched block sizes.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + Everyone has their own god. +


Re: pg_test_fsync: "Invalid argument" in the middle of a test

From
Alvaro Herrera
Date:
Bruce Momjian wrote:
> On Tue, Feb 11, 2014 at 01:28:01AM +0200, Marti Raudsepp wrote:
> > On Tue, Feb 11, 2014 at 12:20 AM, Marti Raudsepp <marti@juffo.org> wrote:
> > > This is on Ubuntu 13.10 (kernel 3.11) with XFS (mounted with noatime,
> > > no other customizations).
> >
> > I managed to track this down; XFS doesn't allow using O_DIRECT for
> > writes smaller than the filesystem's sector size (probably same on
> > other FSes). The XFS filesystem created by the Ubuntu installer uses
> > 4kB sectors, for some weird reason:
>
> I have added the attached, applied C comment about Direct I/O write
> failures and mismatched block sizes.

Would it be more useful to report the test as failed and continue with
other tests?

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: pg_test_fsync: "Invalid argument" in the middle of a test

From
Marti Raudsepp
Date:
On Wed, Feb 12, 2014 at 10:46 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Would it be more useful to report the test as failed and continue with
> other tests?

Yeah, I think so, I'm planning to code this in the week. It's harder
than it sounds because the alarm() timer is still ticking. On POSIX it
can be cancelled with alarm(0), but the Windows code spawns a separate
thread for timing.

It seems that TerminateThread [1] could be used on Windows. It has
many caveats, but should be safe for our purposes. Or we could only
implement error handling on POSIX and call exit(1) on Windows.

[1] http://msdn.microsoft.com/en-us/library/windows/desktop/ms686717%28v=vs.85%29.aspx

Regards,
Marti