Thread: open_sync fails

open_sync fails

From
Rick Weber
Date:
Basic system setup:

Linux 2.4 kernel (heavily modified)
Dual core Athlon Opteron
4GB ECC RAM
SW RAID 10 configuration with eight 750 GB disks (using only 500 GB of
each disk) connected via an LSISAS1068-based card


While working on tuning my database, I was experimenting with changing
the wal_sync_method to try to find the optimal value.  The really odd
thing is when I switch to open_sync (O_SYNC), Postgres immediately fails
and gives me an error message of:

2008-07-22 11:22:37 UTC 19411 akamai [local] PANIC:  could not write to
log file 101, segment 40 at offset 12558336, length 2097152: No space left on device

Even running the test_fsync tool on this system gives me an error
message indicating O_SYNC isn't supported, and it promptly bails.
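
For readers who want to reproduce this kind of check outside PostgreSQL, here is a minimal sketch in Python. It is not the test_fsync tool and does not reproduce its output; the function name `probe_o_sync` and the scratch file name are hypothetical. It simply opens a file with O_SYNC, writes one block, and reports the errno name if the kernel refuses.

```python
import errno
import os


def probe_o_sync(directory, block_size=8192):
    """Try one O_SYNC write in `directory`; return (ok, message).

    Hypothetical helper for illustration only, not part of PostgreSQL.
    """
    if not hasattr(os, "O_SYNC"):
        return False, "O_SYNC is not defined on this platform"

    path = os.path.join(directory, "o_sync_probe.tmp")  # scratch file name is an assumption
    flags = os.O_WRONLY | os.O_CREAT | os.O_SYNC
    try:
        fd = os.open(path, flags, 0o600)
    except OSError as exc:
        return False, "open failed: " + errno.errorcode.get(exc.errno, str(exc.errno))

    try:
        # With O_SYNC, write() returns only after the data has reached stable storage.
        written = os.write(fd, b"\0" * block_size)
        if written != block_size:
            return False, "short write: %d of %d bytes" % (written, block_size)
        return True, "ok"
    except OSError as exc:
        return False, "write failed: " + errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling `probe_o_sync` on the directory that holds the WAL files would show whether the failure is specific to PostgreSQL or comes from the kernel or filesystem underneath it.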

So I'm wondering what the heck is going on.  I've found a bunch of posts
that indicate O_SYNC may provide some extra throughput, but nothing
indicating that O_SYNC doesn't work.

Can anybody provide me any pointers on this?

Thanks

--Rick




Re: open_sync fails

From
Tom Lane
Date:
Rick Weber <riweber@akamai.com> writes:
> Basic system setup:
> Linux 2.4 kernel (heavily modified)

"Heavily modified" meaning what exactly?

Given that no one else has reported such a thing, and the obvious
bogosity of the errno code, I'd certainly first cast suspicion on the
kernel.

            regards, tom lane

Re: open_sync fails

From
Alvaro Herrera
Date:
Rick Weber wrote:

> While working on tuning my database, I was experimenting with changing
> the wal_sync_method to try to find the optimal value.  The really odd
> thing is when I switch to open_sync (O_SYNC), Postgres immediately fails
> and gives me an error message of:
>
> 2008-07-22 11:22:37 UTC 19411 akamai [local] PANIC:  could not write to
> log file 101, segment 40 at offset 12558336, length 2097152: No space left on device

Sounds like a kernel bug to me, particularly because the segment is most
likely already 16 MB in length; we're only rewriting the contents, not
enlarging it.  Perhaps the kernel wanted to report a problem and chose
the wrong errno.
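
The arithmetic behind this can be checked with a small sketch, assuming the default 16 MB WAL segment size (the thread says "most likely", so this is an assumption). The offset and length come from the PANIC message; the function name `rewrite_in_place` is hypothetical. The write ends at byte 14655488, inside the 16777216-byte segment, so an in-place rewrite should not need any new space.

```python
import os

SEGMENT_SIZE = 16 * 1024 * 1024  # assumed default WAL segment size (16 MB)
OFFSET = 12558336                # offset from the PANIC message
LENGTH = 2097152                 # length from the PANIC message


def rewrite_in_place(path):
    """Preallocate a segment-sized file, then overwrite part of it with O_SYNC.

    Returns (size_before, size_after). Hypothetical helper for illustration.
    """
    # Fill the file with real zero bytes so every block is already allocated.
    with open(path, "wb") as f:
        f.write(b"\0" * SEGMENT_SIZE)
        f.flush()
        os.fsync(f.fileno())
    size_before = os.path.getsize(path)

    # Fall back to a plain open if the platform does not define O_SYNC.
    flags = os.O_WRONLY | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags)
    try:
        data = b"x" * LENGTH
        done = 0
        while done < LENGTH:
            # pwrite may write fewer bytes than requested, so loop until finished.
            done += os.pwrite(fd, data[done:], OFFSET + done)
    finally:
        os.close(fd)

    return size_before, os.path.getsize(path)


# The write range lies entirely inside the existing segment.
assert OFFSET + LENGTH <= SEGMENT_SIZE
```

On a healthy kernel the two returned sizes are equal, which is why "No space left on device" looks like the wrong error for this write.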

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: open_sync fails

From
Rick Weber
Date:
Definitely believable.  It gives me an internal avenue to chase down.

Thanks

--Rick



Alvaro Herrera wrote:
> Rick Weber wrote:
>
>> While working on tuning my database, I was experimenting with changing
>> the wal_sync_method to try to find the optimal value.  The really odd
>> thing is when I switch to open_sync (O_SYNC), Postgres immediately fails
>> and gives me an error message of:
>>
>> 2008-07-22 11:22:37 UTC 19411 akamai [local] PANIC:  could not write to
>> log file 101, segment 40 at offset 12558336, length 2097152: No space left on device
>
> Sounds like a kernel bug to me, particularly because the segment is most
> likely already 16 MB in length; we're only rewriting the contents, not
> enlarging it.  Perhaps the kernel wanted to report a problem and chose
> the wrong errno.