Re: [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+

From Greg Smith
Subject Re: [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+
Date
Msg-id 4CD47CE1.20800@2ndquadrant.com
In reply to Re: [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Tom Lane wrote:
> If open_dsync is so bad for performance on Linux, maybe it's bad
> everywhere?  Should we be rethinking the default preference order?
>   

And I've seen the expected sync write performance gain over fdatasync on 
a system with a battery-backed cache running VxFS on Linux, because a 
working open_[d]sync means O_DIRECT writes that bypass the OS cache and 
therefore reduce cache pollution from WAL writes.  This doesn't work 
by default on Solaris because they have a special system call you have 
to execute for direct output, but if you trick the OS into doing that 
via mount options you can observe it there too.  The last serious tests 
of this area I saw on that platform were from Jignesh, and they 
certainly didn't show a significant performance regression running in 
sync mode.  I vaguely recall seeing a set once that showed a minor loss 
compared to fdatasync, but it was too close to make any definitive 
statement about reordering.
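
The two write paths being compared can be sketched roughly as follows. 
This is a minimal illustration in Python, not PostgreSQL's actual code: 
the function names are hypothetical, the 8 kB block matches the default 
WAL block size, and O_DIRECT is left out because it needs aligned 
buffers that plain Python bytes do not guarantee.

```python
import os

BLOCK = b"\0" * 8192  # one 8 kB block, the default WAL block size


def write_then_fdatasync(path, nblocks):
    """fdatasync style: ordinary buffered write, then an explicit flush call."""
    # Fall back to fsync where fdatasync is missing (an assumption made
    # for portability only; it is not the same call).
    fdatasync = getattr(os, "fdatasync", os.fsync)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
    try:
        for _ in range(nblocks):
            os.write(fd, BLOCK)
            fdatasync(fd)
    finally:
        os.close(fd)


def write_with_o_dsync(path, nblocks):
    """open_datasync style: the open flag makes each write() synchronous."""
    # Fall back to O_SYNC where O_DSYNC is not defined (again an assumption).
    dsync = getattr(os, "O_DSYNC", os.O_SYNC)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | dsync, 0o600)
    try:
        for _ in range(nblocks):
            os.write(fd, BLOCK)  # no separate flush call is issued
    finally:
        os.close(fd)
```

Both functions leave the same bytes in the file; what differs is which 
system call is expected to wait for the data to reach stable storage.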

I haven't seen any report yet of a serious performance regression in the 
new Linux case that was written by someone who understands fully how 
fsync and drive cache flushing are supposed to interact.  It's been 
obvious for a year now that the reports from Phoronix about this had no 
idea what they were actually testing.  I didn't see anything from 
Marti's report that definitively answers whether this is anything other 
than Linux finally doing the right thing to flush drive caches out when 
sync writes happen.  There may be a performance regression here related 
to WAL data going out in smaller chunks than it used to, but in all the 
reports I've seen that hasn't been isolated well enough to consider 
making any changes yet--to tell if it's a performance loss or a 
reliability gain we're seeing.

I'd like to see some output from the 9.0 test_fsync on one of these 
RHEL6 systems, one without a battery-backed write cache, as a 
first step here.  That should start to shed some light on what's 
happening.  I just bumped up the priority on the pending upgrade of my 
spare laptop to the RHEL6 beta I had been trying to find time for, so I 
can investigate this further myself.
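
A rough stand-in for that kind of measurement is sketched below. It is 
not test_fsync itself: the function name, the scratch file name and the 
one-second default duration are all assumptions made for illustration. 
It rewrites a single 8 kB block in place and reports how many 
synchronous writes per second complete with each method.

```python
import os
import time

BLOCK = b"\0" * 8192  # one 8 kB block


def sync_writes_per_second(directory, use_o_dsync, seconds=1.0):
    """Rewrite one block repeatedly; return completed sync writes per second."""
    flags = os.O_WRONLY | os.O_CREAT
    if use_o_dsync:
        # Fall back to O_SYNC where O_DSYNC is not defined (an assumption).
        flags |= getattr(os, "O_DSYNC", os.O_SYNC)
    # Fall back to fsync where fdatasync is missing (also an assumption).
    fdatasync = getattr(os, "fdatasync", os.fsync)
    path = os.path.join(directory, "sync_test.tmp")  # hypothetical scratch file
    fd = os.open(path, flags, 0o600)
    count = 0
    start = time.monotonic()
    try:
        while time.monotonic() - start < seconds:
            os.lseek(fd, 0, os.SEEK_SET)  # always rewrite the same block
            os.write(fd, BLOCK)
            if not use_o_dsync:
                fdatasync(fd)
            count += 1
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / elapsed
```

Calling sync_writes_per_second(some_dir, False) and then 
sync_writes_per_second(some_dir, True), with some_dir on the filesystem 
that holds the WAL, gives the two rates to compare.  As a sanity check, 
a single 7200 RPM disk completes 120 rotations per second, so sustained 
rates far above that on a drive without a battery-backed cache suggest 
the writes are not actually reaching the platter.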

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD



