Re: performance on new linux box

From: Scott Carey
Subject: Re: performance on new linux box
Date:
Msg-id: 1E1BB935-72EF-4312-81B6-5E033D66BDA1@richrelevance.com
In reply to: Re: performance on new linux box (Scott Marlowe <scott.marlowe@gmail.com>)
List: pgsql-performance
On Jul 15, 2010, at 6:22 PM, Scott Marlowe wrote:

> On Thu, Jul 15, 2010 at 10:30 AM, Scott Carey <scott@richrelevance.com> wrote:
>>
>> On Jul 14, 2010, at 7:50 PM, Ben Chobot wrote:
>>
>>> On Jul 14, 2010, at 6:57 PM, Scott Carey wrote:
>>>
>>>> But none of this explains why a 4-disk raid 10 is slower than a 1 disk system.  If there is no write-back caching
>>>> on the RAID, it should still be similar to the one disk setup.
>>>
>>> Many raid controllers are smart enough to always turn off write caching on the drives, and also disable the feature
>>> on their own buffer without a BBU. Add a BBU, and the cache on the controller starts getting used, but *not* the
>>> cache on the drives.
>>
>> This does not make sense.
>
> Basically, you can have cheap, fast and dangerous (a drive with write
> cache enabled, which responds positively to fsync even when it hasn't
> actually fsynced the data).  You can have cheap, slow and safe with a
> drive that has a cache, but since it'll be fsyncing all the time the
> write cache won't actually get used.  Or fast, expensive, and safe,
> which is what a BBU RAID card gets by saying the data is fsynced when
> it's actually just in cache, but a safe cache that won't get lost on
> power down.
>
> I don't find it that complicated.

It doesn't make sense that a raid 10 will be slower than a 1-disk setup unless the former respects fsync() and the
latter does not.  Individual drive write cache does not explain the situation.  That is what does not make sense.
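
For what it's worth, this is easy to measure directly.  Below is a rough sketch of the kind of test I mean (my own,
file name and loop count arbitrary): time a run of small write()+fsync() pairs on the volume in question.  A 7200 RPM
disk that honestly flushes can only complete on the order of 100-120 syncs per second; numbers far above that mean
something in the path is acknowledging the flush from cache (either a battery-backed write-back cache, which is fine,
or a volatile cache, which is not).

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    const int loops = 1000;          /* arbitrary */
    char buf[8192];
    struct timeval start, end;
    int fd, i;

    memset(buf, 'x', sizeof(buf));
    fd = open("fsync_test.dat", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) { perror("open"); return 1; }

    gettimeofday(&start, NULL);
    for (i = 0; i < loops; i++) {
        /* rewrite the same 8KB block and force it out */
        if (pwrite(fd, buf, sizeof(buf), 0) != (ssize_t) sizeof(buf)) {
            perror("pwrite"); return 1;
        }
        if (fsync(fd) != 0) { perror("fsync"); return 1; }
    }
    gettimeofday(&end, NULL);

    printf("%d fsyncs in %.2f sec\n", loops,
           (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6);

    close(fd);
    unlink("fsync_test.dat");
    return 0;
}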

When in _write-through_ mode, there is no reason to turn off the drive's write cache unless the drive does not properly
respect its cache-flush command, or the RAID card is too dumb to issue cache-flush commands.  The RAID card simply has
to issue its writes, then issue the flush commands, then return to the OS when those complete.  With drive write caches
on, this is perfectly safe.  The only way it is unsafe is if the drive lies and returns from a cache flush before the
data from its cache is actually flushed.

Some SSD's on the market currently lie.  A handful of the thousands of hard drive models in the server, desktop, and
laptop space in the last decade did not respect the cache flush command properly, and none of them in the SAS/SCSI or
'enterprise SATA' space lie to my knowledge.  Information on this topic has come across this list several times.

The explanation for why one setup respects fsync() and another does not almost always lies in the FS + OS combination.
HFS+ on OSX does not respect fsync.  ext3 until recently only did fdatasync() when you told it to fsync() (which is
fine for postgres' transaction log anyway).
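
To make the ext3 point concrete, here is a hypothetical helper (my sketch, not anything from postgres): for a log
segment that is already preallocated to its full size, fdatasync() is all that's needed, because only the data blocks
have to reach stable storage.

#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, just to show the distinction: fdatasync() makes
 * the written data durable, while fsync() would also force out inode
 * metadata (mtime and so on) on every flush, which a preallocated
 * transaction log does not need.
 */
int flush_log_block(int fd, const char *buf, size_t len, off_t offset)
{
    if (pwrite(fd, buf, len, offset) != (ssize_t) len)
        return -1;
    return fdatasync(fd);
}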

A raid card, especially with any SAS/SCSI drives, has no reason to turn off the drive's write cache unless it _wants_
to return to the OS before the data is on the drive.  That condition occurs in write-back cache mode when the RAID
card's cache is safe via a battery or some other mechanism.  In that case, it should turn off the drive's write cache
so that it can be sure that data is on disk when the power fails, without having to call the cache-flush command on
every write.  That way, it can remove data from its RAM as soon as the drive returns from the write.
In write-through mode it should turn the caches back on and rely on the flush command to pass through direct writes,
cache-flush demands, and barrier requests.  It could optionally turn the caches off, but that won't improve data safety
unless the drive cannot faithfully flush its cache.
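
As an illustration of why that flush/barrier passthrough matters (generic sketch, not actual postgres code): any
write-ahead scheme depends on the first flush really hitting stable storage before the commit marker goes out.

#include <sys/types.h>
#include <unistd.h>

/*
 * Generic write-ahead pattern (illustration only).  Correctness depends
 * on the first fdatasync() actually reaching stable storage before the
 * commit marker is written; if any layer acknowledges the flush while
 * the data still sits in a volatile cache, the commit record can
 * survive a power loss that the data it covers does not.
 */
int log_commit(int fd, const char *payload, size_t plen,
               const char *marker, size_t mlen)
{
    if (write(fd, payload, plen) != (ssize_t) plen)
        return -1;
    if (fdatasync(fd) != 0)
        return -1;
    if (write(fd, marker, mlen) != (ssize_t) mlen)
        return -1;
    return fdatasync(fd);
}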


