Re: [PERFORM] Arguments Pro/Contra Software Raid

From: Scott Marlowe
Subject: Re: [PERFORM] Arguments Pro/Contra Software Raid
Msg-id: 1147189769.9755.13.camel@state.g2switchworks.com
In reply to: Arguments Pro/Contra Software Raid (Hannes Dorbath <light@theendofthetunnel.de>)
List: pgsql-general

On Tue, 2006-05-09 at 04:16, Hannes Dorbath wrote:
> Hi,
>
> I've just had some discussion with colleagues regarding the usage of
> hardware or software raid 1/10 for our linux based database servers.
>
> I myself can't see much reason to spend $500 on high end controller
> cards for a simple Raid 1.
>
> Any arguments pro or contra would be desirable.
>
> From my experience and what I've read here:
>
> + Hardware Raids might be a bit easier to manage, if you never spend a
> few hours to learn Software Raid Tools.

Depends.  Some hardware RAID cards aren't that easy to manage, and
sometimes, they won't let you do some things that software will.  I've
run into situations where a RAID controller kicked out two perfectly
good drives from a RAID 5 and would NOT accept them back.  All data
lost, and it would not be convinced to restart without formatting the
drives first.  arg!  With Linux kernel sw RAID, I've had a similar
problem pop up, and was able to make the RAID array take the drives
back.  Of course, this means that software RAID relies on you not being
stupid, because it will let you do things that are dangerous / stupid.

I found the raidtools on Linux to be well thought out and fairly easy to
use.

> + There are situations in which Software Raids are faster, as CPU power
> has advanced dramatically in the last years and even high end controller
> cards cannot keep up with that.

The only times I've found software RAID to be faster were against the
hybrid hardware / software type RAID cards (i.e. the cheapies) or OLDER
RAID cards that have a 33 MHz coprocessor or so.  Most modern RAID
controllers have coprocessors running at several hundred MHz or more,
and can compute parity and manage the array as fast as the attached I/O
can handle it.
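
For anyone wondering what "compute parity" amounts to: in RAID 5 the
parity block of a stripe is the bytewise XOR of its data blocks, and a
single lost block is rebuilt by XORing the survivors with the parity.
The sketch below only illustrates that arithmetic; it is not how any
particular controller or the Linux md driver implements it, and the
function names and stripe contents are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of a list of equal-length byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Hypothetical stripe across three data disks (illustrative values only).
stripe = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(stripe)

# Pretend the second disk died: rebuild its block from the other two.
recovered = rebuild_missing([stripe[0], stripe[2]], parity)
assert recovered == stripe[1]
```

The work is one XOR per byte per disk, which is why either a dedicated
coprocessor or a modern host CPU can usually keep up with the disks.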

The one thing a software RAID will never be able to match the hardware
RAID controller on is battery backed cache.

> + Using SATA drives is always a bit of risk, as some drives are lying
> about whether they are caching or not.

This is true whether you are using hardware RAID or not.  Turning off
drive caching seems to prevent the problem.  However, with a RAID
controller, the caching can then be moved to the BBU cache, while with
software RAID no such option exists.  Most SATA RAID controllers turn
off the drive cache automagically, like the Escalades seem to do.

> + Using hardware controllers, the array becomes locked to a particular
> vendor. You can't switch controller vendors as the array meta
> information is stored proprietary. In case the Raid is broken to a level
> the controller can't recover automatically this might complicate manual
> recovery by specialists.

And not just a particular vendor, but likely a particular model and even
firmware revision.  For this reason, any 24/7 server should have two
RAID controllers of the same brand running identical arrays, then have
them set up as a mirror across the controllers, assuming you have
controllers that can run cooperatively.  This setup ensures that even if
one of your RAID controllers fails, you then have a fully operational
RAID array for as long as it takes to order and replace the bad
controller.  And having a third as a spare in a cabinet somewhere is
cheap insurance as well.

> + Even battery backed controllers can't guarantee that data written to
> the drives is consistent after a power outage, neither that the drive
> does not corrupt something during the involuntary shutdown / power
> irregularities. (This is theoretical as any server will be UPS backed)

This may be theoretically true, but all the battery backed cache units
I've used have brought the array up clean every time the power has been
lost to them.  And a UPS is no insurance against loss of power.
Cascading power failures are not uncommon when things go wrong.

Now, here's my take on SW versus HW in general:

HW is the way to go for situations where a battery backed cache is
needed.  Heavily written / updated databases are in this category.

Software RAID is a perfect match for databases with a low write to read
ratio, or where you won't be writing enough for the write performance to
be a big issue.  Many data warehouses fall into this category.  In this
case, a JBOD enclosure with a couple of dozen drives and software RAID
gives you plenty of storage for chicken feed.  If the data is all
derived from outside sources, then you can turn on the write cache in
the drives and turn off fsync and it will be plenty fast, just not crash
safe.
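
To make the fsync trade-off concrete, here is a small sketch of what a
durable write does at the OS level compared with a non-durable one.  It
is an analogy in Python, not PostgreSQL's code; `write_record` and the
file name are hypothetical.  Note that even the durable path is only as
honest as the drive: if the disk's write cache lies about the flush,
fsync returning does not mean the data is on the platters.

```python
import os
import tempfile


def write_record(path, data, durable=True):
    """Append data to path; if durable, force it to stable storage."""
    with open(path, "ab") as f:
        f.write(data)
        f.flush()                 # hand Python's buffer to the OS
        if durable:
            os.fsync(f.fileno())  # ask the OS to flush to the device
    return len(data)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "records.log")
    # Crash safe (as far as the hardware is honest), but slower.
    write_record(path, b"row 1\n", durable=True)
    # Fast, but may be lost if power fails before the OS writes it out.
    write_record(path, b"row 2\n", durable=False)
    with open(path, "rb") as f:
        contents = f.read()
```

With no crash, both writes look identical afterwards.  The difference
only shows up when the power goes away at the wrong moment, which is
why skipping fsync is reasonable only for data you can reload.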
