Thread: raid10 hard disk choice

raid10 hard disk choice

From
Linos
Date:
Hello,
    I have to buy a new server, and within my (small) budget I have to choose
one of these two options:

-4 sas 146gb 15k rpm raid10.
-8 sas 146gb 10k rpm raid10.

The server would not be dedicated only to PostgreSQL; it would also be a file
server. The rest of the options, like plenty of RAM and a battery-backed cache
RAID card, are settled, but these two hard disk configurations cost the same
and I am not sure which is better.

If the best option is different for PostgreSQL than for a file server, I would
like to know that too. Thanks.

Regards,
Miguel Angel.

Re: raid10 hard disk choice

From
Matthew Wakeling
Date:
On Thu, 21 May 2009, Linos wrote:
>     i have to buy a new server and in the budget i have (small) i have to
> select one of this two options:
>
> -4 sas 146gb 15k rpm raid10.
> -8 sas 146gb 10k rpm raid10.

It depends what you are doing. I think in most situations, the second
option is better, but there may be a few situations where the reverse is
true.

Basically, the first option will only be faster if you are doing lots of
seeking (small requests) in a single thread. As soon as you go
multi-threaded or are looking at sequential scans, you're better off with
more discs.
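
A rough way to see this is a back-of-envelope model of random reads. This is only a sketch: the average seek times below are assumed typical figures, not numbers from this thread or from any datasheet, and `array_read_iops` is a hypothetical helper that ignores caching, command queueing and sequential transfer rate.

```python
# Assumed average seek times in milliseconds (illustrative only, not from
# this thread); check the datasheets of the drives you are actually buying.
ASSUMED_SEEK_MS = {15_000: 3.5, 10_000: 4.5}


def random_iops_per_disk(rpm: int, avg_seek_ms: float) -> float:
    """Random I/Os per second for one disk: 1 / (seek + rotational latency)."""
    # Average rotational latency is half a revolution.
    rotational_latency_ms = 0.5 * 60_000 / rpm
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)


def array_read_iops(rpm: int, n_disks: int, concurrent_requests: int) -> float:
    """Aggregate random-read rate, assuming any disk in the RAID 10 set
    (either side of a mirror) can serve a read, and each outstanding
    request keeps at most one disk busy."""
    busy_disks = min(n_disks, concurrent_requests)
    return busy_disks * random_iops_per_disk(rpm, ASSUMED_SEEK_MS[rpm])


for requests in (1, 8):
    fast = array_read_iops(15_000, 4, requests)
    wide = array_read_iops(10_000, 8, requests)
    print(f"{requests} outstanding: 4x15k ~{fast:.0f} IOPS, 8x10k ~{wide:.0f} IOPS")
```

Under those assumptions the 15k array is ahead at low concurrency, and the 10k array wins once enough requests are in flight to keep all eight spindles busy.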

Matthew

--
 Experience is what allows you to recognise a mistake the second time you
 make it.

Re: raid10 hard disk choice

From
Merlin Moncure
Date:
On Thu, May 21, 2009 at 8:47 AM, Linos wrote:
> Hello,
>        i have to buy a new server and in the budget i have (small) i have to
> select one of this two options:
>
> -4 sas 146gb 15k rpm raid10.
> -8 sas 146gb 10k rpm raid10.
>
> The server would not be only dedicated to postgresql but to be a file
> server, the rest of options like plenty of ram and battery backed cache raid
> card are done but this two different hard disk configuration have the same
> price and i am not sure what it is better.
>
> If the best option it is different for postgresql that for a file server i
> would like to know too, thanks.

I would say go with the 10k drives.  more space, flexibility (you can
dedicate a volume to WAL), and more total performance on paper.  I
would also, if you can afford it and they fit, get two small sata
drives, mount raid 1 and put the o/s on those.

merlin

Re: raid10 hard disk choice

From
Robert Schnabel
Date:
Matthew Wakeling wrote:
> On Thu, 21 May 2009, Linos wrote:
>>     i have to buy a new server and in the budget i have (small) i
>> have to select one of this two options:
>>
>> -4 sas 146gb 15k rpm raid10.
>> -8 sas 146gb 10k rpm raid10.
>
> It depends what you are doing. I think in most situations, the second
> option is better, but there may be a few situations where the reverse
> is true.
>
> Basically, the first option will only be faster if you are doing lots
> of seeking (small requests) in a single thread. As soon as you go
> multi-threaded or are looking at sequential scans, you're better off
> with more discs.
>
> Matthew
>
I agree.  I think you would be better off with more disks.  I know from
my own experience when I went from 8 73gb 15k drives to 16 73gb 15k
drives I noticed a big difference in the amount of time it took to run
my queries.  I can't give you hard numbers but most of my queries take
hours to run so you tend to notice when they finish 20-30 minutes
sooner.  The second option also doubles your capacity which in general
is a good idea.  It's always easier to slightly overbuild than try and
fix a storage problem.
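
For the capacity point, here is a small sketch of the usable space of the two options. It assumes plain RAID 10 where every block is mirrored exactly once; `raid10_usable_gb` is just an illustrative helper name.

```python
def raid10_usable_gb(n_disks: int, disk_gb: int) -> int:
    """Usable capacity of a RAID 10 set: half the raw capacity,
    because every block is stored on two disks."""
    return n_disks * disk_gb // 2


print("4 x 146 GB:", raid10_usable_gb(4, 146), "GB usable")
print("8 x 146 GB:", raid10_usable_gb(8, 146), "GB usable")
```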

Might I also suggest that you pick up at least one spare drive while
you're at it.

This may be somewhat of a tangent but it speaks to having a spare drive
on hand.  A word of warning for anyone out there considering the Seagate
1.5TB SATA drives (ST31500341AS).  (I use them for an off-site backup of
a backup array, not pg.)  I'm going through a fiasco right now with these
drives and I wish I had purchased more when I did.  I built a backup
array with 16 of these back in October and it works great.  In October
these drives shipped with firmware SD17.  I needed to add another 16
drive array but the ST31500341AS drives that are currently shipping have
a non-flashable CC1H firmware that will not work on high port count
Adaptec cards (5XXXX) which is what I have.  It is now impossible to
find any of these drives with firmware compatible with my controller;
trust me, I spent a couple of hours on the phone with Seagate.  When I built
the first array I bought a single spare drive.  As soon as two drives
die I'm going to be in the position of having to either scrap all of
them or buy a new controller that will work with the new firmware.  If I
hadn't bought that extra drive the array would be dead as soon as one of
the drives goes.

My point is... if you have the means, buy at least one spare while you can.

Bob


Re: raid10 hard disk choice

From
"Joshua D. Drake"
Date:
On Thu, 2009-05-21 at 10:25 -0400, Merlin Moncure wrote:
> On Thu, May 21, 2009 at 8:47 AM, Linos wrote:
> > Hello,
> >        i have to buy a new server and in the budget i have (small) i have to
> > select one of this two options:
> >
> > -4 sas 146gb 15k rpm raid10.
> > -8 sas 146gb 10k rpm raid10.
> >
> > The server would not be only dedicated to postgresql but to be a file
> > server, the rest of options like plenty of ram and battery backed cache raid
> > card are done but this two different hard disk configuration have the same
> > price and i am not sure what it is better.
> >
> > If the best option it is different for postgresql that for a file server i
> > would like to know too, thanks.
>
> I would say go with the 10k drives.  more space, flexibility (you can
> dedicate a volume to WAL), and more total performance on paper.  I
> would also, if you can afford it and they fit, get two small sata
> drives, mount raid 1 and put the o/s on those.

+1 on that.

Joshua D. Drake


--
PostgreSQL - XMPP: 
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997


Re: raid10 hard disk choice

From
Scott Marlowe
Date:
On Thu, May 21, 2009 at 8:34 AM, Robert Schnabel wrote:
> the phone with Seagate.  When I built the first array I bought a single
> spare drive.  As soon as two drives die I'm going to be in the position of
> having to either scrap all of them or buy a new controller that will work
> with the new firmware.  If I hadn't bought that extra drive the array would
> be dead as soon as one of the drives goes.
>
> My point is... if you have the means, buy at least one spare while you can.

I'd go shopping for more spares on ebay now...

Re: raid10 hard disk choice

From
Craig James
Date:
Matthew Wakeling wrote:
> On Thu, 21 May 2009, Linos wrote:
>>     i have to buy a new server and in the budget i have (small) i have
>> to select one of this two options:
>>
>> -4 sas 146gb 15k rpm raid10.
>> -8 sas 146gb 10k rpm raid10.
>
> It depends what you are doing. I think in most situations, the second
> option is better, but there may be a few situations where the reverse is
> true.
>
> Basically, the first option will only be faster if you are doing lots of
> seeking (small requests) in a single thread. As soon as you go
> multi-threaded or are looking at sequential scans, you're better off
> with more discs.

Since you have to share the disks with a file server, which might be heavily
used, the 8-disk array will probably be better even if you're doing lots of
seeking in a single thread.

Craig

Re: raid10 hard disk choice

From
Robert Haas
Date:
On Thu, May 21, 2009 at 8:59 AM, Matthew Wakeling wrote:
> On Thu, 21 May 2009, Linos wrote:
>>
>>        i have to buy a new server and in the budget i have (small) i have
>> to select one of this two options:
>>
>> -4 sas 146gb 15k rpm raid10.
>> -8 sas 146gb 10k rpm raid10.
>
> It depends what you are doing. I think in most situations, the second option
> is better, but there may be a few situations where the reverse is true.

One possible case of this - I believe that 15K drives will allow you
to commit ~250 times per second (15K/60) vs. ~166 times per second
(10K/60).  If you have a lot of small write transactions, this might
be an issue.
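
That arithmetic, written out as a sketch. It assumes each commit has to wait one full platter rotation and that no write-back cache sits in front of the disks; the function name is only for illustration.

```python
def max_commits_per_second(rpm: int) -> float:
    """Upper bound on commits/s when each commit waits one rotation."""
    # rpm / 60 gives rotations per second, and at most one commit
    # can complete per rotation under the stated assumption.
    return rpm / 60.0


for rpm in (15_000, 10_000):
    print(f"{rpm} rpm: about {max_commits_per_second(rpm):.1f} commits/s")
```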

...Robert

Re: raid10 hard disk choice

From
Scott Marlowe
Date:
On Thu, May 21, 2009 at 2:29 PM, Robert Haas wrote:
> On Thu, May 21, 2009 at 8:59 AM, Matthew Wakeling wrote:
>> On Thu, 21 May 2009, Linos wrote:
>>>
>>>        i have to buy a new server and in the budget i have (small) i have
>>> to select one of this two options:
>>>
>>> -4 sas 146gb 15k rpm raid10.
>>> -8 sas 146gb 10k rpm raid10.
>>
>> It depends what you are doing. I think in most situations, the second option
>> is better, but there may be a few situations where the reverse is true.
>
> One possible case of this - I believe that 15K drives will allow you
> to commit ~250 times per second (15K/60) vs. ~166 times per second
> (10K/60).  If you have a lot of small write transactions, this might
> be an issue.

But in a RAID-10 you aggregate pairs like RAID-0, so you could write
250(n/2) times per second on 15k where n=4 and 166(n/2) for 10k drives
where n=8.  So 500 versus 664... ?  Or am I getting it wrong?
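
Spelled out, the estimate being asked about looks like this. It is only a sketch of the arithmetic, and it bakes in the assumption under question, namely that commits spread evenly across the mirrored pairs; `raid10_commit_estimate` is a made-up name.

```python
def raid10_commit_estimate(per_disk_commits: int, n_disks: int) -> int:
    """Per-disk commit rate multiplied by the number of mirrored pairs.

    Assumes every commit lands on a different pair, which is exactly
    the assumption being questioned in the message above.
    """
    mirrored_pairs = n_disks // 2
    return per_disk_commits * mirrored_pairs


print("4 x 15k:", raid10_commit_estimate(250, 4))
print("8 x 10k:", raid10_commit_estimate(166, 8))
```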

Re: raid10 hard disk choice

From
Scott Carey
Date:
On 5/21/09 2:41 PM, "Scott Marlowe" wrote:

> On Thu, May 21, 2009 at 2:29 PM, Robert Haas wrote:
>> On Thu, May 21, 2009 at 8:59 AM, Matthew Wakeling wrote:
>>> On Thu, 21 May 2009, Linos wrote:
>>>>
>>>>        i have to buy a new server and in the budget i have (small) i have
>>>> to select one of this two options:
>>>>
>>>> -4 sas 146gb 15k rpm raid10.
>>>> -8 sas 146gb 10k rpm raid10.
>>>
>>> It depends what you are doing. I think in most situations, the second option
>>> is better, but there may be a few situations where the reverse is true.
>>
>> One possible case of this - I believe that 15K drives will allow you
>> to commit ~250 times per second (15K/60) vs. ~166 times per second
>> (10K/60).  If you have a lot of small write transactions, this might
>> be an issue.
>
> But in a RAID-10 you aggregate pairs like RAID-0, so you could write
> 250(n/2) times per second on 15k where n=4 and 166(n/2) for 10k drives
> where n=8.  So 500 versus 664... ?  Or am I getting it wrong.

From the original message:

"The server would not be only dedicated to postgresql but to be a file
server, the rest of options like plenty of ram and battery backed cache raid
card are done but this two different hard disk configuration have the same
price and i am not sure what it is better."


So, with a battery-backed write-back cache on the RAID card, xlog writes
won't be an issue.

>
> --
> Sent via pgsql-performance mailing list
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>


Re: raid10 hard disk choice

From
Robert Haas
Date:
On Thu, May 21, 2009 at 5:41 PM, Scott Marlowe wrote:
> On Thu, May 21, 2009 at 2:29 PM, Robert Haas wrote:
>> On Thu, May 21, 2009 at 8:59 AM, Matthew Wakeling wrote:
>>> On Thu, 21 May 2009, Linos wrote:
>>>>
>>>>        i have to buy a new server and in the budget i have (small) i have
>>>> to select one of this two options:
>>>>
>>>> -4 sas 146gb 15k rpm raid10.
>>>> -8 sas 146gb 10k rpm raid10.
>>>
>>> It depends what you are doing. I think in most situations, the second option
>>> is better, but there may be a few situations where the reverse is true.
>>
>> One possible case of this - I believe that 15K drives will allow you
>> to commit ~250 times per second (15K/60) vs. ~166 times per second
>> (10K/60).  If you have a lot of small write transactions, this might
>> be an issue.
>
> But in a RAID-10 you aggregate pairs like RAID-0, so you could write
> 250(n/2) times per second on 15k where n=4 and 166(n/2) for 10k drives
> where n=8.  So 500 versus 664... ?  Or am I getting it wrong.

Well, that would be true if every write used a different disk, but I
don't think that will be the case in practice.  The WAL writes are
very small, so often you'll have multiple writes even to the same
block.  But even if they're to different blocks they're likely to be
in the same RAID stripe.

...Robert

Re: raid10 hard disk choice

From
Scott Carey
Date:
On 5/21/09 3:05 PM, "Robert Haas" wrote:

> On Thu, May 21, 2009 at 5:41 PM, Scott Marlowe wrote:
>> On Thu, May 21, 2009 at 2:29 PM, Robert Haas wrote:
>>> On Thu, May 21, 2009 at 8:59 AM, Matthew Wakeling wrote:
>>>> On Thu, 21 May 2009, Linos wrote:
>>>>>
>>>>>        i have to buy a new server and in the budget i have (small) i have
>>>>> to select one of this two options:
>>>>>
>>>>> -4 sas 146gb 15k rpm raid10.
>>>>> -8 sas 146gb 10k rpm raid10.
>>>>
>>>> It depends what you are doing. I think in most situations, the second
>>>> option is better, but there may be a few situations where the reverse is
>>>> true.
>>>
>>> One possible case of this - I believe that 15K drives will allow you
>>> to commit ~250 times per second (15K/60) vs. ~166 times per second
>>> (10K/60).  If you have a lot of small write transactions, this might
>>> be an issue.
>>
>> But in a RAID-10 you aggregate pairs like RAID-0, so you could write
>> 250(n/2) times per second on 15k where n=4 and 166(n/2) for 10k drives
>> where n=8.  So 500 versus 664... ?  Or am I getting it wrong.
>
> Well, that would be true if every write used a different disk, but I
> don't think that will be the case in practice.  The WAL writes are
> very small, so often you'll have multiple writes even to the same
> block.  But even if they're to different blocks they're likely to be
> in the same RAID stripe.

Disk count and stripe size don't have much to do with it: the write cache
merges write requests, and the client (the WAL write) doesn't have to wait
on anything.  The RAID card can merge and order the writes, so it can go at
nearly the sequential transfer rate, limited more by other concurrent
pressure on the RAID card's cache than by anything else.

Since WAL requests are sequential (but small), this provides huge gains
and a large multiplier over the raw IOPS of the drive.
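
A toy illustration of the merging idea, not a description of how any particular RAID controller's firmware works. The `(offset, length)` write representation and the `merge_sequential_writes` helper are assumptions made up for this sketch.

```python
def merge_sequential_writes(writes):
    """Coalesce (offset, length) writes that touch or overlap.

    Mimics, very loosely, a write-back cache turning many small
    sequential writes into fewer, larger ones before they hit disk.
    """
    merged = []
    for offset, length in sorted(writes):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            # Contiguous with (or overlapping) the previous run: extend it.
            last_offset, last_length = merged[-1]
            end = max(last_offset + last_length, offset + length)
            merged[-1] = (last_offset, end - last_offset)
        else:
            # Gap before this write: start a new run.
            merged.append((offset, length))
    return merged


# Three small back-to-back writes plus one unrelated write elsewhere.
pending = [(0, 512), (512, 512), (1024, 1024), (8192, 512)]
print(merge_sequential_writes(pending))
```

Here the three adjacent small writes collapse into a single larger one, which is the effect that lets the disk see far fewer physical writes than the number of commits issued.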



Re: raid10 hard disk choice

From
Greg Smith
Date:
On Thu, 21 May 2009, Scott Marlowe wrote:

> But in a RAID-10 you aggregate pairs like RAID-0, so you could write
> 250(n/2) times per second on 15k where n=4 and 166(n/2) for 10k drives
> where n=8.  So 500 versus 664... ?  Or am I getting it wrong.

Adding more spindles doesn't improve the fact that the disks can only
commit once per revolution.  WAL writes are way too fine grained for them
to get split across stripes to improve the commit rate.

--
* Greg Smith  http://www.gregsmith.com Baltimore, MD

Re: raid10 hard disk choice

From
Greg Smith
Date:
On Thu, 21 May 2009, Robert Schnabel wrote:

> A word of warning for anyone out there considering the Seagate 1.5TB
> SATA drives (ST31500341AS)...I'm going through a fiasco right now with
> these drives and I wish I had purchased more when I did.

Those drives are involved in the worst firmware debacle Seagate has had in
years, so no surprise they're causing problems for you just like so many
others.  I don't think you came to the right conclusion for how to avoid
this pain in the future though--buying more garbage drives isn't really
satisfying.

What you should realize is to never assemble a production server using
newly designed drives.  Always stay at least 6 months and at least one
generation behind the state of the art.  All the drive manufacturers right
now are lucky if they can deliver a reliable 1TB drive; nobody has a
reliable 1.5TB or larger drive yet.  (Check out the miserable user ratings
for all the larger capacity drives available right now on sites like
newegg.com if you don't believe me.)  Right now, Seagate's 1.5TB drive is 7
months old, and I'd still consider it bleeding edge for server use.

--
* Greg Smith  http://www.gregsmith.com Baltimore, MD

Re: raid10 hard disk choice

From
Linos
Date:
Thanks for all the suggestions. I will go with 8 10k disks, well, 9 if you
count the spare, now that I am scared :)

Regards,
Miguel Angel.

Re: raid10 hard disk choice

From
Robert Schnabel
Date:
Greg Smith wrote:
> On Thu, 21 May 2009, Robert Schnabel wrote:
>> A word of warning for anyone out there considering the Seagate 1.5TB
>> SATA drives (ST31500341AS)...I'm going through a fiasco right now
>> with these drives and I wish I had purchased more when I did.
> I don't think you came to the right conclusion for how to avoid this
> pain in the future though--buying more garbage drives isn't really
> satisfying.
No, the original drives I have work fine.  The problem, as you point
out, is that Seagate changed the firmware and made it so that you cannot
flash it to a different version.

> What you should realize is to never assemble a production server using
> newly designed drives.
I totally agree.  I'm using these drives for an off-site backup of a
backup.  There is no original data on these.  I needed the capacity.  I
was willing to accept the performance/reliability hit considering the $$/TB.

Bob


Re: raid10 hard disk choice

From
Greg Smith
Date:
On Fri, 22 May 2009, Robert Schnabel wrote:

> No, the original drives I have work fine.  The problem, as you point out, is
> that Seagate changed the firmware and made it so that you cannot flash it to
> a different version.

The subtle point here is that whether a drive has been out long enough to
have a stable firmware is very much a component of its overall quality and
reliability--regardless of whether the drive works fine in any one system
or not.  The odds that you'll get a RAID compatibility-breaking firmware
change in the first few months a drive is on the market are painfully
high.

You don't have to defend that it was the right decision for you, I was
just uncomfortable with the way you were extrapolating your experience to
provide a larger rule of thumb.  Allocated hot spares and cold spares on
the shelf are both important, but for most people those should be a safety
net on top of making the safest hardware choice, rather than as a way to
allow taking excessive risks in what you buy.

--
* Greg Smith  http://www.gregsmith.com Baltimore, MD

Re: raid10 hard disk choice

From
Scott Marlowe
Date:
On Fri, May 22, 2009 at 9:08 AM, Greg Smith wrote:
> On Fri, 22 May 2009, Robert Schnabel wrote:
>
>> No, the original drives I have work fine.  The problem, as you point out,
>> is that Seagate changed the firmware and made it so that you cannot flash it
>> to a different version.
>
> The subtle point here is that whether a drive has been out long enough to
> have a stable firmware is very much a component of its overall quality and
> reliability--regardless of whether the drive works fine in any one system or
> not.  The odds that you'll get a RAID compatibility-breaking firmware change
> in the first few months a drive is on the market are painfully high.

Also keep in mind that 1.5 and 2TB drives that are out right now are
all consumer grade drives, built to be put into a workstation singly
or maybe in pairs.  It's much less common to see such a change in
server class drives, because the manufacturers know where they'll be
used, and also because the server grade drives usually piggyback on
the workstation class drives for a lot of their tech and bios, so the
need for sudden changes is less common.

Re: raid10 hard disk choice

From
Greg Smith
Date:
On Fri, 22 May 2009, Scott Marlowe wrote:

> It's much less common to see such a change in server class drives

This is a good point, and I just updated
http://wiki.postgresql.org/wiki/SCSI_vs._IDE/SATA_Disks with a section
about this topic (the last one under "ATA Disks").

--
* Greg Smith  http://www.gregsmith.com Baltimore, MD

Re: raid10 hard disk choice

From
Robert Schnabel
Date:
Greg Smith wrote:
> On Fri, 22 May 2009, Scott Marlowe wrote:
>
>> It's much less common to see such a change in server class drives
>
> This is a good point, and I just updated
> http://wiki.postgresql.org/wiki/SCSI_vs._IDE/SATA_Disks with a section
> about this topic (the last one under "ATA Disks").
And I can confirm that point because the 1TB SAS drives (ST31000640SS) I
just received to replace all the 1.5TB drives have the same firmware as
the ones I purchased back in October.  60% more $$, 33% less capacity...
but they work :-)  Lesson learned.