Thread: disk I/O problems and Solutions


disk I/O problems and Solutions

From
Alan McKay
Date:
Hey folks,

CentOS / PostgreSQL shop over here.

I'm hitting 3 of my favorite lists with this, so here's hoping that
the BCC trick is the right way to do it :-)

We've just discovered thanks to a new Munin plugin
http://blogs.amd.co.at/robe/2008/12/graphing-linux-disk-io-statistics-with-munin.html
that our production DB is completely maxing out in I/O for about a
three-hour stretch from 6am to 9am.
This is "device utilization" as per the last graph at the above link.

Load went down for a while but is now between 70% and 95% sustained.
We've only had this plugin going for less than a day, so I don't
really have any more data going back further.  But we've suspected a
disk issue for some time - we just haven't been able to prove it.

Our system
IBM 3650 - quad-core 2GHz Xeon E5405
8K SAS RAID Controller
6 x 300G 15K/RPM SAS Drives
/dev/sda - 2 drives configured as a RAID 1 for 300G for the OS
/dev/sdb - 3 drives configured as RAID5 for 600G for the DB
1 drive as a global hot spare

/dev/sdb is the one that is maxing out.

We need to take a very serious look at fixing this situation.  But we
don't have the money to experiment with solutions that won't solve
our problem, and our budget is fairly limited.

Is there a public library somewhere of disk subsystems and their
performance figures?  Done with some semblance of a standard
benchmark?

One benchmark I am partial to is this one:
http://wiki.postgresql.org/wiki/PgCon_2009/Greg_Smith_Hardware_Benchmarking_notes#dd_test
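
The core of that test, for anyone not clicking through, is timing a
write and then a read of a file about twice the size of RAM so the OS
cache can't flatter the numbers.  A rough sketch, assuming 8GB of RAM
(so a ~16GB file in 8kB blocks - adjust count for your own RAM; and
/db/ddfile is just a placeholder path on the array under test):

    # write ~16GB (2,000,000 x 8kB); the sync is part of the timing
    time sh -c "dd if=/dev/zero of=/db/ddfile bs=8k count=2000000 && sync"
    # read it back after clearing the cache (remount or reboot)
    time dd if=/db/ddfile of=/dev/null bs=8k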

One thing I am thinking of in the immediate term is taking the RAID5 +
hot spare and converting it to RAID10 with the same amount of storage.
Will that perform much better?

In general we are planning to move away from RAID5 toward RAID10.

We also have on order an external IBM array (don't have the exact name
on hand but model number was 3000) with 12 drive bays.  We ordered it
with just 4 x SATAII drives, and were going to put it on a different
system as a RAID10.  These are just 7200 RPM drives - the goal was
cheaper storage because the SAS drives are about twice as much per
drive, and it is only a 300G drive versus the 1T SATA2 drives.   IIRC
the SATA2 drives are about $200 each and the SAS 300G drives about
$500 each.

So I have two thoughts with this 12-disk array.  One is to fill it up
with 12 x cheap SATA2 drives and hope that even though the spin rate
is a lot slower, the fact that it has more drives will make it
perform better.  But somehow I am doubtful about that.  The other
thought is to bite the bullet and fill it up with 300G SAS drives.

Any thoughts here?  Recommendations on what to do with a tight budget?
It could be that the answer is I just have to go back to the bean
counters and tell them we have no choice but to start spending some
real money.  But on what?  And how do I prove that this is the only
choice?


--
“Don't eat anything you've ever seen advertised on TV”
         - Michael Pollan, author of "In Defense of Food"

Re: disk I/O problems and Solutions

From
Flavio Henrique Araque Gurgel
Date:
----- "Alan McKay" <alan.mckay@gmail.com> escreveu:
> CentOS / PostgreSQL shop over here.
>
> Our system
> IBM 3650 - quad 2Ghz e5405 Xeon
> 8K SAS RAID Controller
> 6 x 300G 15K/RPM SAS Drives
> /dev/sda - 2 drives configured as a RAID 1 for 300G for the OS
> /dev/sdb - 3 drives configured as RAID5 for 600G for the DB
> 1 drive as a global hot spare
>
> /dev/sdb is the one that is maxing out.

What are you calling "maxing out"?  Excess IOPS, MB/s, or high response times?
Each of these has a different approach when trying to find a solution.

> Is there a public library somewhere of disk subsystems and their
> performance figures?  Done with some semblance of a standard
> benchmark?

You should try the iostat or sar utilities.  Both can give you
complete reports of your live disk activity, and they were probably
the backend tools used by your Munin frontend.

It's very important to understand that the percentage you see is the
share of CPU time during which an I/O operation was in progress.  If
you see 100% you should worry, but not too desperately.
What matters most to me is the disk response time and the queue size.
If those numbers are increasing, then your database performance will
suffer.

Always check the man pages for iostat to understand what those numbers are all about.
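
For example, a quick sketch with sysstat's iostat (column names vary
a bit between versions, so check the man page against your output):

    # -d: device report, -x: extended stats, new sample every 5 seconds
    iostat -dx 5
    # on the sdb line, watch await (average response time in ms),
    # avgqu-sz (average queue length) and %util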

> One thing I am thinking of in the immediate term is taking the RAID5
> +
> hot spare and converting it to RAID10 with the same amount of
> storage.
>  Will that perform much better?

Usually yes for write operations, because the RAID controller doesn't
have to calculate and write parity.  You'll see some improvement in
disk seek times, and your database will be snappier if you have an
OLTP application.

RAID5 can handle more read IOPS, on the other hand.  It can be good
for your pg_xlog directory, but the amount of disk space needed for
WAL is small anyway.

> In general we are planning to move away from RAID5 toward RAID10.
>
> We also have on order an external IBM array (don't have the exact
> name
> on hand but model number was 3000) with 12 drive bays.  We ordered it
> with just 4 x SATAII drives, and were going to put it on a different
> system as a RAID10.  These are just 7200 RPM drives - the goal was
> cheaper storage because the SAS drives are about twice as much per
> drive, and it is only a 300G drive versus the 1T SATA2 drives.   IIRC
> the SATA2 drives are about $200 each and the SAS 300G drives about
> $500 each.

I think it's a good choice.

> So I have 2 thoughts with this 12 disk array.   1 is to fill it up
> with 12 x cheap SATA2 drives and hope that even though the spin-rate
> is a lot slower, that the fact that it has more drives will make it
> perform better.  But somehow I am doubtful about that.   The other
> thought is to bite the bullet and fill it up with 300G SAS drives.
>
> any thoughts here?  recommendations on what to do with a tight
> budget?

Take your new storage system when it arrives, make it RAID10, and
administer it with LVM on Linux.
If you need more performance later, you will be able to stripe across
RAID arrays.

Regards

Flavio Henrique A. Gurgel
Consultant -- 4Linux
tel. 55-11-2125.4765
fax. 55-11-2125.4777
www.4linux.com.br

Re: disk I/O problems and Solutions

From
Scott Marlowe
Date:
On Fri, Oct 9, 2009 at 10:45 AM, Alan McKay <alan.mckay@gmail.com> wrote:
> Hey folks,
>
> CentOS / PostgreSQL shop over here.
>
> I'm hitting 3 of my favorite lists with this, so here's hoping that
> the BCC trick is the right way to do it :-)

I added pgsql-performance back in on this reply so we can share with
the rest of the class.

> We've just discovered thanks to a new Munin plugin
> http://blogs.amd.co.at/robe/2008/12/graphing-linux-disk-io-statistics-with-munin.html
> that our production DB is completely maxing out in I/O for about a 3
> hour stretch from 6am til 9am
> This is "device utilization" as per the last graph at the above link.

What do vmstat, sar, or top have to say about it?  If you're at 100%
I/O wait, then yeah, your disk subsystem is your bottleneck.
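
Something along these lines, for instance:

    # new sample every 5 seconds; the 'wa' column is the percentage of
    # CPU time spent waiting on I/O
    vmstat 5
    # sar reports the same thing in its %iowait column
    sar -u 5 10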

> Our system
> IBM 3650 - quad 2Ghz e5405 Xeon
> 8K SAS RAID Controller

Does this RAID controller have a battery-backed cache on it?

> 6 x 300G 15K/RPM SAS Drives
> /dev/sda - 2 drives configured as a RAID 1 for 300G for the OS
> /dev/sdb - 3 drives configured as RAID5 for 600G for the DB
> 1 drive as a global hot spare
>
> /dev/sdb is the one that is maxing out.

Yeah, with RAID-5 that's not surprising.  Especially if you've got
even a modest percentage of writes in the mix, RAID-5 is gonna be
pretty slow.

> We need to have a very serious look at fixing this situation.   But we
> don't have the money to be experimenting with solutions that won't
> solve our problem.  And our budget is fairly limited.
>
> Is there a public library somewhere of disk subsystems and their
> performance figures?  Done with some semblance of a standard
> benchmark?

Not that I know of, and if there is, I'm as eager as you to find it.

This mailing list's archives are as close as I've come to finding it.

> One benchmark I am partial to is this one :
> http://wiki.postgresql.org/wiki/PgCon_2009/Greg_Smith_Hardware_Benchmarking_notes#dd_test
>
> One thing I am thinking of in the immediate term is taking the RAID5 +
> hot spare and converting it to RAID10 with the same amount of storage.
>  Will that perform much better?

Almost certainly.

> In general we are planning to move away from RAID5 toward RAID10.
>
> We also have on order an external IBM array (don't have the exact name
> on hand but model number was 3000) with 12 drive bays.  We ordered it
> with just 4 x SATAII drives, and were going to put it on a different
> system as a RAID10.  These are just 7200 RPM drives - the goal was
> cheaper storage because the SAS drives are about twice as much per
> drive, and it is only a 300G drive versus the 1T SATA2 drives.   IIRC
> the SATA2 drives are about $200 each and the SAS 300G drives about
> $500 each.

> So I have 2 thoughts with this 12 disk array.   1 is to fill it up
> with 12 x cheap SATA2 drives and hope that even though the spin-rate
> is a lot slower, that the fact that it has more drives will make it
> perform better.  But somehow I am doubtful about that.   The other
> thought is to bite the bullet and fill it up with 300G SAS drives.

I'd give the SATA drives a try.  If they aren't fast enough, then
everybody in the office gets a free / cheap drive upgrade in their
desktop machine.  More drives == faster RAID-10, up to the point
where you saturate the controller / IO bus on your machine.

Re: disk I/O problems and Solutions

From
David Rees
Date:
On Fri, Oct 9, 2009 at 9:45 AM, Alan McKay <alan.mckay@gmail.com> wrote:
> We've just discovered thanks to a new Munin plugin
> http://blogs.amd.co.at/robe/2008/12/graphing-linux-disk-io-statistics-with-munin.html
> that our production DB is completely maxing out in I/O for about a 3
> hour stretch from 6am til 9am
> This is "device utilization" as per the last graph at the above link.

As Flavio mentioned, we really need to know if it's seek limited or
bandwidth limited, but I suspect it's seek limited.  Actual data from
vmstat or sar would be helpful.
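
For instance, something like this with sysstat's sar (column names
vary a bit between versions):

    # -d: per-device activity, -p: pretty device names, 5s samples
    sar -d -p 5 12
    # many tps but modest rd_sec/s + wr_sec/s  -> seek/IOPS limited
    # few tps but large sector counts          -> bandwidth limited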

Also, knowing what kind of RAID controller is being used and whether
or not it has a BBU would be useful.

And finally, you didn't mention which versions of CentOS and
PostgreSQL you're running.

> One thing I am thinking of in the immediate term is taking the RAID5 +
> hot spare and converting it to RAID10 with the same amount of storage.
>  Will that perform much better?

Depends on how the array is IO limited.  But in general, RAID10 >
RAID5 in terms of performance.

> So I have 2 thoughts with this 12 disk array.   1 is to fill it up
> with 12 x cheap SATA2 drives and hope that even though the spin-rate
> is a lot slower, that the fact that it has more drives will make it
> perform better.  But somehow I am doubtful about that.   The other
> thought is to bite the bullet and fill it up with 300G SAS drives.

Not a bad idea.  Keep in mind that your 15k drives can seek about
twice as fast as 7200 rpm drives, so you'll probably need close to
twice as many to match performance with the same configuration.

If you're random IO limited, though, RAID5 will only write about as
fast as a single disk (but sometimes a LOT slower!) - a 12-disk RAID10
will write about 6 times faster than a single disk.  So overall, the
12 disk 7.2k RAID10 array should be significantly faster than the 3
disk 15k RAID5 array.
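
Back-of-envelope, assuming ballpark figures of ~180 random IOPS for a
15k drive and ~90 for a 7200rpm drive (typical numbers, not
measurements of these particular models):

    3-disk 15k RAID5, random writes:    ~1 drive's worth  = ~180 IOPS
      (every write is a read-modify-write of data plus parity)
    12-disk 7.2k RAID10, random writes: ~6 drives' worth  = ~540 IOPS
      (each write hits one mirror pair, and there are 6 pairs)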

> any thoughts here?  recommendations on what to do with a tight budget?
>  It could be the answer is that I just have to go back to the bean
> counters and tell them we have no choice but to start spending some
> real money.  But on what?  And how do I prove that this is the only
> choice?

It's hard to say without knowing all the information.  One free
possibility would be to move the log data onto the RAID1 from the
RAID5, thus splitting your database load over all of your disks.
You can do this by moving the pg_xlog directory to the RAID1 array
and symlinking it back into your data directory.  You should be able
to try this with just a few seconds of downtime.
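
A minimal sketch of that move, assuming the data directory is
/var/lib/pgsql/data and the RAID1 is mounted at /opt (substitute your
real paths):

    # stop PostgreSQL so nothing writes WAL mid-move
    service postgresql stop
    mv /var/lib/pgsql/data/pg_xlog /opt/pg_xlog
    ln -s /opt/pg_xlog /var/lib/pgsql/data/pg_xlog
    service postgresql start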

-Dave

Re: disk I/O problems and Solutions

From
Scott Carey
Date:
>
>> any thoughts here?  recommendations on what to do with a tight budget?
>>  It could be the answer is that I just have to go back to the bean
>> counters and tell them we have no choice but to start spending some
>> real money.  But on what?  And how do I prove that this is the only
>> choice?
>
> It's hard to say without knowing all the information.  One free
> possibility would be to move the log data onto the RAID1 from the
> RAID5, thus splitting up your database load over all of your disks.
> You can do this by moving the pg_xlog folder to the RAID1 array and
> symlink it back to your data folder.  Should be able to try this with
> just a few seconds of downtime.
>

Do the above first.
Then, on your sdb, set the I/O scheduler to 'deadline'.
If it is ext3, mount sdb with 'noatime,data=writeback'.

If you have your pg_xlog on your RAID5, using ext3 in 'ordered' mode,
then you are continuously throwing small writes at it.  If that is
the case, the configuration changes above will most likely double
your performance.
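
Concretely, something like this - treat it as a sketch, since the
scheduler setting doesn't survive a reboot unless you also put
elevator=deadline on the kernel line, and ext3 generally refuses to
change data= modes on a live remount, so that part belongs in
/etc/fstab:

    # switch sdb to the deadline I/O scheduler, effective immediately
    echo deadline > /sys/block/sdb/queue/scheduler
    # in /etc/fstab (assuming sdb1 is mounted at /db):
    #   /dev/sdb1  /db  ext3  noatime,data=writeback  1 2
    mount -o remount,noatime /db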


> -Dave
>