Thread: [OT] RAID controllers blocking one another?

[OT] RAID controllers blocking one another?

From
"Sean Davis"
Date:
We have a machine that serves as both a fileserver and a database server.  It hosts a RAID array of 40 disk drives attached to two 3ware cards, one 9640SE-24 and one 9640SE-16.  We have noticed that activity on one controller blocks access to the second controller, not only for disk I/O but also for the command-line tools, which become unresponsive for the inactive controller.  The controllers sit in adjacent PCI Express slots on a machine with dual-dual AMD and 16GB of RAM.  Has anyone else noticed issues like this?  Throughput for either controller is a pretty respectable 150-200MB/s writing and somewhat faster for reading, but the "blocking" is problematic, as the machine is serving multiple purposes.

I know this is off-topic, but I know lots of folks here deal with very large disk arrays; it is hard to get real-world input on machines such as these. 


Thanks,
Sean

Re: [OT] RAID controllers blocking one another?

From
"Scott Marlowe"
Date:
On Jan 17, 2008 2:17 PM, Sean Davis <sdavis2@mail.nih.gov> wrote:
> We have a machine that serves as a fileserver and a database server.  Our
> server hosts a raid array of 40 disk drives, attached to two 3ware cards,
> one 9640SE-24 and one 9640SE-16. We have noticed that activity on one
> controller blocks access on the second controller, not only for disk-IO but
> also the command line tools which become unresponsive for the inactive
> controller.   The controllers are sitting in adjacent PCI-express slots on a
> machine with dual-dual AMD and 16GB of RAM.  Has anyone else noticed issues
> like this?  Throughput for either controller is a pretty respectable
> 150-200MB/s writing and somewhat faster for reading, but the "blocking" is
> problematic, as the machine is serving multiple purposes.
>
> I know this is off-topic, but I know lots of folks here deal with very large
> disk arrays; it is hard to get real-world input on machines such as these.

Sounds like they're sharing something they shouldn't be.  I'm not real
familiar with PCI-express.  Aren't those the ones that use up to 16
channels for I/O?  Can you divide it to 8 and 8 for each PCI-express
slot in the BIOS maybe, or something like that?

Just a SWAG.

Re: [OT] RAID controllers blocking one another?

From
Greg Smith
Date:
On Thu, 17 Jan 2008, Scott Marlowe wrote:

> On Jan 17, 2008 2:17 PM, Sean Davis <sdavis2@mail.nih.gov> wrote:
>> two 3ware cards, one 9640SE-24 and one 9640SE-16
> Sounds like they're sharing something they shouldn't be.  I'm not real
> familiar with PCI-express.  Aren't those the ones that use up to 16
> channels for I/O?  Can you divide it to 8 and 8 for each PCI-express
> slot in the BIOS maybe, or something like that?

I can't find the 9640SE-24/16 anywhere, but presuming these are similar to
(or are actually) the 9650SE cards then each of them is using 8 lanes of
the 16 available.  I'd need to know the exact motherboard or system to
even have a clue what the options are for adjusting the BIOS and whether
they are shared or independent.

But I haven't seen one where there's any real ability to adjust how the
I/O is partitioned beyond adjusting what slot you plug things into so
that's probably a dead end anyway.  Given the original symptoms, one thing
I would be suspicious of though is whether there's some sort of IRQ
conflict going on.  Sadly we still haven't left that kind of junk behind
even on current PC motherboards.
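
One way to check for shared interrupt lines on Linux is to look at
/proc/interrupts, where devices sharing an IRQ appear comma-separated on
the same row.  A minimal Python sketch follows.  The helper name
shared_irqs is made up for illustration, the 3w-9xxx driver name in the
comment is assumed to be what the 3ware cards register under, and the
column layout of /proc/interrupts varies somewhat between kernel versions.
A shared line by itself does not prove a conflict; it only shows where to
look.

```python
import os
import re


def shared_irqs(interrupts_text):
    """Return {irq_number: [handler names]} for rows listing 2+ handlers.

    Hypothetical helper: expects text in the format of /proc/interrupts.
    """
    shared = {}
    for line in interrupts_text.splitlines():
        # Numbered rows look roughly like:
        #  16:   5000   4000   IO-APIC-fasteoi   3w-9xxx, 3w-9xxx
        match = re.match(r"\s*(\d+):\s+(.*)", line)
        if not match:
            continue  # header row, or named rows such as NMI / LOC
        irq, rest = match.groups()
        parts = [p.strip() for p in rest.split(",")]
        if len(parts) < 2 or not parts[0]:
            continue  # only one handler on this line
        # The first chunk still carries the counters and controller type;
        # the handler name is its last whitespace-separated token.
        first_handler = parts[0].split()[-1]
        shared[int(irq)] = [first_handler] + parts[1:]
    return shared


if __name__ == "__main__":
    path = "/proc/interrupts"
    if os.path.exists(path):
        with open(path) as fh:
            for irq, handlers in sorted(shared_irqs(fh.read()).items()):
                print(f"IRQ {irq} shared by: {', '.join(handlers)}")
```

If both controllers show up on the same row, moving one card to another
slot is the usual way to change the routing.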

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: [OT] RAID controllers blocking one another?

From
"Steinar H. Gunderson"
Date:
On Thu, Jan 17, 2008 at 03:07:02PM -0600, Scott Marlowe wrote:
> Sounds like they're sharing something they shouldn't be.  I'm not real
> familiar with PCI-express.  Aren't those the ones that use up to 16
> channels for I/O?  Can you divide it to 8 and 8 for each PCI-express
> slot in the BIOS maybe, or something like that?

PCI-E is a point-to-point system.

/* Steinar */
--
Homepage: http://www.sesse.net/

Re: [OT] RAID controllers blocking one another?

From
"Sean Davis"
Date:


On Jan 17, 2008 6:23 PM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Thu, 17 Jan 2008, Scott Marlowe wrote:
>
>> On Jan 17, 2008 2:17 PM, Sean Davis <sdavis2@mail.nih.gov> wrote:
>>> two 3ware cards, one 9640SE-24 and one 9640SE-16
>> Sounds like they're sharing something they shouldn't be.  I'm not real
>> familiar with PCI-express.  Aren't those the ones that use up to 16
>> channels for I/O?  Can you divide it to 8 and 8 for each PCI-express
>> slot in the BIOS maybe, or something like that?
>
> I can't find the 9640SE-24/16 anywhere, but presuming these are similar to
> (or are actually) the 9650SE cards then each of them is using 8 lanes of
> the 16 available.  I'd need to know the exact motherboard or system to
> even have a clue what the options are for adjusting the BIOS and whether
> they are shared or independent.
>
> But I haven't seen one where there's any real ability to adjust how the
> I/O is partitioned beyond adjusting what slot you plug things into so
> that's probably a dead end anyway.  Given the original symptoms, one thing
> I would be suspicious of though is whether there's some sort of IRQ
> conflict going on.  Sadly we still haven't left that kind of junk behind
> even on current PC motherboards.

Thanks, Greg.  After a little digging, 3ware also suggested moving one of the cards.  We will probably give that a try.  I'll also look into the BIOS, but since the machine is running as a fileserver, there is precious little downtime available for tinkering.  FYI, here are the specs on the server.

http://www.thinkmate.com/System/8U_Dual_Xeon_i2SS40-8U_Storage_Server

Sean


Re: [OT] RAID controllers blocking one another?

From
Greg Smith
Date:
On Fri, 18 Jan 2008, Sean Davis wrote:

> FYI, here are the specs on the server.
> http://www.thinkmate.com/System/8U_Dual_Xeon_i2SS40-8U_Storage_Server

Now we're getting somewhere.  I'll dump this on-list as it's a good
example of how to fight this class of performance problems.

The usual troubleshooting procedure is to figure out how the motherboard
is mapping all the I/O internally and then try to move things out of the
same path.  That spec page tells us that you have an Intel S5000PSL
motherboard, and the tech specs are at
http://support.intel.com/support/motherboards/server/s5000psl/sb/CS-022619.htm

What you want to stare at is the block diagram that's Figure 10, page 27
(usually this isn't in the motherboard documentation, and instead you have
to drill down into the chipset documentation to find it).  Slots 5 and 6,
which have PCI Express x16 connectors (but run at x8 speed), both go
straight into the memory hub.  Slots 3 and 4, which have x8 connectors but
run at x4 speed, go through the I/O controller first.  Those should be
slower, so a card placed in one of them will have lower top-end
performance than it would in slots 5/6.
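
To put rough numbers on the x8 versus x4 difference, here is a small
Python sketch.  It assumes these slots are first-generation PCI Express,
at about 250 MB/s per lane in each direction; the board's PCIe generation
is an assumption here, not something stated in the thread, and the
function names are made up for illustration.  The 200 MB/s figure is the
upper end of the write throughput reported in the original message.

```python
# Assumption: PCIe 1.x, roughly 250 MB/s per lane per direction
# (2.5 GT/s with 8b/10b encoding). Real-world throughput is lower.
PCIE1_MB_PER_LANE = 250


def link_bandwidth_mb(lanes, per_lane=PCIE1_MB_PER_LANE):
    """Theoretical one-direction bandwidth of a link, in MB/s."""
    return lanes * per_lane


def headroom(lanes, observed_mb):
    """How many times the observed throughput fits into the link ceiling."""
    return link_bandwidth_mb(lanes) / observed_mb


if __name__ == "__main__":
    observed = 200  # MB/s, upper end of the reported write throughput
    for lanes in (8, 4):
        print(f"x{lanes}: {link_bandwidth_mb(lanes)} MB/s ceiling, "
              f"{headroom(lanes, observed):.1f}x the observed {observed} MB/s")
```

Under that assumption even an x4 link has a theoretical ceiling well above
what a single controller is delivering today, so the extra hop through
the I/O controller may matter more than the lane count.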

Line that up with the layout in Figure 2, page 17, and you should be able
to get an idea of what the possibilities are for moving the cards around
and what the trade-offs are.  Ideally you'd want both 3Ware cards to
be in slots 5+6, but if that's your current configuration you could try
moving the less important of the two (maybe the one with fewer drives) to
slot 3 or 4 and see if the contention you're seeing drops.
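
After moving a card, the link width each device actually negotiated can
be read from the LnkCap and LnkSta lines of `lspci -vv` on Linux (usually
only visible when run as root).  The Python sketch below parses that
output; the function name parse_link_widths is made up for illustration,
and the exact text lspci prints can differ between versions, so treat the
regular expressions as an assumption to verify against your own output.

```python
import re
import shutil
import subprocess


def parse_link_widths(lspci_text):
    """Map each device header line to (capable_width, negotiated_width).

    Hypothetical helper: expects text in the format of `lspci -vv`.
    """
    results = {}
    device = None
    cap = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented line starts a new device, e.g. "03:00.0 RAID bus ..."
            device = line.strip()
            cap = None
            continue
        m = re.search(r"LnkCap:.*?Width x(\d+)", line)
        if m:
            cap = int(m.group(1))  # widest link the device supports
            continue
        m = re.search(r"LnkSta:.*?Width x(\d+)", line)
        if m and device is not None:
            results[device] = (cap, int(m.group(1)))  # width in use now
    return results


if __name__ == "__main__":
    if shutil.which("lspci"):
        out = subprocess.run(["lspci", "-vv"], capture_output=True, text=True)
        for dev, (cap, sta) in parse_link_widths(out.stdout).items():
            note = "  <-- running below capability" if cap and sta < cap else ""
            print(f"x{sta} of x{cap}: {dev}{note}")
```

A card reported as capable of x8 but running at x4 would match what the
block diagram predicts for slots 3 and 4.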

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: [OT] RAID controllers blocking one another?

From
"Sean Davis"
Date:


On Jan 18, 2008 2:14 PM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Fri, 18 Jan 2008, Sean Davis wrote:
>
>> FYI, here are the specs on the server.
>> http://www.thinkmate.com/System/8U_Dual_Xeon_i2SS40-8U_Storage_Server
>
> Now we're getting somewhere.  I'll dump this on-list as it's a good
> example of how to fight this class of performance problems.
>
> The usual troubleshooting procedure is to figure out how the motherboard
> is mapping all the I/O internally and then try to move things out of the
> same path.  That tells us that you have an Intel S5000PSL motherboard, and
> the tech specs are at
> http://support.intel.com/support/motherboards/server/s5000psl/sb/CS-022619.htm
>
> What you want to stare at is the block diagram that's Figure 10, page 27
> (usually this isn't in the motherboard documentation, and instead you have
> to drill down into the chipset documentation to find it).  Slots 5 and 6
> that have PCI Express x16 connectors (but run at x8 speed) both go
> straight into the memory hub.  Slots 3 and 4, which are x8 but run at x4
> speed, go through the I/O controller first.  Those should be slower, so if
> you put a card into there it will have a degraded top-end performance
> compared to slots 5/6.
>
> Line that up with the layout in Figure 2, page 17, and you should be able
> to get an idea what the possibilities are for moving the cards around and
> what the trade-offs involved are.  Ideally you'd want both 3Ware cards to
> be in slots 5+6, but if that's your current configuration you could try
> moving the less important of the two (maybe the one with less drives) to
> either slot 3/4 and see if the contention you're seeing drops.

Greg, this is GREAT information and I'm glad you stepped through the process.  It is really interesting to see how differently slots with similar (or even identical) names can be expected to behave.  We'll try some of the things you suggest (although it will probably take a while), and if we reach any conclusions we'll let everyone know.

Sean


Re: [OT] RAID controllers blocking one another?

From
david@lang.hm
Date:
On Thu, 17 Jan 2008, Sean Davis wrote:

> We have a machine that serves as a fileserver and a database server.  Our
> server hosts a raid array of 40 disk drives, attached to two 3ware cards,
> one 9640SE-24 and one 9640SE-16. We have noticed that activity on one
> controller blocks access on the second controller, not only for disk-IO but
> also the command line tools which become unresponsive for the inactive
> controller.   The controllers are sitting in adjacent PCI-express slots on a
> machine with dual-dual AMD and 16GB of RAM.  Has anyone else noticed issues
> like this?  Throughput for either controller is a pretty respectable
> 150-200MB/s writing and somewhat faster for reading, but the "blocking" is
> problematic, as the machine is serving multiple purposes.
>
> I know this is off-topic, but I know lots of folks here deal with very large
> disk arrays; it is hard to get real-world input on machines such as these.

There have been a lot of discussions on the linux-kernel mailing list over
the last several months on the topic of I/O to one set of drives
interfering with I/O to another set of drives.  The soon-to-be-released
2.6.24 kernel includes a substantial amount of work in this area that (at
least on initial reports) is showing significant improvements.

I haven't had the time to test this out yet, so I can't add personal
experience, but it's definitely something to look at on a test system.

David Lang