Re: Huge Data sets, simple queries

From: Luke Lonergan
Subject: Re: Huge Data sets, simple queries
Date:
Msg-id 3E37B936B592014B978C4415F90D662D023F28D9@MI8NYCMAIL06.Mi8.com
In reply to: Huge Data sets, simple queries  ("Mike Biamonte" <mike@dbeat.com>)
Responses: Re: Huge Data sets, simple queries  ("Jeffrey W. Baker" <jwbaker@acm.org>)
Re: Huge Data sets, simple queries  (Charles Sprickman <spork@bway.net>)
Re: Huge Data sets, simple queries  (hubert depesz lubaczewski <depesz@gmail.com>)
List: pgsql-performance
Depesz,

> [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of
> hubert depesz lubaczewski
> Sent: Sunday, January 29, 2006 3:25 AM
>
> hmm .. do i understand correctly that you're suggesting that
> using raid 10 and/or hardware raid adapter might hurt disc
> subsystem performance? could you elaborate on the reasons,
> please? it's not that i'm against the idea - i'm just curious
> as this is very "against-common-sense". and i always found it
> interesting when somebody states something that uncommon...

See previous postings on this list - often when someone reports a
performance problem with large data, the answer comes back that their
I/O setup is not performing well.  Most of the time, people trust that
when they buy a hardware RAID adapter and set it up, the performance
will be what they expect and what is theoretically achievable for the
number of disk drives.

In fact, in our testing of various host-based SCSI RAID adapters (LSI,
Dell PERC, Adaptec, HP SmartArray), we find that *all* of them
underperform, most of them severely.  Some produce results slower than a
single disk drive.  We've found that some external SCSI RAID adapters,
those built into the disk chassis, often perform better.  I think this
might be due to better drivers, and perhaps to a different marketplace
for the higher-end solutions that drives performance validation.

The important lesson we've learned is to always test the I/O subsystem
performance - you can do so with a simple test like:
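  # write test: 4,000,000 blocks of 8 KiB (about 32 GB), flushed with sync
  time bash -c "dd if=/dev/zero of=bigfile bs=8k count=4000000 && sync"
  # read test: read the same file back and discard it
  time dd if=bigfile of=/dev/null bs=8k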

If the measured rate isn't something close to the theoretical rate, you
are likely limited by your RAID setup.  You might be shocked to find a
severe performance problem.  If either is true, switching to software
RAID using a simple SCSI adapter will fix the problem.
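
To compare the dd test above against a theoretical rate, it helps to
turn the elapsed time into MB/s.  Here is a minimal sketch, assuming
bash and awk are available.  The helper name throughput_mb_s is
hypothetical, and the 200 second figure is only an illustration, not a
measurement from any system:

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert bytes moved and elapsed seconds into MB/s
# (1 MB = 1,000,000 bytes here).
throughput_mb_s() {
  local bytes=$1 seconds=$2
  awk -v b="$bytes" -v s="$seconds" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# The dd test moves 4,000,000 blocks of 8 KiB (dd's "8k" is 8192 bytes).
bytes=$((8192 * 4000000))

# Illustrative elapsed time of 200 s, as would be read off `time`.
throughput_mb_s "$bytes" 200   # prints 163.8
```

Note that the test file should be comfortably larger than RAM, otherwise
the read test mostly measures the OS page cache rather than the disks.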

BTW - we've had very good experiences with the host-based SATA adapters
from 3Ware.  The Areca controllers are also respected.

Oh - and about RAID 10 - for large data work it's more often a waste of
disks performance-wise compared to RAID 5 these days.  RAID 5 will almost
double the performance on a reasonable number of drives.
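
One way to read the "almost double" claim is as simple arithmetic on
data-bearing drives.  This is a rough sketch under stated assumptions,
not a benchmark: sequential throughput is assumed to scale linearly with
the number of drives holding data, RAID 10 is assumed not to read both
halves of a mirror in parallel, and RAID 5 parity overhead on writes is
ignored.  The function name raid_estimate and the 60 MB/s per-drive rate
are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical model: N drives give N/2 data stripes under RAID 10
# and N-1 data stripes under RAID 5.
raid_estimate() {
  local drives=$1 per_drive_mb_s=$2
  local raid10=$(( drives / 2 * per_drive_mb_s ))
  local raid5=$(( (drives - 1) * per_drive_mb_s ))
  echo "drives=$drives raid10=${raid10}MB/s raid5=${raid5}MB/s"
}

# Assumed figures: 8 drives at 60 MB/s each.
raid_estimate 8 60   # prints drives=8 raid10=240MB/s raid5=420MB/s
```

Under this model the ratio is 2(N-1)/N, which approaches 2 as the number
of drives grows.  A real array should still be checked with the dd test.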

- Luke

