Re: Asking for assistance in determining storage requirements

From: Scott Marlowe
Subject: Re: Asking for assistance in determining storage requirements
Msg-id dcc563d10907170823l5aac6663if263608b8d6a0133@mail.gmail.com
In response to: Asking for assistance in determining storage requirements  (Chris Barnes <compuguruchrisbarnes@hotmail.com>)
List: pgsql-general
On Thu, Jul 9, 2009 at 9:15 AM, Chris Barnes <compuguruchrisbarnes@hotmail.com> wrote:
> Your assistance is appreciated.
>
> I have a question regarding disk storage for postgres servers
>
> We are thinking long term about scalable storage and performance and would
> like some advice or feedback about what other people are using.
>
> We would like to get as much performance from our file systems as possible.
>
> We use ibm 3650 quad processor with onboard SAS controller ( 3GB/Sec) with
> 15,000rpm drives.
>
> We use raid 1 for the centos operating system and the wal archive logs.
>
> The postgres database is on 5 drives configured as raid 5 with a global hot
> spare.

OK, two things jump out at me.  One is that you aren't using a
hardware RAID controller with battery backed cache, and the other is
that you're using RAID-5.

For most non-db applications, RAID-5 and no battery backed cache is
just fine.  For some DB applications like a reporting db or batch
processing it's ok too.  For DB applications that handle lots of small
transactions, it's a really bad choice.
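
To put rough numbers on why small transactions hurt on RAID-5: the
usual rule of thumb is that one small random write costs about four
disk I/Os on RAID-5 (read old data, read old parity, write new data,
write new parity) versus two on RAID-10 (one per side of the mirror).
Here is a back-of-the-envelope sketch in Python.  The 180 IOPS per
drive figure is only an assumed ballpark for a 15k drive, the function
name is made up for illustration, and the estimate ignores controller
cache entirely.

```python
# Rule-of-thumb write penalties: disk I/Os needed per small random write.
WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}


def random_write_iops(level, disks, iops_per_disk=180):
    """Rough random-write IOPS for an array.

    iops_per_disk is an assumed ballpark, not a measured value.
    """
    return disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    for level, disks in [("raid5", 5), ("raid10", 4), ("raid10", 6)]:
        estimate = random_write_iops(level, disks)
        print(f"{level} on {disks} disks: ~{estimate:.0f} random writes/sec")
```

With those assumptions a 4 disk RAID-10 already comes out ahead of a
5 disk RAID-5 on random writes, and real numbers should come from
benchmarking your own hardware.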

Looking through the pgsql-performance archives, you'll see RAID-10 and
HW RAID with battery backed cache mentioned over and over again, and
for good reasons.  RAID-10 is much more resilient, and a good HW RAID
controller with battery backed cache can re-order writes into groups
that are near each other on the same drive pair to make overall
throughput higher.  It also makes burst throughput higher, because it
can acknowledge an fsync as soon as the write is in its cache.

I'm assuming you have 8 hard drives to play with.  If that's the case,
you can have a RAID-1 for the OS etc and a RAID-10 with 4 disks and
two hot spares, OR a RAID-10 with 6 disks and no hot spares.  The
second option might work as long as you pay close attention to your
server, catch failed drives and replace them by hand, but running
without a hot spare really sits wrong with me.
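
The trade-off between those two layouts is simple arithmetic, sketched
here in Python.  The 8 drive total is the same assumption as above, and
the function names are invented for the example.

```python
TOTAL_DRIVES = 8  # assumed drive count, as in the paragraph above
OS_MIRROR = 2     # RAID-1 pair for the OS etc


def raid10_usable(disks):
    """Usable data disks in a RAID-10 set: half, since every disk is mirrored."""
    if disks < 4 or disks % 2:
        raise ValueError("RAID-10 needs an even number of disks, at least 4")
    return disks // 2


def layout(raid10_disks):
    """Summarize a layout: OS mirror + RAID-10 set + leftover hot spares."""
    spares = TOTAL_DRIVES - OS_MIRROR - raid10_disks
    if spares < 0:
        raise ValueError("not enough drives for this layout")
    return {
        "raid10_disks": raid10_disks,
        "hot_spares": spares,
        "usable_data_disks": raid10_usable(raid10_disks),
    }


if __name__ == "__main__":
    for n in (4, 6):
        print(layout(n))
```

So the 6 disk RAID-10 buys one more disk of capacity (and more
spindles) at the price of having no spare ready when a drive dies.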

> We are curious about using SAN with fiber channel hba and if anyone else
> uses this technology.

Yep, again, check the pgsql-performance archives.  Note that the level of
complexity is much higher, as is the cost, and if you're talking about
a dozen or two dozen drives, you're often much better off just having
a good direct attached set of disks, either with an embedded RAID
controller, or JBOD and using an internal RAID controller to handle
them.  The top of the line RAID controllers that can handle 24 or so
disks run $1200 to $1500.  Taking the cost of the drives out of the
equation, I'm pretty sure any FC/SAN setup is gonna cost a LOT more
than that single RAID card.  I can buy a 16 drive 32TB DAS box for
about $6k to $7k or so, plug it into a simple but fast SCSI controller
($400 tops) and be up in a few minutes.  Setting up a new SAN is never
that fast, easy, or cheap.

OTOH, if you've got a dozen servers that need lots and lots of
storage, a SAN will start making more sense since it makes managing
lots of hard drives easier.

> We would also like to know if people have preference to the level of raid
> with/out striping.

RAID-10, then RAID-10 again, then RAID-1.  RAID-6 for really big
reporting dbs where storage is more important than performance, and
the data is mostly read anyways.  RAID-5 is to be avoided, period.  If
you have 6 disks in a RAID-6 with no spare, you're better off than a
RAID-5 with 5 disks and a spare, as in RAID-6 the "spare" is kind of
already built in.
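
A quick way to see the RAID-6 versus RAID-5 plus spare point, as a
small Python sketch (the function name is made up for illustration):
both setups use six drives and give four disks' worth of space, but
only RAID-6 survives two drives failing at the same time.  The RAID-5
array is exposed until the spare finishes rebuilding.

```python
# Disks' worth of parity per array, which is also how many
# simultaneous drive failures the array survives.
PARITY_DISKS = {"raid5": 1, "raid6": 2}


def array_summary(level, disks, spares=0):
    """Drive count, usable capacity (in disks) and failures survived at once."""
    parity = PARITY_DISKS[level]
    return {
        "drives_total": disks + spares,
        "usable_data_disks": disks - parity,
        "simultaneous_failures_survived": parity,
    }


if __name__ == "__main__":
    print("RAID-6, 6 disks, no spare:", array_summary("raid6", 6))
    print("RAID-5, 5 disks + 1 spare:", array_summary("raid5", 5, spares=1))
```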
