Re: Database storage

Поиск
Список
Период
Сортировка
От Scott Marlowe
Тема Re: Database storage
Дата
Msg-id dcc563d10907091728l216ef97ck645b328d0ba06d9a@mail.gmail.com
обсуждение исходный текст
Ответ на Database storage  (nabble.30.miller_2555@spamgourmet.com)
Ответы Re: Database storage
Список pgsql-general
On Thu, Jul 9, 2009 at 5:40 PM, <nabble.30.miller_2555@spamgourmet.com> wrote:
> Hi -
>
> I'm running Postgres v8.3 on Fedora 10 (home machine - x86_64
> architecture). After realizing that the storage requirements of one of
> my databases will exceed 4Tb, I wanted to see if anyone had any
> suggestions as to hardware setup that works well with Postgres running
> on Linux. I have partitioned most of the database, so older data can
> go on slower devices with only 400-500Gb needed on faster devices.
> Redundancy is a requirement. General database usage is generally low,
> with burst input. For the curious, here's the general profile:
>    a) ~19,000,000 INSERT statements/ day (190 INSERTs/ commit blocks)
> for a total of ~10 Gb INSERT/day
>    b) INSERTs occur sequentially (daily batch)
>    c) Handful of SELECT statements run per day to generate a few
> dozen reports, but these are not resource-intensive.
>    d) Only 3-5 database users configured.
>
> Initial thought: Use an external multi-bay eSATA case and inserting
> some 7200RPM+, 32Mb cache 1T+ SATA drives running RAID 5.

While RAID-5 is generally not the best database setup, considering
that your inputs will be bulk you're probably ok.  Also, given that
2TB server class drives (5400rpm) are now out, and that they have
about the same throughput as 1TB 7200 RPM server class drives, you
might want to look into them as you can get by on fewer drives for the
same storage.

> A few specific questions:
>    1) Which components generally lead to bottlenecks (controller
> card, individual HD cache, HD rotational speed, other???)

IO is almost always your bottleneck on queries over large data sets.

>    2) Are there better solutions (broad question in terms of price/
> data availability)? Budget: $500 - $750 for the storage medium. Better
> question: Which setup was worked well with a Postgres implementation
> running on a similar budget?

$750 is about what a decent RAID controller would cost you, but again
it's likely that given your bulk import scenario,  you're probably ok
without one.  In this instance, you're probably best off with software
RAID than a cheap RAID card which will cost extra and probably be
slower than linux software RAID.

В списке pgsql-general по дате отправления:

Предыдущее
От: nabble.30.miller_2555@spamgourmet.com
Дата:
Сообщение: Database storage
Следующее
От: Greg Stark
Дата:
Сообщение: Re: SELECT DISTINCT very slow