Thread: Hardware for a database server

Hardware for a database server

From
Erwin Brandstetter
Date:
Hi List!

I have to purchase new hardware for my company's new dedicated PG
database server. As it is my first time shopping for this special
purpose, I would feel less nervous if I could get some knowledgeable
feedback on this matter.


What is it for?

Short version:
It is for a medium sized database. ~ 50 users, < 5GB, biggest table < 1
million tuples, 60 tables, lots of indices, triggers, rules, and other
objects.

Long version:
The server is going to run Debian with PG 7.4.x and will be a
dedicated database server.
The database is going to hold data on upcoming events for print in a
weekly magazine and for various websites fed by it (Falter in Vienna,
Austria - if you wonder). The websites will be fed by slave DBs on other
machines (MySQL or PG as well, replicating relevant data from the
master daily or several times a day), so there won't be too much load on
the master from that. The server in question will mainly serve the web
frontends for data input. Apache and PHP will be used.

There will be about 50 users hacking in data but hardly ever more than
25 at a time. 5-10 normally.

The Database holds about 60 tables and 150 indices. Lots of views,
triggers and rules. I expect the largest table to grow by 20,000 rows
per week and never get above 1 million rows. The size of the database
will strongly depend on whether we include blobs or store them
separately, which has not been decided yet. If we don't inline the
blobs, I don't expect the database to grow over 5 GB, less than 1 GB in
the beginning. Blobs will use several times that. All in all I would
not call it a large database. There will be several smaller databases
on the same server, but all of them together not as big as the first
one. And we expect the thing to grow, but not so as to spend more money
on hardware right now.
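
A quick back-of-the-envelope check of the growth figures above, as a sketch only. It assumes constant growth, an empty table at the start, and no deletions; none of that is stated in the post. The function name `weeks_until` is made up for this illustration.

```python
import math


def weeks_until(limit_rows, rows_per_week, start_rows=0):
    """Whole weeks until a table reaches limit_rows at constant growth.

    Assumes nothing is ever deleted or archived (an assumption, not
    something the post states).
    """
    remaining = max(limit_rows - start_rows, 0)
    return math.ceil(remaining / rows_per_week)


# Figures from the post: 20,000 new rows per week, cap of 1 million rows.
weeks = weeks_until(1_000_000, 20_000)
print(f"cap reached after about {weeks} weeks")
```

Under those assumptions the cap would be reached in roughly a year, so the "never above 1 million rows" estimate presumably relies on old events being purged or archived.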


What will I purchase?

CPU:
Single AMD Opteron.
Opteron, because I plan to migrate to amd64 Debian as soon as Debian
has a stable release.
Single, because multi-CPU would only make sense if the CPU could ever
become the bottleneck. But I don't have to fear that, right? No need for a
dual-CPU setup?
Should I consider Athlon 64 FX or Athlon 64? I guess socket 940 has
more future than socket 754, right?

Motherboard:
?? Depends on the CPU; RAID controller and 2 NICs integrated, maybe? ..

Controller / Hard Discs:
RAID 5 with 4+ discs including a hot spare. But SCSI or SATA?
I am undecided on this. Until about a year ago, I would have said SCSI,
period. But I have read of SATA RAIDs for entry-level-servers doing
quite well and Linux dealing with it ever more smoothly. ([1], [2])
So I wonder if it is still a good decision to spend 3 times the money
per gigabyte on SCSI?
And do 3ware Controllers still have the best driver support under
Linux?
Any hard disks known to be especially apt for databases (high I/O load
..)?

Power supply:
Secured with UPS, auto-shutdown before power fails, so do I need my
RAID controller battery-backed still?

RAM:
As much as the motherboard will bear. 4 GB probably. This seems the
easiest point to decide on. Correct? DDR SDRAM PC333 or PC400?

Other:
2 NICs, ??



I appreciate any comments, hints or corrections.

Regards
Erwin Brandstetter

[1] http://www6.tomshardware.com/storage/20031114/index.html
[2] http://www.linuxmafia.com/faq/Hardware/sata.html

Re: Hardware for a database server

From
"scott.marlowe"
Date:
On Wed, 10 Mar 2004, Erwin Brandstetter wrote:

> Hi List!

Howdy!

> Short version:
> It is for a medium sized database. ~ 50 users, < 5GB, biggest table < 1
> million tuples, 60 tables, lots of indices, triggers, rules, and other
> objects.

SNIP!

> What will I purchase?
>
> CPU:
> Single AMD Opteron.
> Opteron, because I plan to migrate to amd64 Debian as soon as Debian
> has a stable release.
> Single, because multi-CPU would only make sense if the CPU could ever
> become the bottleneck. But I don't have to fear that, right? No need for a
> dual-CPU setup?
> Should I consider Athlon 64 FX or Athlon 64? I guess socket 940 has
> more future than socket 754, right?

I would recommend going with a dual CPU machine.  While having lots of
CPUs isn't likely to be necessary for what you're doing, dual CPUs are not
quite the same thing.  In most cases having dual CPUs allows the OS to run
on one CPU while the database runs on the other, so to speak.  I.e. there's
enough going on to use a second processor a fair bit.  After that, it
really depends on how CPU intensive your database use is.

Generally I've found dual CPU machines to be more responsive than single
CPU machines under heavy load.

> Controller / Hard Discs:
> RAID 5 with 4+ discs including a hot spare. But SCSI or SATA?
> I am undecided on this. Until about a year ago, I would have said SCSI,
> period. But I have read of SATA RAIDs for entry-level-servers doing
> quite well and Linux dealing with it ever more smoothly.

Standard IDE drives have an issue: all the ones I've tested so far, and
presumably most of the rest, lie about fsync, and therefore are not
guaranteed to hold a coherent database should you lose power during a
transaction.  If SATA drives in fact have proper fsyncing with write
caching, then they're a great choice.  You might want to test one or
two before committing to a rack of them.
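
One rough way to run such a test, as a sketch only: time a loop of small writes, each followed by fsync, and compare the rate with the drive's rotation speed. The rule of thumb assumed here is that a drive which really waits for the platter cannot complete many more synced writes per second than it makes rotations (7200 rpm is about 120 per second). The function names and the factor of 2 are illustrative assumptions, and a fast result only hints at an early-acknowledging write cache; pulling the plug mid-write remains the real test.

```python
import os
import tempfile
import time


def fsyncs_per_second(directory, count=200):
    """Append a small block and fsync it `count` times; return the rate."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, b"x" * 512)
            os.fsync(fd)  # ask the OS to push the data to the device
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return count / max(elapsed, 1e-9)


def max_honest_rate(rpm):
    """Rule-of-thumb ceiling: about one synced write per rotation."""
    return rpm / 60.0


def looks_dishonest(rate, rpm, slack=2.0):
    """True if the measured rate is far above the rotational ceiling."""
    return rate > slack * max_honest_rate(rpm)
```

To try it, call `fsyncs_per_second()` with a directory on the drive under test and pass the result, together with the drive's rated rpm, to `looks_dishonest()`.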

SCSI drives seem to pass with flying colors.  With a battery backed
caching RAID controller you can get very good results.

RAID5 with a small number of drives is fine for good read performance, but
not as good as RAID 1+0 in a write-heavy environment.  For >6 drives,
RAID5 starts to catch up under heavy writes as the number of platters
available to spread the writes out on goes up.

That said, we get great performance from RAID1 on our LSI megaraid, nearly
as good as a 5 drive RAID5 for reads, and better for writes.

> ([1], [2])
> So I wonder if it is still a good decision to spend 3 times the money
> per gigabyte on SCSI?

Only if your data is important.  In this instance, it sounds like most of
what you're holding is coming from other sources, and losing a day's worth
of data is no big deal, since you can get it back.  If that's the case,
back up every night, turn off fsync, and run on a rack of IDE or SATA
drives, whichever are cheaper per meg, and spend your money on memory for
the server.

> And do 3ware Controllers still have the best driver support under
> Linux?

LSI/Megaraid's megaraid2 driver is very fast and very stable.  The Adaptec
driver seems to be working pretty well nowadays as well.  Installation on
our Dell boxes with the Adaptec was much more difficult than with the
megaraid2 driver, which uses dkms (Dynamic Kernel Module Support), which is
a very cool system.  You install the dkms rpm, then when you install a
source rpm for a driver, the dkms package kicks in, compiles it, puts it in
the right directory, and you're ready to use it.

I've not used the 3ware controller before.

> Any hard disks known to be especially apt for databases (high I/O load
> ..)?

I really like the performance my Seagate Barracuda 36 gig 10krpm drives
give me.  If it's mostly read, just throw as many drives at it as you can
on fast busses.  Aggregate bandwidth is almost always the real key to fast
performance.


> Power supply:
> Secured with UPS, auto-shutdown before power fails, so do I need my
> RAID controller battery-backed still?

Yep.  Power supplies fail, motherboards fry and take out the power rail
every so often.  Idiots trip over power cords.  hehe.  been there, done
that, got the TShirt.

> RAM:
> As much as the motherboard will bear. 4 GB probably. This seems the
> easiest point to decide on. Correct? DDR SDRAM PC333 or PC400?

If you're looking at 64 bit machines, most of those can hold >4 gig, at
least 8 gig nowadays.  Don't buy tons today, but do buy enough to
interleave access if you have >1 CPU.  Opterons can interleave access, and
theoretically each CPU could get max memory bandwidth if you have enough
banks to allow interleaving.  So it's theoretically possible that an
SMP machine with 8 sticks totalling 1 gig could outrun the same machine
with 2 sticks totalling 2 gigs, since there'd be a 75% chance that the
second CPU accessing memory would not be in contention with the first CPU.
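
The 75% figure matches a very simple model, sketched here under stated assumptions: memory is split into independently accessible banks, each access lands on a uniformly random bank, and the 8 sticks are installed as 4 pairs (the pairing is an assumption, not something stated above). Real memory controllers are more complicated, so treat this as arithmetic only.

```python
def chance_no_contention(banks):
    """Probability that a second CPU's access lands on a different bank.

    Assumes both accesses are uniformly random and independent, which is
    a simplification for illustration.
    """
    if banks < 1:
        raise ValueError("need at least one bank")
    return 1.0 - 1.0 / banks


for banks in (1, 2, 4):
    print(banks, "bank(s):", chance_no_contention(banks))
```

With 4 banks the model gives 0.75; with a single bank every access contends.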

If you're looking at 32 bit machines, stick to 2 gigs unless you will
regularly be accessing more than that, as going beyond 2 gigs invokes a 10
to 15% performance hit due to the goofy switching scheme used there.
Spend the money on more hard drives.

Look at the performance tuning article on varlena:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html



Re: Hardware for a database server

From
William Yu
Date:
Erwin Brandstetter wrote:
> CPU:
> Single AMD Opteron.
> Opteron, because I plan to migrate to amd64 Debian as soon as Debian
> has a stable release.
> Single, because multi-CPU would only make sense if the CPU could ever
> become the bottleneck. But I don't have to fear that, right? No need for a
> dual-CPU setup?
> Should I consider Athlon 64 FX or Athlon 64? I guess socket 940 has
> more future than socket 754, right?

Until Socket 939 is available, the A64FX is the same CPU as an Opteron
1xx. I'd definitely say stick with 940 for server CPUs because 754/939
does not support registered memory. For a server, you will want ECC and
registered because unbuffered memory not only takes a performance hit at
high densities but is also much more prone to errors and failure.

> Controller / Hard Discs:
> RAID 5 with 4+ discs including a hot spare. But SCSI or SATA?
> I am undecided on this. Until about a year ago, I would have said SCSI,
> period. But I have read of SATA RAIDs for entry-level-servers doing
> quite well and Linux dealing with it ever more smoothly. ([1], [2])
> So I wonder if it is still a good decision to spend 3 times the money
> per gigabyte on SCSI?

SATA still doesn't have TCQ which is a big big big deal for multi-user
databases. It has simple command queueing -- better than nothing -- but
random access will kill your performance. I know this from experience.

The other item is that you will have a hard time finding a SATA
controller with a battery backed cache. This allows you to safely turn
write caching on -- otherwise, a server crash can corrupt your database.

As for RAID5, RAID0+1 runs quite a bit faster but requires more disks
for the same amount of disk space.
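
To put numbers on the disk-count trade-off, here is a small sketch. It assumes equal-sized disks, RAID 5 spending one disk's worth of space on parity, and RAID 0+1 mirroring every data disk; the function name `disks_needed` is hypothetical.

```python
def disks_needed(level, data_disks, hot_spares=0):
    """Disks required to get `data_disks` worth of usable space."""
    if data_disks < 1:
        raise ValueError("need at least one disk of usable space")
    if level == "raid5":
        if data_disks < 2:
            raise ValueError("RAID 5 needs at least 3 disks in total")
        array = data_disks + 1   # one disk's worth of parity
    elif level == "raid0+1":
        array = 2 * data_disks   # every data disk is mirrored
    else:
        raise ValueError(f"unknown level: {level}")
    return array + hot_spares


for level in ("raid5", "raid0+1"):
    print(level, disks_needed(level, 3, hot_spares=1))
```

For three disks of usable space plus a hot spare, this gives 5 disks for RAID 5 and 7 for RAID 0+1.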

> And do 3ware Controllers still have the best driver support under
> Linux?

If you do go SATA, 3ware is pretty much the only choice. I've tried a
few other vendors and their drivers for Linux are absolutely horrid.
E.g. timeouts, crashes, kernel panics.

> Power supply:
> Secured with UPS, auto-shutdown before power fails, so do I need my
> RAID controller battery-backed still?

You'd need a redundant power supply -- however, you still would not be
protected from OS crashes. E.g. DB half-written, half in OS cache and
kernel locks up -- recipe for corruption.

> RAM:
> As much as the motherboard will bear. 4 GB probably. This seems the
> easiest point to decide on. Correct? DDR SDRAM PC333 or PC400?

Here's a possible reason to go with a 2x Opteron MB. They usually have double
the memory slots so you could go to 8GB of RAM (without paying the
astronomical amounts for 2GB/4GB DIMMs).

As for the memory type, it's gotta be ECC/registered. ECC/reg DDR333 is
readily available -- 400 is not at this time so expect to pay a premium.


Re: Hardware for a database server

From
Greg Spiegelberg
Date:
scott.marlowe wrote:
> On Wed, 10 Mar 2004, Erwin Brandstetter wrote:
>
>>Controller / Hard Discs:
>>RAID 5 with 4+ discs including a hot spare. But SCSI or SATA?
>>I am undecided on this. Until about a year ago, I would have said SCSI,
>>period. But I have read of SATA RAIDs for entry-level-servers doing
>>quite well and Linux dealing with it ever more smoothly.
>
>
> Standard IDE drives have an issue that all the ones I've tested so far,
> and presumably, most of the rest lie about fsync, and therefore are not
> guaranteed to have a coherent database on them should you lose power
> during a transaction.  If SATA drives in fact have proper fsyncing with
> write caching, then they're a great choice.  You might want to test one or
> two before committing to a rack of them.

I won't debate the issues with standalone/internal ATA type drives...
especially with Scott... however most RAID subsystems regardless of
the drive technology used are viable options and a SATA RAID subsystem
attached via fibre/SAN which has an internal battery backed cache and
additional external UPS should not be discounted due to the overall
shortcomings of ATA.  External RAID subsystems get around the many
issues and limitations of internal controllers such as those offered
by 3ware, Adaptec and LSI.

I believe this to the point where I have recommended and we here are
purchasing a Candera system that is SATA connected via fibre and we're
an EMC-Hitachi-IBM storage partner/reseller/consulting shop.


>>Power supply:
>>Secured with UPS, auto-shutdown before power fails, so do I need my
>>RAID controller battery-backed still?
>
> Yep.  Power supplies fail, motherboards fry and take out the power rail
> every so often.  Idiots trip over power cords.  hehe.  been there, done
> that, got the TShirt.

Double ditto.


--
Greg Spiegelberg
  Sr. Product Development Engineer
  Cranel, Incorporated.
  Phone: 614.318.4314
  Fax:   614.431.8388
  Email: gspiegelberg@Cranel.com
Cranel. Technology. Integrity. Focus.



Re: Hardware for a database server

From
Erwin Brandstetter
Date:
Yo Scott, Greg & William!

I tried to send the following message last week, but my posting never
got here. Guess my provider has fucked up. So here goes again, sorry
for late reply.

For knowledgeable feedback I have asked, and that's what I have got.
Thanx a lot. My thanks also to Jim Wilson who added some very good points
in c.d.p.general.

Taking it all into account, it has made me change my plans concerning CPU &
RAM: Dual-CPU (not single) with 2 GB of RAM (not 4) seems to be the
better solution to start with.

I will go for the slowest Opterons (2 x Opteron 240), which will serve
both performance & cost efficiency. They'll get 4 x 512 MB DDR SDRAM,
PC333 most likely, which should suffice for quite some time. That takes
the 2GB limit for 32bit OS into account and leaves room for upgrade to
4 GB later when I switch to AMD64 Linux (if more RAM should be needed
at all).

Only the storage gives me a hard time. I got a lot of input from you
all. Finally I decided to go for a simple SATA RAID 1 for the time
being. Two of the new Western Digital Raptor HDDs (10k SATA) might
suffice after all.
If that should turn out to be a bottleneck, I will add a SCSI RAID 5
system, while the OS stays on one of the SATA drives. The mainboard
TYAN Thunder K8S Pro provides a PCI-X slot ready for a SCSI controller
card.

My sympathy for PostgreSQL has proved right once more. You gave me
very profound advice. Thanx a lot!


Best Regards
Erwin Brandstetter