Discussion: Are we in the ballpark?


Are we in the ballpark?

From: Wayne Conrad
We're building a new database box.  With the help of Gregory Smith's
book, we're benchmarking the box: We want to know that we've set it up
right, we want numbers to go back to if we have trouble later, and we
want something to compare our _next_ box against.  What I'd like to know
is, are the performance numbers we're getting in the ballpark for the
class of hardware we've picked?

First, the setup:

CPU: Two AMD Opteron 6128 (Magny-Cours), 2000 MHz, eight cores each
RAM: DDR3-1333 64 GB (ECC)
RAID: 3Ware 9750 SAS2/SATA-II PCIe, 512 MB battery backed cache,
write-back caching enabled.
Drives: 16 Seagate ST3500414SS 500GB 7200RPM SAS, 16 MB cache:
   2 RAID1 ($PGDATA/xlog)
   12 RAID10 ($PGDATA)
   2 hot spare
PostgreSQL 8.4.1 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real
(Debian 4.3.4-2) 4.3.4, 64-bit
File system: XFS (nobarrier, noatime)
i/o scheduler: noop
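
For reference, we set the scheduler at runtime through sysfs; a sketch
(the device name is illustrative -- ours is the 3ware unit):

   # select the noop elevator for the array's block device
   echo noop > /sys/block/sda/queue/scheduler
   # confirm: the active scheduler is shown in brackets
   cat /sys/block/sda/queue/scheduler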

Database config (differences from stock that might affect performance):
shared_buffers = 8192MB
temp_buffers = 16MB
work_mem = 192MB
maintenance_work_mem = 5GB
wal_buffers = 8MB
checkpoint_segments = 64
checkpoint_completion_target = 0.9
random_page_cost = 1.0
constraint_exclusion = on

Now, the test results:

Memtest86+ says our memory bandwidth is:
   L1: 32,788 MB/s
   L2: 10,050 MB/s
   L3:  6,826 MB/s

Stream v5.9 says (MB/s):
   1 core: 4,320
   2 cores: 8,387
   4 cores: 15,840
   8 cores: 23,088
   16 cores: 24,286
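
A sketch of how we drove it, assuming an OpenMP build of the stream
binary (path illustrative); the Triad line is the usual headline
number:

   for n in 1 2 4 8 16; do
       OMP_NUM_THREADS=$n ./stream | grep Triad
   done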

Bonnie++ (-f -n 0 -c 4)
   $PGDATA/xlog (RAID1)
     random seek: 369/sec
     block out: 87 MB/sec
     block in: 180 MB/sec
   $PGDATA (RAID10, 12 drives)
     random seek: 452/sec
     block out: 439 MB/sec
     block in: 881 MB/sec
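
The invocation was along these lines; mount points are illustrative
(-f skips the slow per-character tests, -n 0 skips the file-creation
tests, -c 4 sets the concurrency):

   bonnie++ -d /pgxlog -f -n 0 -c 4
   bonnie++ -d /pgdata -f -n 0 -c 4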

sysbench test of fsync (commit) rate:

   $PGDATA/xlog (RAID1)
     cache off: 29 req/sec
     cache on:  9,342 req/sec
   $PGDATA (RAID10, 12 drives)
     cache off: 61 req/sec
     cache on: 8,191 req/sec
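
That test is sysbench's fileio mode with an fsync after every write; a
sketch with 0.4-era syntax (file size illustrative):

   sysbench --test=fileio --file-num=1 --file-total-size=16384 \
            --file-test-mode=rndwr --file-fsync-freq=1 prepare
   sysbench --test=fileio --file-num=1 --file-total-size=16384 \
            --file-test-mode=rndwr --file-fsync-freq=1 run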

pgbench-tools:

   Averages for test set 1 by scale:

   set   scale   tps     avg_latency   90%<    max_latency
   1     1       29141   0.248         0.342   5.453
   1     10      31467   0.263         0.361   7.148
   1     100     31081   0.265         0.363   7.843
   1     1000    29499   0.278         0.365   11.264

   Averages for test set 1 by clients:

   set   clients   tps     avg_latency   90%<    max_latency
   1     1         9527    0.102         0.105   1.5
   1     2         13850   0.14          0.195   5.316
   1     4         19148   0.19          0.251   2.228
   1     8         44101   0.179         0.248   2.557
   1     16        50311   0.315         0.381   11.057
   1     32        47765   0.666         0.989   24.076

We've used Brad Fitzpatrick's diskchecker script to show that the i/o
stack is telling the truth when it comes to fsync.
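
For the archives: diskchecker.pl wants a second machine so the log of
writes survives the power pull.  Roughly (host and file names
illustrative):

   # on another box:        diskchecker.pl -l
   # on the database box:   diskchecker.pl -s otherbox create test_file 500
   # cut power mid-run, reboot, then:
   diskchecker.pl -s otherbox verify test_file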

Are there any nails sticking up that we need to pound on before we start
more extensive (real-world-ish) testing with this box?

Re: Are we in the ballpark?

From: Greg Smith
Date: 02/01/11 18:30
Wayne Conrad wrote:
> We're building a new database box.  With the help of Gregory Smith's
> book, we're benchmarking the box: We want to know that we've set it up
> right, we want numbers to go back to if we have trouble later, and we
> want something to compare our _next_ box against.

Do you not want any excitement in your life?

> PostgreSQL 8.4.1 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.3.real
> (Debian 4.3.4-2) 4.3.4, 64-bit

8.4.7 is current; there are a lot of useful fixes to be had.  See if you
can get a newer Debian package installed before you go live with this.


> File system: XFS (nobarrier, noatime)

Should probably add "logbufs=8" in there too.
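
Something along these lines in /etc/fstab, with your actual device and
mount point (this line is a sketch, not your config):

   /dev/sdb1  /pgdata  xfs  noatime,nobarrier,logbufs=8  0  0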


> shared_buffers = 8192MB
> temp_buffers = 16MB
> work_mem = 192MB
> maintenance_work_mem = 5GB
> wal_buffers = 8MB
> checkpoint_segments = 64
> checkpoint_completion_target = 0.9
> random_page_cost = 1.0
> constraint_exclusion = on

That work_mem is a bit on the scary side of things, given how much
memory is allocated to other things.  Just be careful you don't get a
lot of connections and run out of server RAM.
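
To put numbers on it: work_mem is per sort or hash, per query, not per
connection, so 100 busy connections each running just one 192MB sort
could reach for ~19GB on top of your 8GB of shared_buffers.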

Might as well bump wal_buffers up to 16MB and be done with it.

Setting random_page_cost to 1.0 is essentially telling the server the
entire database is cached in RAM.  If that's not true, you don't want to
go quite that far in reducing it.

With 8.4, you should be able to keep constraint_exclusion at its default
of 'partition' and have that work as expected; any particular reason you
forced it to always be 'on'?

> Bonnie++ (-f -n 0 -c 4)
>   $PGDATA/xlog (RAID1)
>     random seek: 369/sec
>     block out: 87 MB/sec
>     block in: 180 MB/sec
>   $PGDATA (RAID10, 12 drives)
>     random seek: 452
>     block out: 439 MB/sec
>     block in: 881 MB/sec
>
> sysbench test of fsync (commit) rate:
>
>   $PGDATA/xlog (RAID1)
>     cache off: 29 req/sec
>     cache on:  9,342 req/sec
>   $PGDATA (RAID10, 12 drives)
>     cache off: 61 req/sec
>     cache on: 8,191 req/sec

That random seek rate is a little low for 12 drives, but that's probably
the limitations of the 3ware controller kicking in there.  Your "cache
off" figures are really weird though; I'd expect those both to be around
100.  Makes me wonder if something weird is happening in the controller,
or if there was a problem with your config when testing that.  Not a big
deal, really--the cached numbers are normally going to be the important
ones--but it is odd.

Your pgbench SELECT numbers look fine, but particularly given that
commit oddity here I'd recommend running some of the standard TPC-B-like
tests, too, just to be completely sure there's no problem here.  You
should get results that look like "Set 3:  Longer ext3 tests" in the set
I've published to http://www.2ndquadrant.us/pgbench-results/index.htm
presuming you let those run for 10 minutes or so.  The server those came
off of has less RAM and disks than yours, so you'll fit larger database
scales into memory before performance falls off, but that gives you
something to compare against.
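
If you end up scripting those runs outside of pgbench-tools, the bare
commands look roughly like this (scale and client count illustrative):

   pgbench -i -s 100 pgbench      # initialize; scale 100 is ~1.5GB
   pgbench -c 16 -T 600 pgbench   # TPC-B-like run, 16 clients, 10 minutes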

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: Are we in the ballpark?

From: Wayne Conrad
Date: Wed, Feb 02, 2011 10:06:53 -0700
Greg, it's so nice to get a reply from the author of *the book*.  Thank
you for taking the time to help us out.

On 02/01/11 18:30, Greg Smith wrote:
> Do you not want any excitement in your life?

I've had database excitement enough to last a lifetime.  That's why I'm
mending my ways.  Your book is the first step of our 12 step program.

> 8.4.7 is current; there are a lot of useful fixes to be had. See if you
> can get a newer Debian package installed before you go live with this.

I'll look for 8.4.7, but we'll be switching to 9 before too long.

>> File system: XFS (nobarrier, noatime)
>
> Should probably add "logbufs=8" in there too.

Will do.

>> work_mem = 192MB
>> wal_buffers = 8MB
>> random_page_cost = 1.0
>
> That work_mem is a bit on the scary side of things, given how much
> memory is allocated to other things. Just be careful you don't get a lot
> of connections and run out of server RAM.

That's a leftover from the days when we *really* didn't know what we
were doing (now we only *mostly* don't know what we're doing).  I'll
set work_mem down to something less scary.
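
One option we're considering: keep the global value modest and raise it
per-session for the few big reporting queries, something like this
(database and query are just for illustration):

   psql our_db <<'SQL'
   SET work_mem = '192MB';   -- this session only
   SELECT * FROM pgbench_accounts ORDER BY abalance;
   SQL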

> Might as well bump wal_buffers up to 16MB and be done with it.

Will do.

> Setting random_page_cost to 1.0 is essentially telling the server the
> entire database is cached in RAM. If that's not true, you don't want to
> go quite that far in reducing it.

Oops, that was a typo.  We've set random_page_cost to 2, not 1.

> With 8.4, you should be able to keep constraint_exclusion at its default
> of 'partition' and have that work as expected; any particular reason you
> forced it to always be 'on'?

See "we really didn't know what we were doing."  We'll leave
constraint_exclusion at its default.

>> Bonnie++ (-f -n 0 -c 4)
>> $PGDATA/xlog (RAID1)
>> random seek: 369/sec
>> block out: 87 MB/sec
>> block in: 180 MB/sec
>> $PGDATA (RAID10, 12 drives)
>> random seek: 452
>> block out: 439 MB/sec
>> block in: 881 MB/sec
>>
>> sysbench test of fsync (commit) rate:
>>
>> $PGDATA/xlog (RAID1)
>> cache off: 29 req/sec
>> cache on: 9,342 req/sec
>> $PGDATA (RAID10, 12 drives)
>> cache off: 61 req/sec
>> cache on: 8,191 req/sec
>
> That random seek rate is a little low for 12 drives, but that's probably
> the limitations of the 3ware controller kicking in there. Your "cache
> off" figures are really weird though; I'd expect those both to be around
> 100. Makes me wonder if something weird is happening in the controller,
> or if there was a problem with your config when testing that. Not a big
> deal, really--the cached numbers are normally going to be the important
> ones--but it is odd.

I also thought the "cache off" figures were odd.  I expected something
much closer to 120 req/sec (7200 rpm drives).  I probably won't
investigate that with any vigor, since the cache-on numbers are OK.

> Your pgbench SELECT numbers look fine, but particularly given that
> commit oddity here I'd recommend running some of the standard TPC-B-like
> tests, too, just to be completely sure there's no problem here. You
> should get results that look like "Set 3: Longer ext3 tests" in the set
> I've published to http://www.2ndquadrant.us/pgbench-results/index.htm
> presuming you let those run for 10 minutes or so. The server those came
> off of has less RAM and disks than yours, so you'll fit larger database
> scales into memory before performance falls off, but that gives you
> something to compare against.

TPC-B-like tests, will do.

Greg, Thanks a million.

     Wayne Conrad


Re: Are we in the ballpark?

From: John Rouillard
On Wed, Feb 02, 2011 at 10:06:53AM -0700, Wayne Conrad wrote:
> On 02/01/11 18:30, Greg Smith wrote:
> >>Bonnie++ (-f -n 0 -c 4)
> >>$PGDATA/xlog (RAID1)
> >>random seek: 369/sec
> >>block out: 87 MB/sec
> >>block in: 180 MB/sec
> >>$PGDATA (RAID10, 12 drives)
> >>random seek: 452
> >>block out: 439 MB/sec
> >>block in: 881 MB/sec
> >>
> >>sysbench test of fsync (commit) rate:
> >>
> >>$PGDATA/xlog (RAID1)
> >>cache off: 29 req/sec
> >>cache on: 9,342 req/sec
> >>$PGDATA (RAID10, 12 drives)
> >>cache off: 61 req/sec
> >>cache on: 8,191 req/sec
> >
> >That random seek rate is a little low for 12 drives, but that's probably
> >the limitations of the 3ware controller kicking in there. Your "cache
> >off" figures are really weird though; I'd expect those both to be around
> >100. Makes me wonder if something weird is happening in the controller,
> >or if there was a problem with your config when testing that. Not a big
> >deal, really--the cached numbers are normally going to be the important
> >ones--but it is odd.
>
> I also thought the "cache off" figures were odd.  I expected
> something much closer to 120 req/sec (7200 rpm drives).  I probably
> won't investigate that with any vigor, since the cache-on numbers
> are OK.

You may want to look into the "cache off" figures a little more.  We
run a number of battery-backed RAID controllers, and we test the
batteries every 6 months or so.  When we test the batteries, the cache
goes offline (as it should) to help keep the data valid.

If you need to test your RAID card batteries (nothing like finding you
have a battery with only a 6-hour runtime when your MTTR is a couple of
days), can your database app survive with that low a commit rate?  As
you said, you were expecting something almost 4-5x faster with 7200 RPM
disks.
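
On the 3ware cards you can at least eyeball the battery state between
full tests with the CLI (controller number illustrative):

   tw_cli /c0/bbu show all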

--
                -- rouilj

John Rouillard       System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111