Discussion: Benchmarking tools, methods

Benchmarking tools, methods

From
CSS
Date:
Hello,

I'm going to be testing some new hardware (see http://archives.postgresql.org/pgsql-performance/2011-11/msg00230.php) and while I've done some very rudimentary before/after tests with pgbench, I'm looking to pull more info than I have in the past, and I'd really like to automate things further.

I'll be starting with basic disk benchmarks (bonnie++ and iozone) and then moving on to pgbench.

I'm running FreeBSD and I'm interested in getting some baseline info on UFS2 single disk (SATA 7200/WD RE4), gmirror, zfs mirror, zfs raidz1, zfs set of two mirrors (ie: two mirrored vdevs in a mirror).  Then I'm repeating that with the 4 Intel 320 SSDs, and just to satisfy my curiosity, a zfs mirror with two of the SSDs mirrored as the ZIL.
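
To keep those runs repeatable I'm planning to script them rather than drive bonnie++ by hand.  Roughly what I have in mind, as a Python sketch - the mount points, test size and user are placeholders for whatever the box actually ends up with, and the flags are from memory:

#!/usr/bin/env python
"""Sketch: run bonnie++ once per filesystem layout and append the CSV
summary line it prints to a single results file."""

import datetime
import subprocess

# Placeholder mount points, one per layout under test.
LAYOUTS = {
    "ufs2-single":  "/bench/ufs2",
    "gmirror":      "/bench/gmirror",
    "zfs-mirror":   "/bench/zmirror",
    "zfs-raidz1":   "/bench/raidz1",
    "zfs-2xmirror": "/bench/zstripe",
}

def run_bonnie(label, path):
    # -q keeps stdout down to the machine-readable CSV line;
    # -s is the test file size in MB and should be ~2x RAM (64GB here).
    cmd = ["bonnie++", "-q", "-d", path, "-s", "65536", "-u", "nobody"]
    out = subprocess.check_output(cmd).decode().strip()
    return "%s,%s,%s" % (datetime.datetime.now().isoformat(), label, out)

if __name__ == "__main__":
    with open("bonnie-results.csv", "a") as f:
        for label, path in sorted(LAYOUTS.items()):
            f.write(run_bonnie(label, path) + "\n")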

Once that's narrowed down to a few practical choices, I'm moving on to pgbench.  I've found some good info here regarding pgbench that is unfortunately a bit dated:  http://www.westnet.com/~gsmith/content/postgresql/

A few questions:

-Any favorite automation or graphing tools beyond what's on Greg's site?
-Any detailed information on creating "custom" pgbench tests?
-Any other postgres benchmarking tools?

I'm also curious about benchmarking using my own data.  I tried something long ago that at least gave the illusion of working, but didn't seem quite right to me.  I enabled basic query logging on one of our busier servers, dumped the db, and let it run for 24 hours.  That gave me the normal random data from users throughout the day as well as our batch jobs that run overnight.  I had to grep out and reformat the actual queries from the logfile, but that was not difficult.  I then loaded the dump into the test server and basically fed the saved queries into it and timed the result.  I also hacked together a script to sample cpu and disk stats every 2S and had that feeding into an rrd database so I could see how "busy" things were.
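
For what it's worth, that sampler was nothing fancy - roughly the equivalent of this sketch, though here I've used psutil and a flat CSV file instead of rrdtool, so the details differ from what I actually ran:

#!/usr/bin/env python
"""Sketch of a 2-second cpu/disk sampler, using psutil and a plain CSV
file instead of rrdtool."""

import time
import psutil

INTERVAL = 2  # seconds between samples

def sample_forever(path="sysstats.csv"):
    with open(path, "a") as f:
        f.write("ts,cpu_pct,disk_read_bytes,disk_write_bytes\n")
        prev = psutil.disk_io_counters()
        while True:
            # cpu_percent() blocks for INTERVAL seconds and averages over it
            cpu = psutil.cpu_percent(interval=INTERVAL)
            cur = psutil.disk_io_counters()
            f.write("%d,%.1f,%d,%d\n" % (int(time.time()), cpu,
                                         cur.read_bytes - prev.read_bytes,
                                         cur.write_bytes - prev.write_bytes))
            f.flush()
            prev = cur

if __name__ == "__main__":
    sample_forever()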

In theory, this sounded good (to me), but I'm not sure I trust the results.  Any suggestions on the general concept?  Is it sound?  Is there a better way to do it?  I really like the idea of using (our) real data.

Lastly, any general suggestions on tools to collect system data during tests and graph it are more than welcome.  I can homebrew, but I'm sure I'd be reinventing the wheel.

Oh, and if anyone wants any tests run that would not take an insane amount of time and would be valuable to those on this list, please let me know.  Since SSDs have been a hot topic lately and not everyone has 4 SSDs lying around, I'd like to sort of focus on anything that would shed some light on the whole SSD craze.

The box under test ultimately will have 32GB RAM, 2 quad-core 2.13GHz Xeon 5506 CPUs and 4 Intel 320 160GB SSDs.  I'm recycling some older boxes as well, so I have much more RAM on hand until those are finished.

Thanks,

Charles

ps - considering the new PostgreSQL Performance book that Packt has, any strong feelings about that one way or the
other? Does it go very far beyond what's on the wiki? 

Re: Benchmarking tools, methods

From
"Tomas Vondra"
Date:
On 18 November 2011, 10:55, CSS wrote:
> Hello,
>
> I'm going to be testing some new hardware (see
> http://archives.postgresql.org/pgsql-performance/2011-11/msg00230.php) and
> while I've done some very rudimentary before/after tests with pgbench, I'm
> looking to pull more info than I have in the past, and I'd really like to
> automate things further.
>
> I'll be starting with basic disk benchmarks (bonnie++ and iozone) and then
> moving on to pgbench.
>
> I'm running FreeBSD and I'm interested in getting some baseline info on
> UFS2 single disk (SATA 7200/WD RE4), gmirror, zfs mirror, zfs raidz1, zfs
> set of two mirrors (ie: two mirrored vdevs in a mirror).  Then I'm
> repeating that with the 4 Intel 320 SSDs, and just to satisfy my
> curiosity, a zfs mirror with two of the SSDs mirrored as the ZIL.
>
> Once that's narrowed down to a few practical choices, I'm moving on to
> pgbench.  I've found some good info here regarding pgbench that is
> unfortunately a bit dated:
> http://www.westnet.com/~gsmith/content/postgresql/
>
> A few questions:
>
> -Any favorite automation or graphing tools beyond what's on Greg's site?

There are talks not listed on that westnet page - for example a recent
"Bottom-up Database Benchmarking" talk, available for example here:

   http://pgbr.postgresql.org.br/2011/palestras.php?id=60

It probably contains more recent info about benchmarking tools and testing
new hardware.

> -Any detailed information on creating "custom" pgbench tests?

The technical info at
http://www.postgresql.org/docs/9.1/interactive/pgbench.html should be
sufficient I guess, it's fairly simple. The most difficult thing is
determining what the script should do - what queries to execute etc. And
that depends on the application.
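
Just to illustrate the mechanics, here is a trivial sketch - a one-query custom script plus a small Python driver that runs it at a few client counts and pulls the tps figure out of pgbench's output.  The database name, the table and the query are only placeholders; a real test should use statements taken from your application:

#!/usr/bin/env python
"""Sketch: drive pgbench with a custom script at several client counts
and record the tps numbers."""

import re
import subprocess

DB = "bench"           # assumed test database, already populated with pgbench -i
SCRIPT = "custom.sql"  # custom script file written out below
DURATION = 300         # seconds per run

# A trivial custom script: pick a random aid and read one row.
# \setrandom is the pre-9.5 syntax; the range matches the default scale factor.
CUSTOM = """\\setrandom aid 1 100000
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
"""

def run(clients):
    with open(SCRIPT, "w") as f:
        f.write(CUSTOM)
    cmd = ["pgbench", "-n", "-f", SCRIPT, "-c", str(clients),
           "-j", str(min(clients, 8)), "-T", str(DURATION), DB]
    out = subprocess.check_output(cmd).decode()
    m = re.search(r"tps = ([0-9.]+)", out)
    return float(m.group(1)) if m else None

if __name__ == "__main__":
    for c in (1, 2, 4, 8, 16, 32):
        print("%d clients: %s tps" % (c, run(c)))

The interesting part is always the CUSTOM script - replace it with a mix of statements that resembles your workload and the driver stays the same.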

> -Any other postgres benchmarking tools?

Not really. pgbench is a nice stress-testing tool and the scripting is
quite flexible. I've done some TPC-H-like testing recently, but that's
more a bunch of scripts executed manually.

> I'm also curious about benchmarking using my own data.  I tried something
> long ago that at least gave the illusion of working, but didn't seem quite
> right to me.  I enabled basic query logging on one of our busier servers,
> dumped the db, and let it run for 24 hours.  That gave me the normal
> random data from users throughout the day as well as our batch jobs that
> run overnight.  I had to grep out and reformat the actual queries from the
> logfile, but that was not difficult.   I then loaded the dump into the
> test server and basically fed the saved queries into it and timed the
> result.  I also hacked together a script to sample cpu and disk stats
> every 2S and had that feeding into an rrd database so I could see how
> "busy" things were.
>
> In theory, this sounded good (to me), but I'm not sure I trust the
> results.  Any suggestions on the general concept?  Is it sound?  Is there
> a better way to do it?  I really like the idea of using (our) real data.

It's definitely a step in the right direction. An application-specific
benchmark is usually much more useful than a generic stress test. It
simply is going to tell you more about your workload and you can use it to
assess the capacity more precisely.

There are some issues though - mostly about transactions and locking. For
example if the client starts a transaction, locks a bunch of records and
then performs a time-consuming processing task outside the database, the
other clients may be locked. You won't see this during the stress test,
because in reality it looks like this

1) A: BEGIN
2) A: LOCK (table, row, ...)
3) A: perform something expensive
4) B: attempt to LOCK the same resource (blocks)
5) A: release the LOCK
6) B: obtains the LOCK and continues

but when replaying the workload, you'll see this

1) A: BEGIN
2) A: LOCK (table, row, ...)
3) B: attempt to LOCK the same resource (blocks)
4) A: release the LOCK
5) B: obtains the LOCK and continues

so B waits for a very short period of time (or not at all).
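
If you want to see the effect for yourself, here is a small sketch of the first sequence - two threads with psycopg2, where the sleep stands in for the expensive processing outside the database. The DSN and the table (t, with a column id) are made up:

#!/usr/bin/env python
"""Sketch: client A locks a row, does something slow outside the
database, and client B has to wait for the lock."""

import threading
import time
import psycopg2

DSN = "dbname=bench"

def client_a():
    conn = psycopg2.connect(DSN)   # psycopg2 opens a transaction implicitly (1)
    cur = conn.cursor()
    cur.execute("SELECT * FROM t WHERE id = 1 FOR UPDATE")  # (2) take the row lock
    time.sleep(10)                 # (3) expensive processing outside the database
    conn.commit()                  # (5) release the lock
    conn.close()

def client_b():
    time.sleep(1)                  # give A time to grab the lock first
    conn = psycopg2.connect(DSN)
    cur = conn.cursor()
    start = time.time()
    cur.execute("SELECT * FROM t WHERE id = 1 FOR UPDATE")  # (4) blocks on A's lock
    print("B waited %.1f s" % (time.time() - start))        # (6) B finally continues
    conn.commit()
    conn.close()

if __name__ == "__main__":
    a = threading.Thread(target=client_a)
    b = threading.Thread(target=client_b)
    a.start(); b.start()
    a.join(); b.join()

Run against a live table, B reports a wait of roughly nine seconds; replay just the two SELECT ... FOR UPDATE statements back to back and the wait all but disappears, which is exactly the distortion described above.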

To identify this problem, you'd have to actually behave like the client.
For example with a web application, you could use apache bench
(https://httpd.apache.org/docs/2.0/programs/ab.html) or something like
that.

> Lastly, any general suggestions on tools to collect system data during
> tests and graph it are more than welcome.  I can homebrew, but I'm sure
> I'd be reinventing the wheel.

System stats or database stats? There are plenty of tools for system stats
(e.g. sar). For database stats it's a bit more difficult - there's
pgwatch, pgstatspack and maybe some other tools (I've written pg_monitor).

Tomas



Re: Benchmarking tools, methods

From
Cédric Villemain
Date:
2011/11/18 Tomas Vondra <tv@fuzzy.cz>:
> On 18 November 2011, 10:55, CSS wrote:
>> Hello,
>>
>> I'm going to be testing some new hardware (see
>> http://archives.postgresql.org/pgsql-performance/2011-11/msg00230.php) and
>> while I've done some very rudimentary before/after tests with pgbench, I'm
>> looking to pull more info than I have in the past, and I'd really like to
>> automate things further.
>>
>> I'll be starting with basic disk benchmarks (bonnie++ and iozone) and then
>> moving on to pgbench.
>>
>> I'm running FreeBSD and I'm interested in getting some baseline info on
>> UFS2 single disk (SATA 7200/WD RE4), gmirror, zfs mirror, zfs raidz1, zfs
>> set of two mirrors (ie: two mirrored vdevs in a mirror).  Then I'm
>> repeating that with the 4 Intel 320 SSDs, and just to satisfy my
>> curiosity, a zfs mirror with two of the SSDs mirrored as the ZIL.
>>
>> Once that's narrowed down to a few practical choices, I'm moving on to
>> pgbench.  I've found some good info here regarding pgbench that is
>> unfortunately a bit dated:
>> http://www.westnet.com/~gsmith/content/postgresql/
>>
>> A few questions:
>>
>> -Any favorite automation or graphing tools beyond what's on Greg's site?
>
> There are talks not listed on that westnet page - for example a recent
> "Bottom-up Database Benchmarking" talk, available for example here:
>
>   http://pgbr.postgresql.org.br/2011/palestras.php?id=60
>
> It probably contains more recent info about benchmarking tools and testing
> new hardware.
>
>> -Any detailed information on creating "custom" pgbench tests?
>
> The technical info at
> http://www.postgresql.org/docs/9.1/interactive/pgbench.html should be
> sufficient I guess, it's fairly simple. The most difficult thing is
> determining what the script should do - what queries to execute etc. And
> that depends on the application.
>
>> -Any other postgres benchmarking tools?
>
> Not really. pgbench is a nice stress-testing tool and the scripting is
> quite flexible. I've done some TPC-H-like testing recently, but that's
> more a bunch of scripts executed manually.
>
>> I'm also curious about benchmarking using my own data.  I tried something
>> long ago that at least gave the illusion of working, but didn't seem quite
>> right to me.  I enabled basic query logging on one of our busier servers,
>> dumped the db, and let it run for 24 hours.  That gave me the normal
>> random data from users throughout the day as well as our batch jobs that
>> run overnight.  I had to grep out and reformat the actual queries from the
>> logfile, but that was not difficult.   I then loaded the dump into the
>> test server and basically fed the saved queries into it and timed the
>> result.  I also hacked together a script to sample cpu and disk stats
>> every 2S and had that feeding into an rrd database so I could see how
>> "busy" things were.
>>
>> In theory, this sounded good (to me), but I'm not sure I trust the
>> results.  Any suggestions on the general concept?  Is it sound?  Is there
>> a better way to do it?  I really like the idea of using (our) real data.
>
> It's definitely a step in the right direction. An application-specific
> benchmark is usually much more useful than a generic stress test. It
> simply is going to tell you more about your workload and you can use it to
> assess the capacity more precisely.
>
> There are some issues though - mostly about transactions and locking. For
> example if the client starts a transaction, locks a bunch of records and
> then performs a time-consuming processing task outside the database, the
> other clients may be locked. You won't see this during the stress test,
> because in reality it looks like this
>
> 1) A: BEGIN
> 2) A: LOCK (table, row, ...)
> 3) A: perform something expensive
> 4) B: attempt to LOCK the same resource (blocks)
> 5) A: release the LOCK
> 6) B: obtains the LOCK and continues
>
> but when replaying the workload, you'll see this
>
> 1) A: BEGIN
> 2) A: LOCK (table, row, ...)
> 3) B: attempt to LOCK the same resource (blocks)
> 4) A: release the LOCK
> 5) B: obtains the LOCK and continues
>
> so B waits for a very short period of time (or not at all).
>
> To identify this problem, you'd have to actually behave like the client.
> For example with a web application, you could use apache bench
> (https://httpd.apache.org/docs/2.0/programs/ab.html) or something like
> that.

I like Tsung: http://tsung.erlang-projects.org/
It is very efficient (you can achieve tens or hundreds of thousands of
connections per core), and you can script scenarios in XML (there is
also a SQL proxy to record sessions, and pgfouine can optionally build
Tsung scenarios from its parsed log).

You can add dynamic stuff in the XML (core functions provided by Tsung)
and also write your own Erlang modules to add complexity to your
scenarios.
--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

Re: Benchmarking tools, methods

From
Scott Marlowe
Date:
On Fri, Nov 18, 2011 at 2:55 AM, CSS <css@morefoo.com> wrote:

> ps - considering the new PostgreSQL Performance book that Packt has, any strong feelings about that one way or the other?  Does it go very far beyond what's on the wiki?

Since others have provided perfectly good answers to all your other
questions, I'll take this one.  The book is fantastic.  I was a
reviewer for it and had read it all before it was published, while it
was still in rough form.  Got a copy and read most of it all over
again.  It's a must-have for PostgreSQL production DBAs.

Re: Benchmarking tools, methods

From
Greg Smith
Date:
On 11/18/2011 04:55 AM, CSS wrote:
> I'm also curious about benchmarking using my own data.  I tried something long ago that at least gave the illusion of working, but didn't seem quite right to me.  I enabled basic query logging on one of our busier servers, dumped the db, and let it run for 24 hours.  That gave me the normal random data from users throughout the day as well as our batch jobs that run overnight.  I had to grep out and reformat the actual queries from the logfile, but that was not difficult.  I then loaded the dump into the test server and basically fed the saved queries into it and timed the result.  I also hacked together a script to sample cpu and disk stats every 2S and had that feeding into an rrd database so I could see how "busy" things were.
>
> In theory, this sounded good (to me), but I'm not sure I trust the results.  Any suggestions on the general concept?  Is it sound?  Is there a better way to do it?  I really like the idea of using (our) real data.
>

The thing that's hard to do here is replay the activity with the right
timing.  Some benchmarks, such as pgbench, will hit the database as fast
as it will process work.  That's not realistic.  You really need to
consider that real applications have pauses in them, and worry about
that both in playback speed and in results analysis.

See http://wiki.postgresql.org/wiki/Statement_Playback for some more
info on this.
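
As a very rough sketch of what timing-aware playback means: assuming you have already boiled the log down to one "epoch_seconds<TAB>query" pair per line (that preprocessing step is an assumption on my part), something like this issues each query at its original offset.  It still runs on a single connection, so the concurrency is lost, but at least the pacing is right:

#!/usr/bin/env python
"""Sketch of a timing-preserving replay: issue each logged query at the
same relative offset it had in the original log, instead of as fast as
possible.  The DSN is a placeholder."""

import sys
import time
import psycopg2

DSN = "dbname=bench"

def replay(logfile):
    conn = psycopg2.connect(DSN)
    conn.autocommit = True
    cur = conn.cursor()
    wall_start = time.time()
    first_ts = None
    for line in open(logfile):
        ts, query = line.rstrip("\n").split("\t", 1)
        ts = float(ts)
        if first_ts is None:
            first_ts = ts
        # Wait until this query's original offset from the start of the log.
        delay = (ts - first_ts) - (time.time() - wall_start)
        if delay > 0:
            time.sleep(delay)
        started = time.time()
        cur.execute(query)
        print("%.3f\t%s" % (time.time() - started, query[:60]))
    conn.close()

if __name__ == "__main__":
    replay(sys.argv[1])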

> ps - considering the new PostgreSQL Performance book that Packt has, any strong feelings about that one way or the other?  Does it go very far beyond what's on the wiki?
>

Pages 21 through 97 are about general benchmarking and hardware setup;
189 through 208 cover just pgbench.  There's almost no overlap between
those sections and the wiki, which is mainly focused on PostgreSQL usage
issues.  Unless you're much smarter than me, you can expect to spend
months to years reinventing wheels described there before reaching new
ground in the areas it covers.  From the questions you've been asking,
you may not find as much about ZFS tuning and SSDs as you'd like though.

http://www.2ndquadrant.com/en/talks/ has some updated material about
things discovered since the book was published.  The "Bottom-Up Database
Benchmarking" there shows the tests I'm running nowadays, which have
evolved a bit in the last year.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us


Re: Benchmarking tools, methods

From
CSS
Date:
On Nov 19, 2011, at 11:21 AM, Greg Smith wrote:

> On 11/18/2011 04:55 AM, CSS wrote:
>> I'm also curious about benchmarking using my own data.  I tried something long ago that at least gave the illusion of working, but didn't seem quite right to me.  I enabled basic query logging on one of our busier servers, dumped the db, and let it run for 24 hours.  That gave me the normal random data from users throughout the day as well as our batch jobs that run overnight.  I had to grep out and reformat the actual queries from the logfile, but that was not difficult.  I then loaded the dump into the test server and basically fed the saved queries into it and timed the result.  I also hacked together a script to sample cpu and disk stats every 2S and had that feeding into an rrd database so I could see how "busy" things were.
>>
>> In theory, this sounded good (to me), but I'm not sure I trust the results.  Any suggestions on the general concept?  Is it sound?  Is there a better way to do it?  I really like the idea of using (our) real data.
>>
>
> The thing that's hard to do here is replay the activity with the right timing.  Some benchmarks, such as pgbench, will hit the database as fast as it will process work.  That's not realistic.  You really need to consider that real applications have pauses in them, and worry about that both in playback speed and in results analysis.
>
> See http://wiki.postgresql.org/wiki/Statement_Playback for some more info on this.

Thanks so much for this, and thanks to Cédric for also pointing out Tsung specifically on that page.  I had no idea any of these tools existed.  I really like the idea of "application specific" testing; it makes total sense for the kind of things we're trying to measure.

I also wanted to thank everyone else that posted in this thread; all of this info is tremendously helpful.  This is a really excellent list, and I really appreciate all the people posting here that make their living doing paid consulting taking the time to monitor and post on this list.  Yet another way for me to validate choosing postgres over that "other" open source db.


>> ps - considering the new PostgreSQL Performance book that Packt has, any strong feelings about that one way or the other?  Does it go very far beyond what's on the wiki?
>>
>
> Pages 21 through 97 are about general benchmarking and hardware setup; 189 through 208 cover just pgbench.  There's almost no overlap between those sections and the wiki, which is mainly focused on PostgreSQL usage issues.  Unless you're much smarter than me, you can expect to spend months to years reinventing wheels described there before reaching new ground in the areas it covers.  From the questions you've been asking, you may not find as much about ZFS tuning and SSDs as you'd like though.

We're grabbing a copy of it for the office.  Packt is running a sale, so we're also going to grab the "cookbook"; it looks intriguing.

> http://www.2ndquadrant.com/en/talks/ has some updated material about things discovered since the book was published.  The "Bottom-Up Database Benchmarking" there shows the tests I'm running nowadays, which have evolved a bit in the last year.

Looks like good stuff, thanks.

Charles
