Thread: Postgres benchmarking with pgbench

Postgres benchmarking with pgbench

From:
"ml@bortal.de"
Date:
Hello List,

I would like to tune my Postgres setup. To make sure I don't have slow
hardware, I tested it in three different environments:
1.) Native Debian Linux (Dom0) and 4GB of RAM
2.) Debian Linux  in Xen (DomU)  4GB of RAM
3.) Blade with SSD Disk 8GB of RAM

Here are my results: http://i39.tinypic.com/24azpxg.jpg

Here is my postgres config: http://pastebin.com/m5e40dbf0

Here is my sysctl:
----------------------------------
kernel.shmmax = 3853361408
kernel.shmall = 99999999

Here is my pgbench benchmark script: http://pastebin.com/m676d0c1b
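
Roughly, the script just loops over scaling factors, re-initializes the test
database and runs pgbench at each step; a simplified sketch (host, user,
client count, transaction count and scale steps here are only placeholders,
the exact values are in the script above):

   #!/bin/sh
   # Simplified sketch of the benchmark loop; the real script is in the
   # pastebin link above.
   CLIENTS=8
   for s in 1 5 10 50 100 200 500; do
       # recreate the test database at this scale, then run the benchmark
       pgbench -i -s $s -U pgsql -h MYHOST benchmark
       pgbench -c $CLIENTS -t 1000 -U pgsql -h MYHOST benchmark
   done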



Here are my hardware details:
========================

On hardware 1 + 2 (native Debian and Debian in Xen) I have the following
hardware underneath:
--------------------------------
- ARC-1220 8-Port PCI-Express on RAID 6 with normal SATA drives, 7200RPM
- 1x Quadcore Intel(R) Xeon(R) CPU   E5320  @ 1.86GHz

The Blade:
----------------------------
- 2x 16GB SSD set up in striping mode (RAID 0)
- 2x Quadcore Intel(R) Xeon(R) CPU  E5420  @ 2.50GHz


Any idea why my performance collapses at 2GB database size?

Thanks,
Mario




Re: Postgres benchmarking with pgbench

From:
Greg Smith
Date:
On Mon, 16 Mar 2009, ml@bortal.de wrote:

> Any idea why my performance collapses at 2GB database size?

pgbench results follow a general curve I outlined at
http://www.westnet.com/~gsmith/content/postgresql/pgbench-scaling.htm and
the spot where performance drops hard depends on how big of a working set
of data you can hold in RAM.  (That shows a select-only test, which is why
the results are so much higher than yours; all the tests behave similarly as
far as the curve they trace.)

In your case, you've got shared_buffers=1GB, but the rest of the RAM in
the server isn't so useful to you because you've got checkpoint_segments
set to the default of 3.  That means your system is continuously doing
small checkpoints (check your database log files, you'll see what I
mean), which keeps things from ever really using much RAM before
everything has to get forced to disk.
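
If you want to see that explicitly, a couple of settings make the checkpoint
activity visible in the logs; for example:

   # postgresql.conf - make checkpoint activity visible
   log_checkpoints = on          # 8.3+: log every checkpoint with write/sync timing
   checkpoint_warning = 30s      # the default; warns whenever checkpoints come
                                 # closer together than this

With checkpoint_segments=3 under a pgbench write load, those warnings should
already be showing up constantly.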

Increase checkpoint_segments to at least 30, and bump your
transactions/client to at least 10,000 while you're at it--the 32000
transactions you're doing right now aren't nearly enough to get good
results from pgbench; 320K is in the right ballpark.  That might be enough
to push your TPS fall-off a bit closer to 4GB, and you'll certainly get
more useful results out of a longer test like that.  I'd suggest adding in
scaling factors of 25, 50, and 150; those should let you see the standard
pgbench curve more clearly.
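
In concrete terms that's something along these lines (the client count of 32
is only an example, inferred from your 32000-transaction runs; host, user and
database names are whatever your script already uses):

   # postgresql.conf
   checkpoint_segments = 30

   # then, for each scaling factor (e.g. 25, 50, 150):
   pgbench -i -s 50 -U pgsql -h MYHOST benchmark
   pgbench -c 32 -t 10000 -U pgsql -h MYHOST benchmark   # 32 x 10,000 = 320K transactions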

On this topic:  I'm actually doing a talk introducing pgbench use at
tonight's meeting of the Baltimore/Washington PUG, if any readers of this
list are in the area it should be informative:
http://archives.postgresql.org/bwpug/2009-03/msg00000.php and
http://omniti.com/is/here for directions.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Postgres benchmarking with pgbench

From:
Gregory Stark
Date:
Greg Smith <gsmith@gregsmith.com> writes:

> On Mon, 16 Mar 2009, ml@bortal.de wrote:
>
>> Any idea why my performance collapses at 2GB database size?

I don't understand how you get that graph from the data above. The data above
seems to show your test databases at 1.4GB and 2.9GB. There are no 1GB and 2GB
data points like the graphs show.

Presumably the data point at 2G on the graph should really be at 2.9GB? In
which case I don't find it surprising at all that performance would start to
shift from RAM-resident before that to disk-resident above that. You have 1GB
set aside for shared buffers leaving about 3GB for filesystem cache.

You could try setting shared buffers smaller (perhaps 512kB) or larger
(perhaps 3.5GB) to minimize the overlap. I would tend to avoid the latter
though.

One thing to realize is that pgbench performs a completely flat distribution
of data accesses, so every piece of data is equally likely to be used. In
real-life workloads there are usually some busier and some less busy sections
of the database, and the cache tends to keep the hotter data resident even as
the data set grows.
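
For reference, the row ids in pgbench's built-in TPC-B-like transaction are
all drawn uniformly, roughly like this (8.3-era table names):

   \set nbranches :scale
   \set ntellers 10 * :scale
   \set naccounts 100000 * :scale
   \setrandom aid 1 :naccounts
   \setrandom bid 1 :nbranches
   \setrandom tid 1 :ntellers
   \setrandom delta -5000 5000
   BEGIN;
   UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid;
   SELECT abalance FROM accounts WHERE aid = :aid;
   UPDATE tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
   UPDATE branches SET bbalance = bbalance + :delta WHERE bid = :bid;
   INSERT INTO history (tid, bid, aid, delta, mtime)
       VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
   END;

Every aid in accounts is equally likely to be touched, no matter how large
the table gets.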

> In your case, you've got shared_buffers=1GB, but the rest of the RAM in the
> server isn't so useful to you because you've got checkpoint_segments set to the
> default of 3.  That means your system is continuously doing small checkpoints
> (check your database log files, you'll see what I mean), which keeps things
> from ever really using much RAM before everything has to get forced to disk.

Why would checkpoints force out any data? A checkpoint writes out those dirty
pages and then syncs the files, marking them clean, but the data should still
live on in the filesystem cache.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

Re: Postgres benchmarking with pgbench

From:
Greg Smith
Date:
On Mon, 16 Mar 2009, Gregory Stark wrote:

> Why would checkpoints force out any data? A checkpoint writes out those dirty
> pages and then syncs the files, marking them clean, but the data should still
> live on in the filesystem cache.

The bulk of the buffer churn in pgbench is from the statement that updates
a row in the accounts table.  That constantly generates updated data block
and index block pages.  If you can keep those changes in RAM for a while
before forcing them to disk, you can get a lot of benefit from write
coalescing that goes away if constant checkpoints push things out with a
fsync behind them.

Not taking advantage of that effectively reduces the size of the OS cache,
because you end up with a lot of space holding pending writes that
wouldn't need to happen at all yet were the checkpoints spaced out better.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Postgres benchmarking with pgbench

From:
Gregory Stark
Date:
Greg Smith <gsmith@gregsmith.com> writes:

> On Mon, 16 Mar 2009, Gregory Stark wrote:
>
>> Why would checkpoints force out any data? A checkpoint writes out those dirty
>> pages and then syncs the files, marking them clean, but the data should still
>> live on in the filesystem cache.
>
> The bulk of the buffer churn in pgbench is from the statement that updates a
> row in the accounts table.  That constantly generates updated data block and
> index block pages.  If you can keep those changes in RAM for a while before
> forcing them to disk, you can get a lot of benefit from write coalescing that
> goes away if constant checkpoints push things out with a fsync behind them.
>
> Not taking advantage of that effectively reduces the size of the OS cache,
> because you end up with a lot of space holding pending writes that wouldn't
> need to happen at all yet were the checkpoints spaced out better.

Ok, so it's purely a question of write i/o, not reduced cache effectiveness.
I think I could see that. I would be curious to see these results with a
larger checkpoint_segments setting.

Looking further at the graphs I think they're broken, but not in the way I had
guessed. It looks like they're *overstating* the point at which the drop
occurs. Looking at the numbers it's clear that under 1GB it performs well, but
at 1.5GB it's already dropping to the disk-resident speed.

I think pgbench is just not that great a model for real-world usage: a) most
real-world workloads are limited by read traffic, not write traffic, and
certainly not random update write traffic; and b) most real-world workloads
follow a less uniform distribution, so keeping busy records and index regions
in memory is more effective.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

Re: Postgres benchmarking with pgbench

From:
Greg Smith
Date:
On Tue, 17 Mar 2009, Gregory Stark wrote:

> I think pgbench is just not that great a model for real-world usage

pgbench's default workload isn't a good model for anything.  It wasn't a
particularly real-world test when the TPC-B it's based on was created, and
that was way back in 1990.  And pgbench isn't even a good implementation
of that spec (the rows are too narrow, comments about that in pgbench.c).

At this point, the only good thing you can say about pgbench is that it's
been a useful tool for comparing successive releases of PostgreSQL in a
relatively fair way.  Basically, it measures what pgbench measures, and
that has only a loose relationship with what people want a database to do.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Postgres benchmarking with pgbench

From:
Scott Marlowe
Date:
On Mon, Mar 16, 2009 at 10:17 PM, Greg Smith <gsmith@gregsmith.com> wrote:
> On Tue, 17 Mar 2009, Gregory Stark wrote:
>
>> I think pgbench is just not that great a model for real-world usage
>
> pgbench's default workload isn't a good model for anything.  It wasn't a
> particularly real-world test when the TPC-B it's based on was created, and
> that was way back in 1990.  And pgbench isn't even a good implementation of
> that spec (the rows are too narrow, comments about that in pgbench.c).
>
> At this point, the only good thing you can say about pgbench is that it's
> been a useful tool for comparing successive releases of PostgreSQL in a
> relatively fair way.  Basically, it measures what pgbench measures, and that
> has only a loose relationship with what people want a database to do.

I'd say pgbench is best used as a negative test.  I.e., it can't tell you a
database server is gonna be fast, but it can usually tell you when something's
horrifically wrong.  If I just installed a new external storage array
of some kind and I'm getting 6 tps, something is wrong somewhere.

And it's good for exercising your disk farm for a week during burn-in.
It certainly turned up a bad RAID card last fall during acceptance
testing of our new servers.  Took 36 hours of pgbench to trip the bug and
cause the card to lock up.  Had one bad disk drive too that pgbench
killed off for me.

Re: Postgres benchmarking with pgbench

From:
"ml@bortal.de"
Date:
Hi Greg,

Thanks a lot for your hints. I changed my config and switched from RAID 6 to
RAID 10, but whatever I do, the benchmark breaks down at a scaling factor of
75, where the database is "only" 1126MB big.

Here are my benchmark results (scaling factor, DB size in MB, TPS) using:
   pgbench -S -c  X  -t 1000 -U pgsql -d benchmark -h MYHOST

1 19 8600
5 79 8743
10 154 8774
20 303 8479
30 453 8775
40 602 8093
50 752 6334
75 1126 3881
150 2247 2297
200 2994 701
250 3742 656
300 4489 596
400 5984 552
500 7479 513

I have no idea if this is any good for a Quadcore Intel(R) Xeon(R) CPU
E5320  @ 1.86GHz with 4GB RAM and 6 SATA disks (7200rpm) in RAID 10.

Here is my config (maybe with some odd setting):
http://pastebin.com/m5d7f5717

I played around with:
- max_connections
- shared_buffers
- work_mem
- maintenance_work_mem
- checkpoint_segments
- effective_cache_size

..but whatever I do, the graph looks the same. Any hints or tips on what my
config should look like? Or are these results even okay? Maybe I am
driving myself crazy for nothing?

Cheers,
Mario


Greg Smith wrote:
> On Mon, 16 Mar 2009, ml@bortal.de wrote:
>
>> Any idea why my performance collapses at 2GB database size?
>
> pgbench results follow a general curve I outlined at
> http://www.westnet.com/~gsmith/content/postgresql/pgbench-scaling.htm
> and the spot where performance drops hard depends on how big of a
> working set of data you can hold in RAM.  (That shows a select-only
> test which is why the results are so much higher than yours, all the
> tests work similarly as far as the curve they trace).
>
> In your case, you've got shared_buffers=1GB, but the rest of the RAM
> in the server isn't so useful to you because you've got
> checkpoint_segments set to the default of 3.  That means your system
> is continuously doing small checkpoints (check your database log
> files, you'll see what I mean), which keeps things from ever really
> using much RAM before everything has to get forced to disk.
>
> Increase checkpoint_segments to at least 30, and bump your
> transactions/client to at least 10,000 while you're at it--the 32000
> transactions you're doing right now aren't nearly enough to get good
> results from pgbench, 320K is in the right ballpark.  That might be
> enough to push your TPS fall-off a bit closer to 4GB, and you'll
> certainly get more useful results out of such a longer test.  I'd
> suggest adding in scaling factors of 25, 50, and 150, those should let
> you see the standard pgbench curve more clearly.
>
> On this topic:  I'm actually doing a talk introducing pgbench use at
> tonight's meeting of the Baltimore/Washington PUG, if any readers of
> this list are in the area it should be informative:
> http://archives.postgresql.org/bwpug/2009-03/msg00000.php and
> http://omniti.com/is/here for directions.
>
> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
>


Re: Postgres benchmarking with pgbench

From:
Scott Marlowe
Date:
On Thu, Mar 19, 2009 at 3:25 PM, ml@bortal.de <ml@bortal.de> wrote:
> Hi Greg,
>
> thanks a lot for your hints. I changed my config and changed raid6 to
> raid10, but whatever i do, the benchmark breaks down at a scaling factor 75
> where the database is "only" 1126MB big.
>
> Here are my benchmark Results (scaling factor, DB size in MB, TPS) using:
>  pgbench -S -c  X  -t 1000 -U pgsql -d benchmark -h MYHOST

-t 1000 is WAY too short to judge; you'll be seeing a lot of caching
effects and no WAL flushing.  Try a setting that gets you a run of at
least 5 or 10 minutes, preferably half an hour, for more realistic
results.  Also, what is -c X ???  Are you following the -c with the
same scaling factor that you used to create the test db?  And why the
select-only (-S)???

> 1 19 8600
> 5 79 8743
> 10 154 8774
> 20 303 8479
> 30 453 8775
> 40 602 8093
> 50 752 6334
> 75 1126 3881
> 150 2247 2297
> 200 2994 701
> 250 3742 656
> 300 4489 596
> 400 5984 552
> 500 7479 513
>
> I have no idea if this is any good for a QuardCore Intel(R) Xeon(R) CPU
>  E5320  @ 1.86GHz with 4GB Ram and 6 SATA disk (7200rpm) in raid 10.
>
> Here is my config (maybe with some odd setting):
> http://pastebin.com/m5d7f5717
>
> I played around with:
> - max_connections
> - shared_buffers

You've got this set to 1/2 your memory (2G).  I've found that for
transactional work it's almost always better to set this much lower
and let the OS do the caching, especially once your db is too big to
fit in memory.  Try lowering it and see what happens to your
performance envelope.

> - work_mem
> - maintenance_work_mem
> - checkpoint_segments
> - effective_cache_size

This is set to 3G, but with shared_buffers set to 2G the OS can't cache more
than about 2G, and part of that will just be the same stuff pgsql is already
caching.  No biggie, just not realistic.

> ..but whatever i do, the graph looks the same. Any hints or tips what my
> config should look like? Or are these results even okay? Maybe i am driving
> myself crazy for nothing?

Could be.   What do top and vmstat say during your test run?
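
For example, in another terminal while the benchmark runs:

   vmstat 5
   # watch the wa column (CPU time spent waiting on i/o) and bi/bo (blocks
   # read from / written to disk); if wa stays high while tps drops, the
   # disks rather than the CPUs are the limit.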

Re: Postgres benchmarking with pgbench

From:
Scott Carey
Date:
On 3/19/09 2:25 PM, "ml@bortal.de" <ml@bortal.de> wrote:

>
> Here is my config (maybe with some odd setting):
> http://pastebin.com/m5d7f5717
>
> I played around with:
> - max_connections
> - shared_buffers
> - work_mem
> - maintenance_work_mem
> - checkpoint_segments
> - effective_cache_size
>
> ..but whatever i do, the graph looks the same. Any hints or tips what my
> config should look like? Or are these results even okay? Maybe i am
> driving myself crazy for nothing?
>
> Cheers,
> Mario
>

I'm assuming this is Linux: what Linux version?  What are your kernel's
vm.dirty_ratio and vm.dirty_background_ratio?

The defaults for a long time were 40 and 10, respectively.  That is far too
large for most uses on today's servers; you would not want 40% of your RAM
to have pages not yet flushed to disk, except perhaps on a small workstation.
Current kernels default to 10 and 5, which is better.

What is best for your real-life workload will differ from pg_bench here.
I don't know if this is the cause of any of your problems, but it is
closely related to the checkpoint_segments and checkpoint size/time
configuration.
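
You can check the current values and experiment like this (10/5 are just the
newer kernel defaults, not a recommendation for your workload):

   sysctl vm.dirty_ratio vm.dirty_background_ratio   # show current values
   sysctl -w vm.dirty_ratio=10
   sysctl -w vm.dirty_background_ratio=5
   # add the settings to /etc/sysctl.conf to make them survive a reboot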

Is your xlog on the same device as the data?  I have found that for most
real world workloads, having the xlog on a separate device helps
tremendously.  Even more so for 'poor' RAID controllers like the PERC5 --
your sync writes in xlog will be interfering with the RAID controller cache
of your data due to bad design.
But my pg_bench knowledge with respect to this is limited.
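
If you want to try it, the usual approach is simply to move pg_xlog onto the
other device and symlink it back (a rough sketch; the paths are placeholders
and the server must be stopped first):

   pg_ctl -D /path/to/data stop
   mv /path/to/data/pg_xlog /other/disk/pg_xlog
   ln -s /other/disk/pg_xlog /path/to/data/pg_xlog
   pg_ctl -D /path/to/data start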


Re: Postgres benchmarking with pgbench

From:
"Jignesh K. Shah"
Date:

ml@bortal.de wrote:
> Hi Greg,
>
> thanks a lot for your hints. I changed my config and changed raid6 to
> raid10, but whatever i do, the benchmark breaks down at a scaling
> factor 75 where the database is "only" 1126MB big.
>
> Here are my benchmark Results (scaling factor, DB size in MB, TPS) using:
>   pgbench -S -c  X  -t 1000 -U pgsql -d benchmark -h MYHOST
>
> 1 19 8600
> 5 79 8743
> 10 154 8774
> 20 303 8479
> 30 453 8775
> 40 602 8093
> 50 752 6334
> 75 1126 3881
> 150 2247 2297
> 200 2994 701
> 250 3742 656
> 300 4489 596
> 400 5984 552
> 500 7479 513
>
> I have no idea if this is any good for a QuardCore Intel(R) Xeon(R)
> CPU  E5320  @ 1.86GHz with 4GB Ram and 6 SATA disk (7200rpm) in raid 10.
>
> Here is my config (maybe with some odd setting):
> http://pastebin.com/m5d7f5717
>
> I played around with:
> - max_connections
> - shared_buffers
> - work_mem
> - maintenance_work_mem
> - checkpoint_segments
> - effective_cache_size
>
> ..but whatever i do, the graph looks the same. Any hints or tips what
> my config should look like? Or are these results even okay? Maybe i am
> driving myself crazy for nothing?
>
Are you running the pgbench client from a different system? Did you
check whether the pgbench client itself is bottlenecked or not? I have
seen before that the pgbench client itself is severely limited in the
load it can drive and process.
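
A quick way to check is to watch the pgbench process on the client machine
while the test runs, for example:

   top -b -n 1 | grep pgbench   # if pgbench sits near 100% of a CPU,
                                # the client is the limit, not the server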


-Jignesh