Discussion: Postgres Benchmark Results


Postgres Benchmark Results

From: PFC
Date:
    I felt the world needed a new benchmark ;)
    So: a forum-style benchmark, simulating many users posting to and
viewing forums and topics on a PHP website.

    http://home.peufeu.com/ftsbench/forum1.png

    One of those curves is "a very popular open-source database which claims
to offer unparallelled speed".
    The other one is of course Postgres 8.2.3, which by popular belief is
"full-featured but slow".

    What is your guess?

Re: Postgres Benchmark Results

From: Arjen van der Meijden
Date:
I assume red is PostgreSQL and green is MySQL. That reflects my own
benchmarks with those two.

But I don't fully understand what the graph displays. Does it reflect
the ability of the underlying database to support a certain amount of
users per second given a certain database size? Or is the growth of the
database part of the benchmark?

Btw, did you consider that older topics are normally read much less and
almost never get new postings? I think the size of the "active data set"
is more dependent on the amount of active members than on the actual
amount of data available.
That can greatly reduce the impact of the size of the database, although
we saw very nice gains in performance on our forum (over 22GB of
messages) when we replaced the database server with one with twice the
memory, CPUs and I/O.

Best regards,

Arjen

On 20-5-2007 16:58 PFC wrote:
>
>     I felt the world needed a new benchmark ;)
>     So : Forum style benchmark with simulation of many users posting and
> viewing forums and topics on a PHP website.
>
>     http://home.peufeu.com/ftsbench/forum1.png
>
>     One of those curves is "a very popular open-source database which
> claims to offer unparallelled speed".
>     The other one is of course Postgres 8.2.3 which by popular belief is
> "full-featured but slow"
>
>     What is your guess ?
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

Re: Postgres Benchmark Results

From: PFC
Date:
> I assume red is PostgreSQL and green is MySQL. That reflects my own
> benchmarks with those two.

    Well, since you answered first, and right, you win XD

    The little curve that dives into the ground is MySQL with InnoDB.
    The Energizer bunny that keeps going is Postgres.

> But I don't fully understand what the graph displays. Does it reflect
> the ability of the underlying database to support a certain amount of
> users per second given a certain database size? Or is the growth of the
> database part of the benchmark?

    Basically, I have a test client which simulates a certain number of
concurrent users browsing a forum and posting (the posting rate is
artificially high, in order to fill the tables quicker than the months it
would take in real life).

    Since the fake users pick which topics to view and post in by browsing
the pages, like people would do, it tends to pick the topics in the first
few pages of the forum, those with the most recent posts. So, like in real
life, some topics fall through the first pages, and go down to rot at the
bottom, while others grow much more.
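
The browsing behaviour described above can be sketched roughly like this (a minimal illustration, not the author's actual client code; the function name, page sizes, and the 90/10 split are all assumptions):

```python
import random

def pick_topic(topics_by_recency, front_pages=2, page_size=30, p_front=0.9):
    """Pick a topic the way a simulated forum user would.

    topics_by_recency: topic ids ordered from most to least recently active.
    Most of the time the user browses only the first few pages, so recently
    active topics keep getting views and posts while old ones sink and rot.
    """
    front = topics_by_recency[:front_pages * page_size]
    if front and random.random() < p_front:
        return random.choice(front)          # a "hot" topic from the first pages
    return random.choice(topics_by_recency)  # occasionally dig up an old one
```

Run many of these pickers concurrently and the view/post distribution skews toward recent topics, just as described.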

    So, the database size is on the X axis, and the total number of webpages
served per second (viewings + postings) is on the Y axis, representing the
user's experience (fast / slow / dead server).

    The number of concurrent HTTP or Postgres connections is not plotted; it
doesn't really matter for benchmarking purposes anyway. You need enough to
keep the server busy, but not too many or you're just wasting RAM. For a
LAN that's about 30 HTTP connections and about 8 PHP processes, each with a
database connection.
    Since I use lighttpd, I don't really care about the number of actual slow
clients (i.e. real concurrent HTTP connections). Everything is funneled
through those 8 PHP processes, so postgres never sees huge concurrency.
    About 2/3 of the CPU is used by PHP anyway, only 1/3 by Postgres ;)

> Btw, did you consider that older topics are normally read much less and
> almost never get new postings? I think the size of the "active data set"
> is more dependent on the amount of active members than on the actual
> amount of data available.

    Yes, see above.
    The posts table is clustered on (topic_id, post_id) and this is key to
performance.

> That can reduce the impact of the size of the database greatly, although
> we saw very nice gains in performance on our forum (over 22GB of
> messages) when replacing the databaseserver with one with twice the
> memory, cpu's and I/O.

    Well, you can see on the curve when it hits IO-bound behaviour.

    I'm writing a full report, but I'm having a lot of problems with MySQL,
I'd like to give it a fair chance, but it shows real obstination in NOT
working.


Re: Postgres Benchmark Results

From: Arjen van der Meijden
Date:
On 20-5-2007 19:09 PFC wrote:
>     Since I use lighttpd, I don't really care about the number of actual
> slow clients (ie. real concurrent HTTP connections). Everything is
> funneled through those 8 PHP processes, so postgres never sees huge
> concurrency.

Well, that would only be in favour of postgres anyway; it scales better in
our benchmarks to multiple CPUs and multiple clients, and apparently in
yours to larger datasets. MySQL seems to be faster up until a certain
number of concurrent clients (close to the number of CPUs available), and
beyond that it can collapse dramatically.

>     I'm writing a full report, but I'm having a lot of problems with
> MySQL, I'd like to give it a fair chance, but it shows real obstination
> in NOT working.

Yeah, it displayed very odd behaviour when doing benchmarks here too. If
you haven't already, you can try the newest 5.0 version (5.0.41?), which
eliminates several scaling issues in InnoDB, but afaik not all of them.
Besides that, it can just be pretty painful to get a certain query fast,
although we've not very often seen it fail completely in the last few
years.

Best regards,

Arjen van der Meijden

Re: Postgres Benchmark Results

From: Tom Lane
Date:
PFC <lists@peufeu.com> writes:
>     The little curve that dives into the ground is MySQL with InnoDB.
>     The Energizer bunny that keeps going is Postgres.

Just for comparison's sake it would be interesting to see a curve for
mysql/myisam.  Mysql's claim to speed is mostly based on measurements
taken with myisam tables, but I think that doesn't hold up very well
under concurrent load.

            regards, tom lane

Re: Postgres Benchmark Results

From: Zoltan Boszormenyi
Date:
PFC wrote:
>
>     I felt the world needed a new benchmark ;)
>     So : Forum style benchmark with simulation of many users posting
> and viewing forums and topics on a PHP website.
>
>     http://home.peufeu.com/ftsbench/forum1.png
>
>     One of those curves is "a very popular open-source database which
> claims to offer unparallelled speed".
>     The other one is of course Postgres 8.2.3 which by popular belief
> is "full-featured but slow"
>
>     What is your guess ?

Red is PostgreSQL.

The advertised "unparallelled speed" must surely mean
benchmarking only single-client access on the noname DB. ;-)

I also went into benchmarking mode last night for my own
amusement when I read on the linux-kernel ML that
NCQ support for nForce5 chips was released.
I tried current PostgreSQL 8.3devel CVS.
pgbench over local TCP connection with
25 clients and 3000 transacts/client gave me
around 445 tps before applying NCQ support.
680 tps after.

It went over 840 tps after adding HOT v7 patch,
still with 25 clients. It topped at 1062 tps with 3-4 clients.
I used a single Seagate 320GB SATA2 drive
for the test, which has less than 40GB free,
so it's already at the end of the disk, giving smaller
transfer rates than at the beginning. Filesystem is ext3.
Dual-core Athlon64 X2 4200 in 64-bit mode.
I have never seen such performance before
on a desktop machine.

--
----------------------------------
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/


Re: Postgres Benchmark Results

From: PFC
Date:
On Sun, 20 May 2007 19:26:38 +0200, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> PFC <lists@peufeu.com> writes:
>>     The little curve that dives into the ground is MySQL with InnoDB.
>>     The Energizer bunny that keeps going is Postgres.
>
> Just for comparison's sake it would be interesting to see a curve for
> mysql/myisam.  Mysql's claim to speed is mostly based on measurements
> taken with myisam tables, but I think that doesn't hold up very well
> under concurrent load.
>
>             regards, tom lane


    I'm doing that now. Here is what I wrote in the report:

    Using prepared statements (important), Postgres beats MyISAM on "simple
selects", as they say, as well as on complex selects, even with 1 thread.

    MyISAM caused massive data corruption: posts and topics disappear,
storage engine errors pop up, random trashed rows appear in the forums
table (screwing up everything), etc. In short: it doesn't work. But, since
no one in their right mind would use MyISAM for critical data, I include
this result anyway, as a curiosity.

    I had to write a repair SQL script to fix the corruption in order to see
how MySQL will fare when it gets bigger than RAM...


Re: Postgres Benchmark Results

From: "Andreas Kostyrka"
Date:
>    I'm writing a full report, but I'm having a
> lot of problems with MySQL,
> I'd like to give it a fair chance, but it shows
> real obstination in NOT
> working.

Well, that matches up well with my experience; better yet, file a
performance bug with the commercial support and you'll get an explanation
why your schema (or your hardware, well, anything but the database
software used) is the guilty factor.

But you know these IT manager journals consider MySQL the relevant
open-source database. Guess it matches their expectations better than PG
or, say, MaxDB (the artist formerly known as SAP DB).

Andreas


---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq


Re: Postgres Benchmark Results

From: "Jim C. Nasby"
Date:
On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
> I also went into benchmarking mode last night for my own
> amusement when I read on the linux-kernel ML that
> NCQ support for nForce5 chips was released.
> I tried current PostgreSQL 8.3devel CVS.
> pgbench over local TCP connection with
> 25 clients and 3000 transacts/client gave me
> around 445 tps before applying NCQ support.
> 680 tps after.
>
> It went over 840 tps after adding HOT v7 patch,
> still with 25 clients. It topped at 1062 tps with 3-4 clients.
> I used a single Seagate 320GB SATA2 drive
> for the test, which only has less than 40GB free.
> So it's already at the end of the disk giving smaller
> transfer rates then at the beginning. Filesystem is ext3.
> Dual core Athlon64 X2 4200 in 64-bit mode.
> I have never seen such a performance before
> on a desktop machine.

I'd be willing to bet money that the drive is lying about commits/fsync.
Each transaction committed essentially requires one revolution of the
drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.
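
The rotational limit Jim describes is simple arithmetic: one fsync'd commit per platter revolution caps TPS at RPM/60 (a back-of-the-envelope sketch; the function name is mine):

```python
def max_commit_tps(rpm: float) -> float:
    # With pg_xlog on a single spinning disk and an honest fsync,
    # each commit waits for (at most) one full platter revolution,
    # so commits/second cannot exceed revolutions/second.
    return rpm / 60.0
```

A 7200 RPM desktop drive tops out around 120 TPS and a 15kRPM drive around 250, so 1062 TPS on a single SATA drive strongly suggests a write cache is absorbing the fsyncs.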

BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
data=writeback. Note that doing that probably has a negative impact on
data recovery after a crash for non-database files.
--
Jim Nasby                                      decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: Postgres Benchmark Results

From: "Jim C. Nasby"
Date:
On Sun, May 20, 2007 at 04:58:45PM +0200, PFC wrote:
>
>     I felt the world needed a new benchmark ;)
>     So : Forum style benchmark with simulation of many users posting and
> viewing forums and topics on a PHP website.
>
>     http://home.peufeu.com/ftsbench/forum1.png

Any chance of publishing your benchmark code so others can do testing?
It sounds like a useful, well-thought-out benchmark (even if it is
rather specialized).

Also, I think it's important for you to track how long it takes to
respond to requests, both average and maximum. In a web application no
one's going to care if you're doing 1000TPS if it means that every time
you click on something it takes 15 seconds to get the next page back.
With network round-trip times and what-not considered I'd say you don't
want it to take any more than 200-500ms between when a request hits a
webserver and when the last bit of data has gone back to the client.
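
Tracking latency as suggested here needs nothing fancy; a minimal sketch using only the standard library (names are mine, not part of the benchmark):

```python
import statistics

def latency_summary(samples_ms):
    """Summarize per-request latencies (milliseconds) from a benchmark run."""
    ordered = sorted(samples_ms)
    p95_index = max(0, int(0.95 * len(ordered)) - 1)
    return {
        "avg_ms": statistics.mean(ordered),
        "max_ms": ordered[-1],
        "p95_ms": ordered[p95_index],  # rough 95th percentile
    }
```

Reporting the 95th percentile alongside average and maximum catches exactly the "1000 TPS but 15 seconds per click" pathology described above.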

I'm guessing that there's about 600MB of memory available for disk
caching? (Well, 600MB minus whatever shared_buffers is set to).
--
Jim Nasby                                      decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

Re: Postgres Benchmark Results

From: PFC
Date:
> Well that matches up well with my experience, better even yet, file a
> performance bug to the commercial support and you'll get an explanation
> why your schema (or your hardware, well anything but the database
> software used) is the guilty factor.

    Yeah, I filed a bug last week, since REPEATABLE READ isn't repeatable: it
works for SELECT, but INSERT INTO ... SELECT switches to READ COMMITTED and
thus does not insert the same rows that the same SELECT would have
returned.
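
The reported anomaly, illustrated (a reconstruction from the bug description above, not the actual filed test case; the table names are invented):

```sql
-- Under REPEATABLE READ, both statements should see the same snapshot.
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION;
SELECT * FROM posts WHERE topic_id = 1;         -- sees snapshot A
INSERT INTO posts_archive
    SELECT * FROM posts WHERE topic_id = 1;     -- reportedly runs with READ
                                                -- COMMITTED semantics, copying
                                                -- rows the SELECT never saw
COMMIT;
```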

> but you know these IT manager journals consider mysql as the relevant
> opensource database. Guess it matches better with their expection than
> PG or say MaxDB (the artist known formerly as Sap DB).

    Never tried MaxDB.

    So far, my MyISAM benchmarks show that, while Postgres is faster in the
CPU-limited case (even on small simple selects), when the dataset grows
larger, MyISAM keeps going much better than Postgres. That was to be
expected, since its tables are more compact, it can read indexes without
hitting the tables, and of course it doesn't have transaction overhead.

    However, these good results are somewhat mitigated by the massive data
corruption and complete mayhem that ensues, either from "transactions"
aborting mid-way (which obviously can't be rolled back), leaving stuff with
broken relations, or from plain simple engine bugs which replace your data
with crap. After about half an hour of hitting the tables hard, they start
to corrupt and you get cryptic error messages. Fortunately, "REPAIR TABLE"
provides good consolation by telling you how much corrupt data it had to
erase from your table... really reassuring!

    I believe the following current or future Postgres features will provide
an interesting answer to MyISAM:

    - The fact that it doesn't corrupt your data, duh.
    - HOT
    - the new non-logged tables
    - Deferred Transactions: since adding a comment to a blog post doesn't
need the same guarantees as submitting a paid order, it makes sense that
the application could tell postgres which transactions we care about if
power is lost. This will massively boost performance for websites, I
believe.
    - the patch that keeps tables in approximate cluster order
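
The "deferred transactions" idea can be sketched as a per-transaction setting (hypothetical at the time of this thread; PostgreSQL 8.3 later shipped essentially this as synchronous_commit, and the table here is invented):

```sql
BEGIN;
-- Tell the server this particular commit need not survive a power failure:
SET LOCAL synchronous_commit TO off;
INSERT INTO comments (post_id, body) VALUES (42, 'nice post!');
COMMIT;  -- returns without waiting for the WAL fsync
```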

    By the way, regarding ALTER TABLE SET PERSISTENCE ... for non-logged
tables: will we get an ON RECOVER trigger?
    For instance, I have counts tables that are often updated by triggers. On
recovery, I could simply re-create the counts from the actual data, so I
could use the extra speed of non-crash-proof tables.

>
> Andreas
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faq
>



Re: Postgres Benchmark Results

From: PFC
Date:
On Mon, 21 May 2007 23:05:22 +0200, Jim C. Nasby <decibel@decibel.org>
wrote:

> On Sun, May 20, 2007 at 04:58:45PM +0200, PFC wrote:
>>
>>     I felt the world needed a new benchmark ;)
>>     So : Forum style benchmark with simulation of many users posting and
>> viewing forums and topics on a PHP website.
>>
>>     http://home.peufeu.com/ftsbench/forum1.png
>
> Any chance of publishing your benchmark code so others can do testing?
> It sounds like a useful, well-thought-out benchmark (even if it is
> rather specialized).

    Yes, that was the intent from the start.
    It is specialized, because forums are one of the famous server killers.
This is mostly due to bad database design, bad PHP skills, and the
horrendous MySQL FULLTEXT.
    I'll have to clean up the code and document it for public consumption,
though.
    However, the Python client is too slow. It saturates at about 1000 hits/s
on an Athlon 64 3000+, so you can forget about benchmarking anything meaner
than a Core 2 Duo.

> Also, I think it's important for you to track how long it takes to
> respond to requests, both average and maximum. In a web application no
> one's going to care if you're doing 1000TPS if it means that every time
> you click on something it takes 15 seconds to get the next page back.
> With network round-trip times and what-not considered I'd say you don't
> want it to take any more than 200-500ms between when a request hits a
> webserver and when the last bit of data has gone back to the client.

    Yeah, I will do that too.

> I'm guessing that there's about 600MB of memory available for disk
> caching? (Well, 600MB minus whatever shared_buffers is set to).

    It's about that. The machine has 1 GB of RAM.



Re: Postgres Benchmark Results

From: Guido Neitzer
Date:
On 21.05.2007, at 15:01, Jim C. Nasby wrote:

> I'd be willing to bet money that the drive is lying about commits/
> fsync.
> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.

Yes, that's right, but if a lot of the transactions are selects, there
is no entry in the xlog for them and most of the stuff can come from
the cache - read from memory, which is blazing fast compared to any
disk... And this was a pgbench test - I don't know what the
benchmark really does, but if I remember correctly it is mostly reading.

cug



Re: Postgres Benchmark Results

From: Rich
Date:
I assume red is PostgreSQL. As you add connections, MySQL always dies.

On 5/20/07, PFC <lists@peufeu.com> wrote:

        I felt the world needed a new benchmark ;)
        So : Forum style benchmark with simulation of many users posting and
viewing forums and topics on a PHP website.

        http://home.peufeu.com/ftsbench/forum1.png

        One of those curves is "a very popular open-source database which claims
to offer unparallelled speed".
        The other one is of course Postgres 8.2.3 which by popular belief is
"full-featured but slow"

        What is your guess ?


Re: Postgres Benchmark Results

From: Scott Marlowe
Date:
Jim C. Nasby wrote:
> On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
>
>> I also went into benchmarking mode last night for my own
>> amusement when I read on the linux-kernel ML that
>> NCQ support for nForce5 chips was released.
>> I tried current PostgreSQL 8.3devel CVS.
>> pgbench over local TCP connection with
>> 25 clients and 3000 transacts/client gave me
>> around 445 tps before applying NCQ support.
>> 680 tps after.
>>
>> It went over 840 tps after adding HOT v7 patch,
>> still with 25 clients. It topped at 1062 tps with 3-4 clients.
>> I used a single Seagate 320GB SATA2 drive
>> for the test, which only has less than 40GB free.
>> So it's already at the end of the disk giving smaller
>> transfer rates then at the beginning. Filesystem is ext3.
>> Dual core Athlon64 X2 4200 in 64-bit mode.
>> I have never seen such a performance before
>> on a desktop machine.
>>
>
> I'd be willing to bet money that the drive is lying about commits/fsync.
> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.
>
> BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
> data=writeback. Note that doing that probably has a negative impact on
> data recovery after a crash for non-database files.
>

I thought you were limited to 250 or so COMMITS to disk per second, and
since >1 client can be committed at once, you could do greater than 250
tps, as long as you had >1 client providing input.  Or was I wrong?

Re: Postgres Benchmark Results

From: Alvaro Herrera
Date:
Scott Marlowe wrote:
> Jim C. Nasby wrote:
> >On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
> >
> >>I also went into benchmarking mode last night for my own
> >>amusement when I read on the linux-kernel ML that
> >>NCQ support for nForce5 chips was released.
> >>I tried current PostgreSQL 8.3devel CVS.
> >>pgbench over local TCP connection with
> >>25 clients and 3000 transacts/client gave me
> >>around 445 tps before applying NCQ support.
> >>680 tps after.
> >>
> >>It went over 840 tps after adding HOT v7 patch,
> >>still with 25 clients. It topped at 1062 tps with 3-4 clients.
> >>I used a single Seagate 320GB SATA2 drive
> >>for the test, which only has less than 40GB free.
> >>So it's already at the end of the disk giving smaller
> >>transfer rates then at the beginning. Filesystem is ext3.
> >>Dual core Athlon64 X2 4200 in 64-bit mode.
> >>I have never seen such a performance before
> >>on a desktop machine.
> >>
> >
> >I'd be willing to bet money that the drive is lying about commits/fsync.
> >Each transaction committed essentially requires one revolution of the
> >drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.
> >
> >BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
> >data=writeback. Note that doing that probably has a negative impact on
> >data recovery after a crash for non-database files.
> >
>
> I thought you were limited to 250 or so COMMITS to disk per second, and
> since >1 client can be committed at once, you could do greater than 250
> tps, as long as you had >1 client providing input.  Or was I wrong?

My impression is that you are correct in theory -- this is the "commit
delay" feature.  But it seems that the feature does not work as well as
one would like; and furthermore, it is disabled by default.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: Postgres Benchmark Results

From: Greg Smith
Date:
On Mon, 21 May 2007, Guido Neitzer wrote:

> Yes, that right, but if a lot of the transactions are selects, there is no
> entry in the x_log for them and most of the stuff can come from the cache -
> read from memory which is blazing fast compared to any disk ... And this was
> a pg_bench test - I don't know what the benchmark really does but if I
> remember correctly it is mostly reading.

The standard pgbench transaction includes a select, an insert, and three
updates.  All five finished equals one transaction; the fact that the
SELECT statement in there could be executed much faster were it to happen
on its own doesn't matter.

Because it does the most work on the biggest table, the entire combination
is usually driven mostly by how long the UPDATE to the accounts table
takes.  The TPS numbers can certainly be no larger than the rate at which
you can execute that.

As has been pointed out, every time you commit a transaction the disk has
to actually write that out before it's considered complete.  Unless you
have a good caching disk controller (which your nForce5 is not) you're
limited to 120 TPS with a 7200RPM drive and 250 with a 15000 RPM one.
While it's possible to improve slightly on this using the commit_delay
feature, I haven't been able to replicate even a 100% improvement that way
when running pgbench, and to get even close to that level of improvement
would require a large number of clients.

Unless you went out of your way to turn it off, your drive is caching
writes; every Seagate SATA drive I've ever seen does by default.  "1062
tps with 3-4 clients" just isn't possible with your hardware otherwise.
If you turn that feature off with:

hdparm -W0 /dev/hda (might be /dev/sda with the current driver)

that will disable the disk caching and you'll be reporting accurate
numbers--which will be far lower than you're seeing now.

While your results are an interesting commentary on how fast the system
can run when it has a write cache available, and the increase with recent
code is interesting, your actual figures here are a fantasy.  The database
isn't working properly and a real system using this hardware would be
expected to become corrupted if run for long enough.  I have a paper at
http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm that
goes into more detail than you probably want to know on this subject, if
you'd like to read more about it--and you really, really should if you
intend to put important data into a PostgreSQL database.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Postgres Benchmark Results

From: Guido Neitzer
Date:
On 21.05.2007, at 23:51, Greg Smith wrote:

> The standard pgbench transaction includes a select, an insert, and
> three updates.

I see. Didn't know that, but it makes sense.

> Unless you went out of your way to turn it off, your drive is
> caching writes; every Seagate SATA drive I've ever seen does by
> default.  "1062 tps with 3-4 clients" just isn't possible with your
> hardware otherwise.

Btw: it wasn't my hardware in this test!

cug

Re: Postgres Benchmark Results

From: Zoltan Boszormenyi
Date:
Jim C. Nasby wrote:
> On Sun, May 20, 2007 at 08:00:25PM +0200, Zoltan Boszormenyi wrote:
>
>> I also went into benchmarking mode last night for my own
>> amusement when I read on the linux-kernel ML that
>> NCQ support for nForce5 chips was released.
>> I tried current PostgreSQL 8.3devel CVS.
>> pgbench over local TCP connection with
>> 25 clients and 3000 transacts/client gave me
>> around 445 tps before applying NCQ support.
>> 680 tps after.
>>
>> It went over 840 tps after adding HOT v7 patch,
>> still with 25 clients. It topped at 1062 tps with 3-4 clients.
>> I used a single Seagate 320GB SATA2 drive
>> for the test, which only has less than 40GB free.
>> So it's already at the end of the disk giving smaller
>> transfer rates then at the beginning. Filesystem is ext3.
>> Dual core Athlon64 X2 4200 in 64-bit mode.
>> I have never seen such a performance before
>> on a desktop machine.
>>
>
> I'd be willing to bet money that the drive is lying about commits/fsync.
>

It could well be the case.

> Each transaction committed essentially requires one revolution of the
> drive with pg_xlog on it, so a 15kRPM drive limits you to 250TPS.
>

By "revolution", you mean one 360-degree turn of the platter, yes?
On the other hand, if you have multiple clients, isn't the 250 COMMITs/sec
limit true only per client? That is, assuming the disk subsystem
has more TCQ/NCQ threads than the actual number of DB clients.

> BTW, PostgreSQL sees a big speed boost if you mount ext3 with the option
> data=writeback. Note that doing that probably has a negative impact on
> data recovery after a crash for non-database files.
>

I haven't touched the FS options.
I can even use ext2 if I want non-recoverability. :-)

--
----------------------------------
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/


Re: Postgres Benchmark Results

From: Zoltan Boszormenyi
Date:
Greg Smith wrote:
> On Mon, 21 May 2007, Guido Neitzer wrote:
>
>> Yes, that right, but if a lot of the transactions are selects, there
>> is no entry in the x_log for them and most of the stuff can come from
>> the cache - read from memory which is blazing fast compared to any
>> disk ... And this was a pg_bench test - I don't know what the
>> benchmark really does but if I remember correctly it is mostly reading.
>
> The standard pgbench transaction includes a select, an insert, and
> three updates.  All five finished equals one transaction; the fact
> that the SELECT statment in there could be executed much faster where
> it to happen on its own doesn't matter.
>
> Because it does the most work on the biggest table, the entire
> combination is usually driven mostly by how long the UPDATE to the
> accounts table takes.  The TPS numbers can certainly be no larger than
> the rate at which you can execute that.
>
> As has been pointed out, every time you commit a transacation the disk
> has to actually write that out before it's considered complete.
> Unless you have a good caching disk controller (which your nForce5 is
> not) you're limited to 120 TPS with a 7200RPM drive and 250 with a
> 15000 RPM one. While it's possible to improve slightly on this using
> the commit_delay feature, I haven't been able to replicate even a 100%
> improvement that way when running pgbench, and to get even close to
> that level of improvement would require a large number of clients.
>
> Unless you went out of your way to turn it off, your drive is caching
> writes; every Seagate SATA drive I've ever seen does by default.
> "1062 tps with 3-4 clients" just isn't possible with your hardware
> otherwise. If you turn that feature off with:
>
> hdparm -W0 /dev/hda (might be /dev/sda with the current driver)
>
> that will disable the disk caching and you'll be reporting accurate
> numbers--which will be far lower than you're seeing now.

And AFAIR, according to a comment on LKML some time ago,
it greatly decreases your disk's MTBF as well.
But thanks for the great insights, anyway.
I already knew that the nForce5 is not a caching controller. :-)
I meant it's a good desktop performer.
And having a good UPS and a somewhat oversized Enermax PSU
helps avoid crashes caused by the sometimes erratic power line.

> While your results are an interesting commentary on how fast the
> system can run when it has a write cache available, and the increase
> with recent code is interesting, your actual figures here are a
> fantasy.  The database isn't working properly and a real system using
> this hardware would be expected to become corrupted if ran for long
> enough.  I have a paper at
> http://www.westnet.com/~gsmith/content/postgresql/TuningPGWAL.htm you
> might want to read that goes into more detail than you probably want
> to know on this subject if you're like to read more about it--and you
> really, really should if you intend to put important data into a
> PostgreSQL database.

Thanks, I will read it.

> --
> * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>


--
----------------------------------
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/


Re: Postgres Benchmark Results

From: Peter Schuller
Date:
>     - Deferred Transactions, since adding a comment to a blog post
> doesn't need the same guarantees than submitting a paid order, it makes
> sense that the application could tell postgres which transactions we
> care about if power is lost. This will massively boost performance for
> websites I believe.

This would be massively useful. Very often all I care about is that the
transaction is semantically committed; that is, that other transactions
starting from that moment will see the modifications done. As opposed to
actually persisting data to disk.

In particular I have a situation where I attempt to utilize available
hardware by using concurrency. The problem is that I have to either
hugely complicate my client code or COMMIT more often than I would like
in order to satisfy dependencies between different transactions. If a
deferred/delayed commit were possible I could get all the performance
benefit without the code complexity, and with no penalty (because in
this case persistence is not important).
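To make the distinction concrete, here is a toy sketch (hypothetical Python,
nothing to do with PostgreSQL internals) of a log where commits become visible
immediately while durability lags behind a periodic batched flush; a crash
loses the tail of recent commits but never leaves the log inconsistent:

```python
class DeferredCommitLog:
    """Toy model of 'semantically committed' vs. 'durably committed'."""

    def __init__(self):
        self.visible = []   # committed: other transactions see these at once
        self.durable = 0    # count of entries already flushed to "disk"

    def commit(self, record):
        # Returns as soon as the record is visible; no wait for an fsync.
        self.visible.append(record)

    def flush(self):
        # Periodic batched sync: one flush makes every commit
        # accumulated since the last flush durable at once.
        self.durable = len(self.visible)

    def crash_recover(self):
        # After a power loss, only flushed commits survive.  The tail of
        # recent commits is lost, but the log is never inconsistent.
        self.visible = self.visible[:self.durable]


log = DeferredCommitLog()
log.commit("post #1")
log.flush()                   # "post #1" is now durable
log.commit("comment on #1")   # visible immediately, not yet durable
log.crash_recover()
print(log.visible)            # ['post #1']
```

(commit_delay is related but different: it trades latency for grouping,
whereas what I am describing trades durability of the most recent commits.)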

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@infidyne.com>'
Key retrieval: Send an E-Mail to getpgpkey@scode.org
E-Mail: peter.schuller@infidyne.com Web: http://www.scode.org



Attachments

Re: Postgres Benchmark Results

From
Gregory Stark
Date:
"Alvaro Herrera" <alvherre@commandprompt.com> writes:

> Scott Marlowe wrote:
>>
>> I thought you were limited to 250 or so COMMITS to disk per second, and
>> since >1 client can be committed at once, you could do greater than 250
>> tps, as long as you had >1 client providing input.  Or was I wrong?
>
> My impression is that you are correct in theory -- this is the "commit
> delay" feature.  But it seems that the feature does not work as well as
> one would like; and furthermore, it is disabled by default.

Even without commit delay a client will commit any pending WAL records when it
syncs the WAL. The clients waiting to commit their records will find it
already synced when they get woken up.

However, as mentioned a while back, in practice it doesn't work quite right
and you should expect about half the theoretical performance. So even with 10
clients you should expect to see 5*120 tps on a 7200 rpm drive and 5*250 tps
on a 15k rpm drive.
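Spelling out the arithmetic behind those figures (the one-commit-per-revolution
bound, plus the empirical halving above -- the 0.5 factor is an observation,
not a hard rule):

```python
def max_commit_rate(rpm, clients=1, efficiency=1.0):
    """Rough upper bound on commits/sec for fsync-bound WAL writes.

    Without a write cache, each commit waits for the WAL platter to come
    around, so a lone committer gets at most one commit per revolution,
    i.e. rpm/60 per second.  With N clients, group commit can in theory
    piggyback N commits on a single sync; in practice only about half of
    that shows up, hence efficiency=0.5 below.
    """
    revolutions_per_sec = rpm / 60.0
    return clients * efficiency * revolutions_per_sec

print(max_commit_rate(7200))               # single client, 7200 rpm: 120.0
print(max_commit_rate(7200, 10, 0.5))      # 10 clients, observed: 600.0
print(max_commit_rate(15000, 10, 0.5))     # 10 clients, 15k rpm: 1250.0
```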

Heikki posted a patch that experimented with fixing this. Hopefully it'll be
fixed for 8.4.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: Postgres Benchmark Results

From
Gregory Stark
Date:
What's interesting here is that on a couple of metrics the green curve is
actually *better* until it takes that nosedive at 500 MB. Obviously it's not
better on average hits/s, the most obvious metric. But on deviation and
worst-case hits/s it's actually doing better.

Note that while the average hits/s between 100 and 500 is over 600 tps for
Postgres, there is a consistent smattering of plot points spread all the way
down to 200 tps, well below the 400-500 tps that MySQL is getting.

Some of those are undoubtedly caused by things like checkpoints and vacuum
runs. Hopefully the improvements that are already in the pipeline will reduce
them.

I mention this only to try to move some of the focus from the average
performance to trying to remove the pitfalls that affect 1-10% of transactions
and screw the worst-case performance. In practical terms it's the worst case
that governs perceptions, not the average case.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: Postgres Benchmark Results

From
PFC
Date:
> Note that while the average hits/s between 100 and 500 is over 600 tps
> for
> Postgres there is a consistent smattering of plot points spread all the
> way
> down to 200 tps, well below the 400-500 tps that MySQL is getting.

    Yes, these are due to checkpointing, mostly.
    Also, note that a real forum would not insert 100 posts/s, so it would
not feel this effect. But in order to finish the benchmark in a reasonable
amount of time, we have to push on the inserts.

> Some of those are undoubtedly caused by things like checkpoints and
> vacuum
> runs. Hopefully the improvements that are already in the pipeline will
> reduce
> them.

    I am re-running it with other tuning, notably cost-based vacuum delay and
less frequent checkpoints, and it is a *lot* smoother.
    These runs take a full night, so I'll post more results when I have
useful stuff to show.
    This has proven to be a very interesting trip to benchmarkland...
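For reference, the knobs being turned here live in postgresql.conf and look
roughly like this (8.2-era settings; the values are illustrative, not
necessarily the exact ones used in these runs):

```ini
# Checkpoint less often
checkpoint_segments = 64      # more WAL between checkpoints (default 3)
checkpoint_timeout  = 30min   # force a checkpoint at most every 30 minutes
checkpoint_warning  = 60s     # log a warning if checkpoints bunch up

# Cost-based vacuum delay: throttle vacuum I/O so it doesn't starve
# foreground queries (sleep after each "cost limit" worth of work)
vacuum_cost_delay = 20ms      # default 0 = no throttling
vacuum_cost_limit = 200
```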

Re: Postgres Benchmark Results

From
Greg Smith
Date:
On Tue, 22 May 2007, Gregory Stark wrote:

> However, as mentioned a while back, in practice it doesn't work quite right
> and you should expect about half the theoretical performance. So even with 10
> clients you should expect to see 5*120 tps on a 7200 rpm drive and 5*250 tps
> on a 15k rpm drive.

I would agree that's the approximate size of the upper bound.  There are
so many factors that go into the effectiveness of commit_delay that I
wouldn't word it so strongly as to say you can "expect" that much benefit.
The exact delay amount (which can be hard to set if your client load
varies greatly), the size of the transactions, the balance of seek-bound reads
vs. memory-based ones in the transactions, serialization in the transaction
stream, and many other things can reduce the effective benefit.

Also, there are generally other performance issues in the types of systems
you would think would get the most benefit from this parameter that end up
slowing things down anyway.  I've been seeing a best case of closer to
2*single tps rather than 5* on my single-drive systems with no write
caching, but I'll admit I haven't done an exhaustive look at it yet (too
busy with the real systems that have good controllers).  One of these
days...

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

Re: Postgres Benchmark Results

From
Gregory Stark
Date:
"Greg Smith" <gsmith@gregsmith.com> writes:

> On Tue, 22 May 2007, Gregory Stark wrote:
>
>> However, as mentioned a while back, in practice it doesn't work quite right
>> and you should expect about half the theoretical performance. So even with 10
>> clients you should expect to see 5*120 tps on a 7200 rpm drive and 5*250 tps
>> on a 15k rpm drive.
>
> I would agree that's the approximate size of the upper bound.  There are so
> many factors that go into the effectiveness of commit_delay that I wouldn't
> word it so strongly as to say you can "expect" that much benefit. The exact
> delay amount (which can be hard to set if your client load varies greatly),
> the size of the transactions, the balance of seek-bound reads vs. memory-based
> ones in the transactions, serialization in the transaction stream, and many
> other things can reduce the effective benefit.

This is without commit_delay set at all. Just the regular WAL sync behaviour.

> Also, there are generally other performance issues in the types of systems you
> would think would get the most benefit from this parameter that end up slowing
> things down anyway.  I've been seeing a best case of closer to 2*single tps
> rather than 5* on my single-drive systems with no write caching, but I'll admit
> I haven't done an exhaustive look at it yet (too busy with the real systems
> that have good controllers).  One of these days...

Certainly there can be other bottlenecks you reach before WAL fsyncs become
your limiting factor. If your transactions read significant amounts of
data you'll be limited by I/O from your data drives. And if your data is on
the same drive as your WAL, your seek times will be higher than the
rotational latency too.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


Re: Postgres Benchmark Results

From
Chris
Date:
>     I am re-running it with other tuning, notably cost-based vacuum
> delay and less frequent checkpoints, and it is a *lot* smoother.
>     These runs take a full night, so I'll post more results when I
> have useful stuff to show.
>     This has proven to be a very interesting trip to benchmarkland...

[ rather late in my reply but I had to ]

Are you tuning mysql in a similar fashion ?

--
Postgresql & php tutorials
http://www.designmagick.com/

Re: Postgres Benchmark Results

From
PFC
Date:
On Mon, 28 May 2007 05:53:16 +0200, Chris <dmagick@gmail.com> wrote:

>
>>     I am re-running it with other tuning, notably cost-based vacuum
>> delay and less frequent checkpoints, and it is a *lot* smoother.
>>     These runs take a full night, so I'll post more results when I
>> have useful stuff to show.
>>     This has proven to be a very interesting trip to benchmarkland...
>
> [ rather late in my reply but I had to ]
>
> Are you tuning mysql in a similar fashion ?

    Well, the tuning knobs are different; there are no checkpoints or
vacuum... but yes, I tried to tune MySQL too. The hardest part was
simply making it work without deadlocking continuously.




Re: Postgres Benchmark Results

From
Josh Berkus
Date:
PFC,

Thanks for doing those graphs.  They've been used by Simon & Heikki, and
now me, to show our main issue with PostgreSQL performance: consistency.
That is, our median response time beats MySQL and even Oracle, but our
bottom 10% does not, and is in fact intolerably bad.

If you want us to credit you by your real name, let us know what it is.

Thanks!

--Josh Berkus