Discussion: Changing pg_type records
CREATE OR REPLACE FUNCTION c_mode()
  RETURNS text AS
$BODY$ UPDATE pg_type SET typoutput='c_textout' WHERE typname='SET'; $BODY$
  LANGUAGE 'sql' VOLATILE;
ALTER FUNCTION c_mode() OWNER TO postgres;

CREATE OR REPLACE FUNCTION pg_mode()
  RETURNS text AS
$BODY$ UPDATE pg_type SET typoutput='textout' WHERE typname='SET'; $BODY$
  LANGUAGE 'sql' VOLATILE;
ALTER FUNCTION pg_mode() OWNER TO postgres;
"Rob Richardson" <Rob.Richardson@rad-con.com> writes:
> I just discovered the existence of a couple of functions I don't
> understand in one customer's PostgreSQL database:

What do the substitute output functions do differently?

> UPDATE pg_type SET typoutput='c_textout' WHERE typname='SET';

This one has been a no-op since 7.4, because there is no longer any such row in pg_type; and I'm pretty sure the row was vestigial long before that. If the underlying code is old enough that it was designed when changing that row did something interesting, it certainly all needs a fresh look anyway...

> Are these functions really as innocuous as he claims? It seems to me
> that between the time c_mode() is called and the time that pg_mode() is
> called, any other database access is likely to be confused.

Yes, you definitely are taking big risks there. Aside from the question of how data appears to the client, this could bollix the behavior of most PL languages too.

regards, tom lane
Hi. I'm writing up a database comparison paper in my department at work, with Postgres being a major candidate. I have been attempting to research various issues and provide a meaningful comparison.

One issue I would like to give some kind of information on is comparative performance. When I look for such things in Google or Wikipedia or the pgsql email archives, it's hard to find anything reasonably definitive. I've found isolated claims and anecdotes here and there, and a fellow on the list here who attempted to do a comparison between Postgres, MySQL, and Oracle but gave it up for now.

Some of the claims I've seen said that in some cases MySQL with MyISAM ran 2x faster than Postgres, but that may have been for a special case with only read access to the database; whereas another one claimed that MySQL with InnoDB was slower than Postgres. Other people commented that it depends on how you tune the databases.

Maybe there's nothing definitive out there. However I'd like to get a ballpark idea of how some databases compare, using some kind of average case schema and application, in terms of transactions per second, on a common hardware platform. I would like to be able to point to a reasonable reference, rather than engaging in handwaving myself.

Does anyone know where I could look?

-Will
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Will Rutherdale (rutherw)
> Sent: Thursday, March 19, 2009 1:32 PM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Is there a meaningful benchmark?
>
> Hi. I'm writing up a database comparison paper in my department at
> work, with Postgres being a major candidate. I have been attempting to
> research various issues and provide a meaningful comparison.
>
> One issue I would like to give some kind of information on is
> comparative performance. When I look for such things in Google or
> Wikipedia or the pgsql email archives, it's hard to find anything
> reasonably definitive. I've found isolated claims and anecdotes here
> and there, and a fellow on the list here who attempted to do a
> comparison between Postgres, MySQL, and Oracle but gave it up for now.
>
> Some of the claims I've seen said that in some cases MySQL with MyISAM
> ran 2x faster than Postgres, but that may have been for a special case
> with only read access to the database; whereas another one claimed that
> MySQL with InnoDB was slower than Postgres. Other people commented that
> it depends on how you tune the databases.
>
> Maybe there's nothing definitive out there. However I'd like to get a
> ballpark idea of how some databases compare, using some kind of average
> case schema and application, in terms of transactions per second, on a
> common hardware platform. I would like to be able to point to a
> reasonable reference, rather than engaging in handwaving myself.
>
> Does anyone know where I could look?

The only way to get an answer to a question like this is to actually benchmark the application you have in mind. And the answer won't be very good unless you have an expert on each given system install and tune the application.

There is a regular benchmark that is run against the PostgreSQL database. I don't remember where to find the graphs.
Probably, someone on the list can tell us the location.

Here are some benchmark figures:
http://tweakers.net/reviews/657/6
http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-20070606-00065.html
http://www.informationweek.com/news/software/linux/showArticle.jhtml?articleID=201001901
http://www.kaltenbrunner.cc/blog/index.php?/archives/21-guid.html
http://benchw.sourceforge.net/benchw_results_open3.html

P.S.
PostgreSQL seems to scale pretty well:
http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyId=18&articleId=9087918&intsrc=hm_topic

My opinion:
Most benchmarks are run by someone with an axe to grind. I never believe them. The TPC benchmarks are probably the most trustworthy, because they have to be certified. But a fast TPC/whatever benchmark is no guarantee that *your* application will run fast. So if you want to evaluate several different technologies, do your own benchmark. Do your own calculations to find out the total cost of ownership over the lifetime of the project. Examine all the features that are available, and what kind of technical support is possible. Consider the impact of product licensing. What happens if you need to scale up to titanic volume? After you have thought all factors over very carefully, make your choice.

If you rely on someone else to do the work for you, it's really begging for trouble. MySQL guys will show you why MySQL is faster. PostgreSQL guys will show you why PostgreSQL is faster. Oracle guys will show you why Oracle is faster. SQL*Server guys will show you why SQL*Server is faster. DB/2 guys will show you why DB/2 is faster. Now, none of them are lying (at least hopefully), but they are experts in their own domain and not in the domain of the other product, and they are also going to choose those tight little corners where their product has the biggest advantage.

IMO-YMMV.

P.S.
I'm a PostgreSQL fan and so I am likely to (perhaps unconsciously) favor PostgreSQL in my remarks.
So if it makes your research easier, you can rule MySQL w/MyISAM out as an option.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
- John L Cheng
On Thu, Mar 19, 2009 at 3:16 PM, Dann Corbit <DCorbit@connx.com> wrote:
> Here are some benchmark figures:
> http://tweakers.net/reviews/657/6

SNIP

> My opinion:
> Most benchmarks are run by someone with an axe to grind. I never
> believe them.

Generally I agree with that sentiment. However, it's interesting that the guys at tweakers.net run (or ran?) their chat board on mysql, and expected it to win. They certainly didn't expect the kind of stomping that postgresql gave to mysql, so I look at that benchmark as a "statement against interest" kind of thing. They were genuinely surprised that postgresql didn't just win, but behaved much much better than mysql under heavier loads.
It isn't actually possible at this stage for me to benchmark "the application" because it doesn't yet exist. There are a number of potential projects floating around, with as yet unwritten specifications, to run on different platforms ranging from embedded to larger servers. People just want to hear what I think is a good RDBMS to use. My opinion won't necessarily be followed.

Nobody at this point is expecting the RDBMS to become a bottleneck, if they are planning to actually use one at all. However someone is sure to ask the question, for an average application with an average database, how is performance?

Even if such a question is answered, it isn't going to be the only factor. For example I have collected reasonable numbers already on footprints of different RDBMSs, because embedded guys might find that important if they're restricted to 64MB of flash. On the other hand if they went with some of the newer solid state drives with gigs of space, then a few packages using 10s of MB wouldn't be such a problem any more.

In short, all bets are off and I'm just looking for baseline information. This is just a general feasibility and technology exploration phase.

I'm aware of the limitations of hard numbers, but the more simple information I have in different dimensions, the easier it is to convince people not to lock in too early.

Thanks for the info, I'll check some of those references.

-Will

-----Original Message-----
From: Dann Corbit [mailto:DCorbit@connx.com]
Sent: 19 March 2009 17:16
To: Will Rutherdale (rutherw); pgsql-general@postgresql.org
Subject: RE: [GENERAL] Is there a meaningful benchmark?

The only way to get an answer to a question like this is to actually benchmark the application you have in mind. And the answer won't be very good unless you have an expert on each given system install and tune the application.

There is a regular benchmark that is run against the PostgreSQL database. I don't remember where to find the graphs.
On Thu, Mar 19, 2009 at 3:50 PM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> Even if such a question is answered, it isn't going to be the only
> factor. For example I have collected reasonable numbers already on
> footprints of different RDBMSs, because embedded guys might find that
> important if they're restricted to 64MB of flash. On the other hand if
> they went with some of the newer solid state drives with gigs of space,
> then a few packages using 10s of MB wouldn't be such a problem any more.

If you're looking at embedded usage, and footprint is an issue (it usually is even if you think it won't be), look at SQLite. Pretty good embedded db and lightweight. Pgsql is not intended to compete in the embedded space.
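(Scott's suggestion is easy to try first-hand: Python's standard library bundles the SQLite engine, so an "embedded database" is just one ordinary file on disk. A minimal sketch; the file name and table are invented for illustration:)

```python
import os
import sqlite3
import tempfile

# The whole "server" is a single file, which is why SQLite suits devices
# with tight flash budgets. Path and schema are made up for this example.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES ('greeting', 'hello')")
conn.commit()
conn.close()

print("database file size:", os.path.getsize(path), "bytes")  # a few KB, not MB
```

The footprint point follows directly: the library plus one small data file is the entire installation.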
I am *not* primarily interested in embedded, but I know people who are, and I have already compared with SQLite.

My main point of concern right now is for more middle sized platforms (such as an average workstation), to be able to answer the question of how Postgres shows in transactions per second against another RDBMS or two.

-Will

-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: 19 March 2009 17:57
To: Will Rutherdale (rutherw)
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Is there a meaningful benchmark?

On Thu, Mar 19, 2009 at 3:50 PM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> Even if such a question is answered, it isn't going to be the only
> factor. For example I have collected reasonable numbers already on
> footprints of different RDBMSs, because embedded guys might find that
> important if they're restricted to 64MB of flash. On the other hand if
> they went with some of the newer solid state drives with gigs of space,
> then a few packages using 10s of MB wouldn't be such a problem any more.

If you're looking at embedded usage, and footprint is an issue (it usually is even if you think it won't be), look at SQLite. Pretty good embedded db and lightweight. Pgsql is not intended to compete in the embedded space.
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Scott Marlowe
> Sent: Thursday, March 19, 2009 2:57 PM
> To: Will Rutherdale (rutherw)
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Is there a meaningful benchmark?
>
> On Thu, Mar 19, 2009 at 3:50 PM, Will Rutherdale (rutherw)
> <rutherw@cisco.com> wrote:
> > Even if such a question is answered, it isn't going to be the only
> > factor. For example I have collected reasonable numbers already on
> > footprints of different RDBMSs, because embedded guys might find that
> > important if they're restricted to 64MB of flash. On the other hand if
> > they went with some of the newer solid state drives with gigs of space,
> > then a few packages using 10s of MB wouldn't be such a problem any more.
>
> If you're looking at embedded usage, and footprint is an issue (it
> usually is even if you think it won't be), look at SQLite. Pretty
> good embedded db and lightweight. Pgsql is not intended to compete in
> the embedded space.

FastDB is another good option there (it's a portable, embedded memory mapped database):
http://www.garret.ru/fastdb.html

An advantage for the SQLite option is that the grammar is a subset of PostgreSQL grammar, so if you need to scale up, you have a ready path.
I'm having trouble with the tweakers reference below.

I was hoping to see something where hardware platform is held constant while RDBMS is varied, but it seems to be just the opposite. Or maybe I didn't read the article the right way.

-Will

-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: 19 March 2009 17:36
To: Dann Corbit
Cc: Will Rutherdale (rutherw); pgsql-general@postgresql.org
Subject: Re: [GENERAL] Is there a meaningful benchmark?

On Thu, Mar 19, 2009 at 3:16 PM, Dann Corbit <DCorbit@connx.com> wrote:
> Here are some benchmark figures:
> http://tweakers.net/reviews/657/6

SNIP

> My opinion:
> Most benchmarks are run by someone with an axe to grind. I never
> believe them.

Generally I agree with that sentiment. However, it's interesting that the guys at tweakers.net run (or ran?) their chat board on mysql, and expected it to win. They certainly didn't expect the kind of stomping that postgresql gave to mysql, so I look at that benchmark as a "statement against interest" kind of thing. They were genuinely surprised that postgresql didn't just win, but behaved much much better than mysql under heavier loads.
On Thu, Mar 19, 2009 at 4:02 PM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> I am *not* primarily interested in embedded, but I know people who are, and I have already compared with SQLite.
>
> My main point of concern right now is for more middle sized platforms (such as an average workstation), to be able to answer the question of how Postgres shows in transactions per second against another RDBMS or two.

Way back right after the earth's crust had cooled I deployed a corporate intranet on pgsql 6.5.2. I believe we had Wooly Mammoth for dinner that night. It took a fair bit of work to keep that machine happy, and every new version was an eye opener in terms of performance, reliability, and capability improvements. By the time 8.0 came out I was more than prepared to use it for some pretty hefty work. Now that 8.3 is out and 8.4 is on the way, for any kind of intermediate size application (thousands of users daily, hundreds of gigs of data) I'll put PostgreSQL against any other DBMS and expect it to do well.

Where I work now, we handle 1.5 million or so users, a large chunk of which log in each day, several times a day. We had to beef up our servers to 8 core / 16 drive machines (two of them) to handle the load reliably. We have enough spare capacity to handle about 3 to 4 times the number of users we now have.

You are far more likely to be bitten by lack of familiarity with ANY db you choose, and the real issue will be support and training. Oracle, Pgsql, DB2, Interbase / Firebird, MySQL w/ innodb can all handle the kind of load you've kind of hand waved about here. How much money and time you'll spend setting them up, learning them, supporting them, and using them will be far more important than anything a benchmark is likely to tell you right now.
I am already aware of this issue, and am preparing to explain it to people.
Having said that, if it were possible to set up a reasonably average database, with a test application that hits it with a reasonable mix of select, insert, and update operations, and run it one at a time against different RDBMSs on the same machine, then it might yield some simple numbers that could be quoted to people in case they asked.
The goal is not to absolutely determine which is fastest in the made-up scenario; I don't think anyone cares. However it would be interesting to see if the different RDBMSs came in within a reasonable percentage of each other.
An analogy would be BogoMIPS. Nobody takes it that seriously because they know there are numerous factors that affect how a machine runs under different applications. But as a quick sanity check BogoMIPS can be useful at times.
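The quick sanity check described above can be sketched in a few lines. The harness below is only an illustration of the idea, not a real benchmark: it drives a made-up 50/30/20 select/update/insert mix against SQLite as a stand-in engine; a real comparison would point the identical workload at each server under test through its own driver. All names, ratios, and sizes here are invented.

```python
import random
import sqlite3
import time

def tiny_benchmark(n_txns=200, n_accounts=100):
    """Time a toy select/update/insert mix, one commit per operation."""
    conn = sqlite3.connect(":memory:")  # stand-in engine for illustration only
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)",
                     [(i, 1000) for i in range(n_accounts)])
    conn.commit()

    rng = random.Random(42)  # fixed seed so runs are repeatable
    start = time.perf_counter()
    for _ in range(n_txns):
        acct = rng.randrange(n_accounts)
        op = rng.random()
        if op < 0.5:    # 50% reads
            conn.execute("SELECT balance FROM account WHERE id = ?", (acct,)).fetchone()
        elif op < 0.8:  # 30% updates
            conn.execute("UPDATE account SET balance = balance - 1 WHERE id = ?", (acct,))
        else:           # 20% inserts
            conn.execute("INSERT INTO account (balance) VALUES (0)")
        conn.commit()   # each operation is its own transaction
    elapsed = time.perf_counter() - start
    conn.close()
    return n_txns / elapsed

print(f"toy transactions per second: {tiny_benchmark():.0f}")
```

The number it prints is exactly as bogus as BogoMIPS, and for the same reason: it measures one contrived workload on one machine. Its only use is as a relative figure when the identical loop is pointed at two engines on the same hardware.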
-Will
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of John Cheng
Sent: 19 March 2009 17:27
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Is there a meaningful benchmark?
Comparison between MySQL using the MyISAM engine and PostgreSQL is really not sensible. For one, the MyISAM engine does not have transaction and foreign key support, while PostgreSQL supports both. Would anyone really give up transactions and integrity for slightly more performance?
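The difference being pointed at here is easy to demonstrate concretely. The sketch below uses SQLite, purely as a stand-in for any engine that enforces transactions and foreign keys once enabled; the two-table schema is made up for illustration.

```python
import sqlite3

# Hypothetical schema, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # FK enforcement is opt-in in SQLite
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY)")
conn.execute("""CREATE TABLE orders (
                    id INTEGER PRIMARY KEY,
                    customer_id INTEGER NOT NULL REFERENCES customer(id))""")
conn.execute("INSERT INTO customer VALUES (1)")
conn.commit()

# 1. Referential integrity: an orphan row is rejected outright.
#    A MyISAM table would accept it silently.
try:
    conn.execute("INSERT INTO orders VALUES (1, 999)")  # no such customer
except sqlite3.IntegrityError as exc:
    print("orphan rejected:", exc)

# 2. Atomicity: if any statement in a transaction fails, the whole
#    transaction rolls back, leaving no half-written state.
try:
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO orders VALUES (1, 1)")
        conn.execute("INSERT INTO orders VALUES (1, 1)")  # duplicate PK fails
except sqlite3.IntegrityError:
    pass

print("orders rows after rollback:",
      conn.execute("SELECT count(*) FROM orders").fetchone()[0])
```

With MyISAM neither guard exists: the orphan insert succeeds and the first of the two conflicting inserts stays behind, which is the integrity cost hidden inside its speed numbers.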
On Thu, Mar 19, 2009 at 3:56 PM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> I'm having trouble with the tweakers reference below.
>
> I was hoping to see something where hardware platform is held constant while RDBMS is varied, but it seems to be just the opposite. Or maybe I didn't read the article the right way.

The tweakers test is actually VERY telling. With a constant load, how well do the databases scale as you improve the hardware you've given them to run on? Hardware is cheap, downtime is not. If you can toss a 16 core server at a performance problem for $20,000 or so, that's probably way cheaper than watching your main db chug under load and go down twice a day. Conversely, telling the bossman you need that 16 core server to improve performance and seeing the new server collapse under load faster than the old one due to poor concurrency is not gonna win you a lot of brownie points.
On Thu, Mar 19, 2009 at 4:11 PM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> I am already aware of this issue, and am preparing to explain it to people.

Well, keep in mind that MOST people are gonna wave you off, and figure it doesn't matter that much. Lots of developers are pretty cavalier with their users' data. Now, if the data is easily replaceable, then fine. Put a price on the data. Pay attention to that price, and don't underestimate it. It's usually the most expensive thing on a db server.

> Having said that, if it were possible to set up a reasonably average
> database, with a test application that hits it with a reasonable mix of
> select, insert, and update operations, and run it one at a time against
> different RDBMSs on the same machine, then it might yield some simple
> numbers that could be quoted to people in case they asked.

But will you be comparing apples to apples? MySQL plays fast and loose with data integrity (with myisam tables). So at least make sure you're comparing what you need to compare. If data integrity is important, you need to use innodb. If it isn't, then just use flat files.

> The goal is not to absolutely determine which is fastest in the made-up
> scenario, I don't think anyone cares. However it would be interesting to
> see if the different RDBMSs came in within a reasonable percentage of each
> other.

But again, if you're comparing a Go kart to a pickup truck, you need to know that's what you're doing. The abstract numbers mean little outside of that fact.

> An analogy would be BogoMIPS. Nobody takes it that seriously because they
> know there are numerous factors that affect how a machine runs under
> different applications. But as a quick sanity check BogoMIPS can be useful
> at times.

Sorry, but I respectfully disagree. BogoMIPS has the word bogo in it on purpose. It means, literally, almost nothing useful to the user. Any modern db is fast enough for the kind of low level stuff you've mentioned so far.
Until you have a better idea what your application(s) might look like, it's hard to offer any real advice besides "avoid myisam".
On Thu, Mar 19, 2009 at 3:26 PM, John Cheng <chonger.cheng@gmail.com> wrote:
> Comparison between MySQL using the MyISAM engine with PostgreSQL is really
> not sensible. For one, the MyISAM engine does not have transaction and
> foreign key support, while PostgreSQL supports transaction and foreign key.
> Would anyone really give up transaction and integrity for slightly more
> performance?

Sure, if the application fit. If I have to load 100Meg files into a db, run some simple extraction on them, and output it back out, and can recreate all my data at the drop of a hat, then mysql / myisam might be a good match.

I am not a fan of MySQL, more because the company behind it seems to be lost and drifting than the db itself. Bugs that are 5 years old finally getting fixed after Monty chided them in December? Come on, PostgreSQL comes out with a near bug free new version every 1 to 2 years. PostgreSQL stomps bugs in hours or days that MySQL AB takes YEARS to fix, which then get reintroduced (see the order by on innodb indexed field debacle for that story), and I just don't trust the company or the db for anything complex. But as a tool it has some uses that it's good enough at that I could use it if I had to.
Okay, you've given me some useful information.

As the original subject line indicates, I'm open to the idea that no such benchmark exists.

If anyone asks about this stuff, I can just say that performance varies widely by database and application, that Postgres performs well enough against other RDBMSs, that Postgres is known to scale up well and make good use of concurrency, and that I couldn't find any clear benchmark results to back it up.

Of course, if I *did* find any benchmark values then I could have used that to dispel false rumours from the MySQL guys. However it looks like simple measured indicators aren't easy to come by.

-Will

-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: 19 March 2009 18:14
To: Will Rutherdale (rutherw)
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Is there a meaningful benchmark?

On Thu, Mar 19, 2009 at 3:56 PM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> I'm having trouble with the tweakers reference below.
>
> I was hoping to see something where hardware platform is held constant while RDBMS is varied, but it seems to be just the opposite. Or maybe I didn't read the article the right way.

The tweakers test is actually VERY telling. With a constant load, how well do the databases scale as you improve the hardware you've given them to run on? Hardware is cheap, downtime is not. If you can toss a 16 core server at a performance problem for $20,000 or so, that's probably way cheaper than watching your main db chug under load and go down twice a day. Conversely, telling the bossman you need that 16 core server to improve performance and seeing the new server collapse under load faster than the old one due to poor concurrency is not gonna win you a lot of brownie points.
Here is another interesting benchmark with a particular user's application:
http://blog.page2rss.com/2007/01/postgresql-vs-mysql-performance.html

P.S.
Oracle won't let you publish any benchmark numbers. So if you find an Oracle comparison, it's "unauthorized".
On Thu, Mar 19, 2009 at 4:24 PM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> Okay, you've given me some useful information.
>
> As the original subject line indicates, I'm open to the idea that no such benchmark exists.
>
> If anyone asks about this stuff, I can just say that performance varies widely by database and application, that Postgres performs well enough against other RDBMSs, that Postgres is known to scale up well and make good use of concurrency, and that I couldn't find any clear benchmark results to back it up.
>
> Of course, if I *did* find any benchmark values then I could have used that to dispel false rumours from the MySQL guys. However it looks like simple measured indicators aren't easy to come by.

Well, the tweakers benchmarks are a pretty good mysql pgsql comparison, although getting older now. Apparently mysql has improved a lot of their concurrency issues that were uncovered in that one. Pgsql has just continued to get faster.

Google MySQL gotchas for a list of reasons to avoid it. There's a pgsql gotcha list, it's shorter, and the gotchas are a lot less likely to induce a case of the vapors.
This looks similar to things I've seen before. MyISAM can be made to look twice as fast as Postgres if the application is cooked to throw away transaction processing, updates, and referential integrity, none of which MyISAM seems to support well.

I plan to make a point of this to people, as I personally have experience working with RDBMSs in the past and understand the importance of these capabilities. However not everyone I talk to will have any experience with databases and understand the issues. That's why I was looking for a more balanced benchmark that exercises said capabilities.

-Will

-----Original Message-----
From: Dann Corbit [mailto:DCorbit@connx.com]
Sent: 19 March 2009 18:26
To: Scott Marlowe; Will Rutherdale (rutherw)
Cc: pgsql-general@postgresql.org
Subject: RE: [GENERAL] Is there a meaningful benchmark?

Here is another interesting benchmark with a particular user's application:
http://blog.page2rss.com/2007/01/postgresql-vs-mysql-performance.html
On Mar 19, 2009, at 4:50 PM, Will Rutherdale (rutherw) wrote:
> It isn't actually possible at this stage for me to benchmark "the
> application" because it doesn't yet exist. There are a number of
> potential projects floating around, with as yet unwritten
> specifications, to run on different platforms ranging from embedded to
> larger servers. People just want to hear what I think is a good RDBMS
> to use. My opinion won't necessarily be followed.

If you name embedded... and you are even thinking of using MySQL, don't forget you need to sell a MySQL license for each product you sell to a customer.

First know what your requirements are, then strip away the systems that don't comply with your requirements. Then think of what sort of transactions you require, then check what RDBM might be good for you. For the couple that are left over, do the tests...

Ries

> Nobody at this point is expecting the RDBMS to become a bottleneck, if
> they are planning to actually use one at all. However someone is sure
> to ask the question, for an average application with an average
> database, how is performance?
>
> Even if such a question is answered, it isn't going to be the only
> factor. For example I have collected reasonable numbers already on
> footprints of different RDBMSs, because embedded guys might find that
> important if they're restricted to 64MB of flash. On the other hand if
> they went with some of the newer solid state drives with gigs of space,
> then a few packages using 10s of MB wouldn't be such a problem any more.
>
> In short, all bets are off and I'm just looking for baseline
> information. This is just a general feasibility and technology
> exploration phase.
>
> I'm aware of the limitations of hard numbers, but the more simple
> information I have in different dimensions, the easier it is to
> convince people not to lock in too early.
>
> Thanks for the info, I'll check some of those references.
> > -Will > > > -----Original Message----- > From: Dann Corbit [mailto:DCorbit@connx.com] > Sent: 19 March 2009 17:16 > To: Will Rutherdale (rutherw); pgsql-general@postgresql.org > Subject: RE: [GENERAL] Is there a meaningful benchmark? > > The only way to get an answer to a question like this is to actually > benchmark the application you have in mind. > And the answer won't be very good unless you have an expert on each > given system install and tune the application. > > There is a regular benchmark that is run against the PostgreSQL > database. I don't remember where to find the graphs. Probably, > someone > on the list can tell us the location. > > > Here are some benchmark figures: > http://tweakers.net/reviews/657/6 > http://www.spec.org/jAppServer2004/results/res2007q3/jAppServer2004-2007 > 0606-00065.html > http://www.informationweek.com/news/software/linux/showArticle.jhtml?art > icleID=201001901 > http://www.kaltenbrunner.cc/blog/index.php?/archives/21-guid.html > http://benchw.sourceforge.net/benchw_results_open3.html > > > P.S. > PostgreSQL seems to scale pretty well: > http://www.computerworld.com/action/article.do?command=viewArticleBasic& > taxonomyId=18&articleId=9087918&intsrc=hm_topic > > > My opinion: > Most benchmarks are run by someone with an axe to grind. I never > believe them. The TPC benchmarks are probably the most trustworthy, > because they have to be certified. But a fast TPC/whatever > benchmark is > no guarantee that *your* application will run fast. So if you want to > evaluation several different technologies do your own benchmark. Do > your own calculations to find out the total cost of ownership over the > lifetime of the project. Examine all the features that are available, > and what kind of technical support is possible. Consider the impact > of > product licensing. What happens if you need to scale up to titanic > volume? After you have thought all factors over very carefully, make > your choice. 
>
> If you rely on someone else to do the work for you, it's really
> begging for trouble. MySQL guys will show you why MySQL is faster.
> PostgreSQL guys will show you why PostgreSQL is faster. Oracle guys
> will show you why Oracle is faster. SQL*Server guys will show you why
> SQL*Server is faster. DB/2 guys will show you why DB/2 is faster.
> Now, none of them are lying (at least hopefully), but they are experts
> in their own domain and not in the domain of the other product, and
> they are also going to choose those tight little corners where their
> product has the biggest advantage.
>
> IMO-YMMV.
>
> P.S.
> I'm a PostgreSQL fan and so I am likely to (perhaps unconsciously)
> favor PostgreSQL in my remarks.
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

regards, Ries van Twisk

-------------------------------------------------------------------------------------------------
Ries van Twisk
tags: Freelance TYPO3 Glassfish JasperReports JasperETL Flex Blaze-DS WebORB PostgreSQL DB-Architect
email: ries@vantwisk.nl
web: http://www.rvantwisk.nl/
skype: callto://r.vantwisk
Phone: +1-810-476-4196
SIP: +1-747-690-5133
On Thu, Mar 19, 2009 at 3:26 PM, John Cheng <chonger.cheng@gmail.com> wrote:
> Comparison between MySQL using the MyISAM engine with PostgreSQL is really
> not sensible. For one, the MyISAM engine does not have transaction and
> foreign key support, while PostgreSQL supports transactions and foreign
> keys. Would anyone really give up transactions and integrity for slightly
> more performance?

Sure, if the application fits. If I have to load 100Meg files into a db, run some simple extraction on them, and output it back out, and can recreate all my data at the drop of a hat, then MySQL / MyISAM might be a good match.

I am not a fan of MySQL, more because the company behind it seems to be lost and drifting than because of the db itself. Bugs that are 5 years old finally getting fixed after Monty chided them in December? Come on. PostgreSQL comes out with a near bug-free new version every 1 to 2 years. PostgreSQL stomps bugs in hours or days that MySQL AB takes YEARS to fix, and then they get reintroduced (see the ORDER BY on InnoDB indexed field debacle for that story), and I just don't trust the company or the db for anything complex. But as a tool it has some uses it's good enough at that I could use it if I had to.

Right. There will be situations where MySQL with MyISAM will be a good fit. My point was more that it doesn't make sense to simply compare "speed", because other things need to be taken into account.

--
- John L Cheng
Dann Corbit wrote:
> Here is another interesting benchmark with a particular user's
> application:
> http://blog.page2rss.com/2007/01/postgresql-vs-mysql-performance.html
>
> P.S.
> Oracle won't let you publish any benchmark numbers.
> So if you find an Oracle comparison, it's "unauthorized"

True enough. That said, I feel comfortable in revealing that a former employer of mine ran some serious tests of PostgreSQL vs Oracle (with an emphasis on PostGIS and Oracle's equivalent) about a year and a half ago. Oracle was consistently 5-15% faster depending on the precise benchmark. This was judged to be not worth the extra money for more Oracle licenses. In some other environments that edge might be worth the money.

Oracle does spend resources on its products, and so I don't find the speed difference surprising. But when you consider the speed with which Oracle produces patches vs. the Postgres folks, the winner is clearly the latter.

Greg Williamson
> That's why I was looking for a more balanced benchmark that exercises
> said capabilities.

OK, here's the thing. I will give you *one* sample issue to think about, as an illustration of the kinds of differences there are.

- PostgreSQL uses MVCC instead of row or page locking, which means only writer-vs-writer locks, which means in many situations less contention and better throughput as your concurrent load goes up.

- But it also means slower count(*) and no covering indexes. (Keeping multiple versions of index pages would be too slow for many other operations, so PG has to go to the table pages to check which rows are actually visible in the current transaction.)

See?

Overall, PG performs very well and is certainly comparable to the "big boys" on the same hardware. But the strengths & weaknesses can be *very* specific to particular queries.

--
Scott Ribe
scott_ribe@killerbytes.com
http://www.killerbytes.com/
(303) 722-0567 voice
You have a point, as do a lot of the other folks.

However, keeping the KISS principle in mind, you can create a benchmark that simply sets up a sample database and forks off a bunch of processes to do random updates for an hour, say. Dead simple.

In fact, it's so simple that I've already written the code and have it running against Postgres now. A Perl DBI script runs in a loop updating, and later prints out the number of transactions it completed in the given time frame. At the end I just tally up the numbers and I have the Will Rutherdale benchmark number for Postgres. It will give me a simple number in units of transactions per second. When I get time I'll set up an equivalent MySQL database on the same machine and run it against that.

I have 'top' running in one screen and can see all the copies of my script and all the copies of postmaster and the CPU they're using.

The degree to which I got excoriated just for asking the question convinced me that I was onto something and had to do it myself.

-Will
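The home-grown benchmark described above (a script looping on random single-row updates and counting how many it completes in a time window) can be sketched in a few lines. The thread's original was a Perl DBI script against Postgres; the version below is a hypothetical Python equivalent that uses the stdlib sqlite3 module purely so it stays self-contained. The structure, not the engine, is the point:

```python
import random
import sqlite3
import time

def update_benchmark(db_path, seconds=1.0, rows=100):
    """Run random single-row updates, committing each one, until time runs out.

    Returns the "benchmark number": transactions per second.
    (db_path, table and column names are made up for this sketch.)
    """
    con = sqlite3.connect(db_path)
    con.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
    con.executemany("INSERT INTO accounts VALUES (?, 0)",
                    [(i,) for i in range(rows)])
    con.commit()

    done = 0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        rid = random.randrange(rows)
        con.execute("UPDATE accounts SET balance = balance + 1 WHERE id = ?",
                    (rid,))
        con.commit()  # one commit per update, as in the original script
        done += 1
    con.close()
    return done / seconds
```

In the real test each forked process would run this loop against the same server, and the per-process counts would be tallied at the end.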
On Fri, Mar 20, 2009 at 9:55 AM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> You have a point, as do a lot of the other folks.
>
> However, keeping the KISS principle in mind, you can create a benchmark
> that simply sets up a sample database and forks off a bunch of processes
> to do random updates for an hour, say. Dead simple.
>
> In fact, it's so simple that I've already written the code and have it
> running against Postgres now. A Perl DBI script runs in a loop
> updating, and later prints out the number of transactions it completed
> in the given time frame. At the end I just tally up the numbers and I
> have the Will Rutherdale benchmark number for Postgres. It will give me
> a simple number in units of transactions per second.

Just keep in mind that a single thread updating the database is not a very realistic benchmark. A database tends not to get interesting until there are dozens to hundreds of threads running against it at the same time.
"Will Rutherdale (rutherw)" <rutherw@cisco.com> writes: > However, keeping the KISS principle in mind, you can create a benchmark > that simply sets up a sample database and forks off a bunch of processes > to do random updates for an hour, say. Dead simple. Indeed, and more than likely dead useless. The only benchmark that really counts is one's live application, which is probably not update-only and probably has a fairly non-random update pattern too. What people have been trying to point out to you is that you can certainly measure *something* with a benchmark test that has no thought behind it, but it's not clear whether the numbers you come up with will have any real-world value. regards, tom lane
On Thu, 19 Mar 2009, Dann Corbit wrote:
> Oracle won't let you publish any benchmark numbers.
> So if you find an Oracle comparison, it's "unauthorized"

You can find some useful comparisons that include Oracle if you look at the audited benchmarks from the TPC. I've collected links to a bunch at http://wiki.postgresql.org/wiki/Why_PostgreSQL_Instead_of_MySQL:_Comparing_Reliability_and_Speed_in_2007#Sun_Microsystems_2007_jAppServer2004_Benchmark_Results

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Fri, 20 Mar 2009, Will Rutherdale (rutherw) wrote:
> However, keeping the KISS principle in mind, you can create a benchmark
> that simply sets up a sample database and forks off a bunch of processes
> to do random updates for an hour, say. Dead simple.

There's a benchmark tool that does something like this that comes with PostgreSQL, named pgbench. A MySQL-oriented tool named sysbench also can do that, and it supports running against PostgreSQL as well--badly though, so it's hard to use it to do a fair comparison.

Simple benchmarks tend to measure only one thing, though, and it's often not what you think you're measuring. For example, pgbench produces a transactions-per-second number. It's useful for comparing the relative performance between two PostgreSQL instances, and people think it gives you an idea of transactional performance. What the actual magnitude of the result measures, in many cases, is instead how well the generated data set fits in cache.

If you're doing something update-heavy, a lot of the time what you actually will measure is how fast your disk can seek, process a disk commit done using fsync, or some combination of the two. If you're not careful to make sure you're using the same level of disk commit guarantee on both installations, it's real easy to get bad benchmark results here. The intro to that subject from the PostgreSQL perspective is at http://www.postgresql.org/docs/8.3/static/wal-reliability.html

On MySQL, the parameter that controls this behavior is described starting at http://dev.mysql.com/doc/refman/5.1/en/innodb-parameters.html#sysvar_innodb_flush_log_at_trx_commit

For something with lots of disk commits, it's critical that you have both systems configured identically here.

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
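For reference, the commit-durability knobs being discussed look roughly like this. The values are illustrative only; consult the documentation for the versions you actually run before copying anything:

```ini
# postgresql.conf -- disk commit guarantees (PostgreSQL 8.3 era)
fsync = on                # actually force WAL writes to disk at commit
synchronous_commit = on   # 'off' trades a small data-loss window for speed

# my.cnf -- the rough InnoDB equivalent
# innodb_flush_log_at_trx_commit = 1   # 1 = flush + fsync the log at each commit
```

A benchmark comparing Postgres with `fsync = on` against InnoDB with `innodb_flush_log_at_trx_commit = 0` or `2` is measuring two different durability guarantees, not two databases.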
http://blog.redfin.com/devblog/2007/11/elephant_versus_dolphin_which_is_faster_which_is_smarter.html

Here's another PgSQL/MySQL testimonial, with a spatial twist :)

P
Point taken. Thank you for the help.

-Will
Thanks for the references.

-Will
On Fri, Mar 20, 2009 at 9:27 AM, Scott Ribe <scott_ribe@killerbytes.com> wrote:
>> That's why I was looking for a more balanced benchmark that exercises
>> said capabilities.
>
> OK, here's the thing, I will give you *one* sample issue to think about,
> as an illustration of the kinds of differences there are.
>
> - PostgreSQL uses MVCC instead of row or page locking, which means only
> writer vs writer locks, which means in many situations less contention
> and better throughput as your concurrent load goes up.

It's also important to point out that writers don't necessarily block other writers, as long as they're operating on different ranges of the data set. You can have dozens of writers streaming data in with differing primary keys, all running together. Then of course, you have the ability to have readers block writers using FOR UPDATE, which turns a reader into a writer-to-(possibly)-be.

A lot of older dbs had locking by the page, not by the record, or, as with MyISAM, by the whole table. Page locking led to records that shared pages locking each other needlessly. Table locking leads to even longer queues forming under heavy write load. This does NOT happen in pgsql if you're updating / inserting independent records.

One of the reasons Postgres scales so well is that it keeps writes 'cheap', in that they don't have to interact with anything other than writes to the same records. If you've got millions of records and thousands being updated, then with writes not blocking writes, combined with record-level locking versus page-level locking (or worse, table-level locking), pg can handle pretty high concurrent write loads on the right hardware and still maintain good throughput on reads. And concurrent write load is what usually cripples a server.
On 26/03/2009 23:10, Scott Marlowe wrote:
> It's also important to point out that writers don't necessarily block
> other writers. As long as they're operating on different ranges of
> the data set. You can have dozens of writers streaming data in with
> differing primary keys all running together.

Do you have to do anything special to have this happen - e.g. table partitioning? Or does it just happen automagically based on the primary key?

Ray.

------------------------------------------------------------------
Raymond O'Donnell, Director of Music, Galway Cathedral, Ireland
rod@iol.ie
Galway Cathedral Recitals: http://www.galwaycathedral.org/recitals
------------------------------------------------------------------
On Thu, Mar 26, 2009 at 5:23 PM, Raymond O'Donnell <rod@iol.ie> wrote:
> On 26/03/2009 23:10, Scott Marlowe wrote:
>> It's also important to point out that writers don't necessarily block
>> other writers. As long as they're operating on different ranges of
>> the data set. You can have dozens of writers streaming data in with
>> differing primary keys all running together.
>
> Do you have to do anything special to have this happen - e.g. table
> partitioning? - Or does it just happen automagically based on the
> primary key?

No, it's pretty much automatic. Pgsql creates new records for every update or insert, so there's no weird locking on the original records to make it slow. Everything just goes in the WAL and gets flushed out to disk later. Setting up commit siblings helps on some loads. That's about it. No rocket science or triggers really needed for lots of writes at the same time.
On Thu, Mar 26, 2009 at 5:10 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> It's also important to point out that writers don't necessarily block
> other writers. As long as they're operating on different ranges of
> the data set. You can have dozens of writers streaming data in with
> differing primary keys all running together.

To be fair, some database apps have a few rows they update in a near-continuous stream, and they row lock. These databases are often better served by DB2 or some other row-locking database than by pgsql, where you may or may not have problems with bloating. There are times when picking a database that some will turn left when they should have turned right; picking pgsql for this kind of app is usually that kind of situation. InnoDB would have bloat problems too, I'd assume. MyISAM's table locking puts it out, and you're left with one of the other databases. I wonder how Firebird handles that situation.
I've had lots of experience fighting with the LOAD FROM command with an old application using an old version of Informix. When the data to be loaded is large enough, Informix (dbaccess) hits a wall. Tracking it down with tools generally shows that zillions of locks are accumulating, and the system thrashes. So far I haven't found this problem with Postgres.

Incidentally, in case people are interested in hearing the follow-up to my original message, my home-cooked benchmark with lots of processes doing tons of updates seemed initially to show MyISAM winning over Postgres.

However, I spent some time reading what I could from the manual on performance improvement and realized that commits might be an issue. I moved a commit() call out of the main loop, and bang! Postgres showed a substantial performance advantage over MyISAM.

-Will
On Fri, Mar 27, 2009 at 10:32 AM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> Incidentally, in case people are interested in hearing the follow-up to
> my original message, my home-cooked benchmark with lots of processes
> doing tons of updates seemed to initially show MyISAM winning over
> Postgres.
>
> However I spent some time reading what I could from the manual on
> performance improvement and realized that commits might be an issue. I
> moved a commit() call out of the main loop, and bang! Postgres showed a
> substantial performance advantage over MyISAM.

This is one of those areas where postgresql acts very differently from other dbs. In lots of other dbs, big transactions are the performance killer. In PostgreSQL, big transactions are the way to get better performance.
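The "move the commit out of the loop" effect is easy to see in miniature. Below is a hedged sketch in Python, using the stdlib sqlite3 module as a stand-in for a real server; the table name is made up. On a disk-backed database every commit forces a log flush, so the per-row variant is dramatically slower there (with an in-memory database the timing gap is muted, which is why no timing is asserted):

```python
import sqlite3
import time

def insert_rows(con, n, batch_commit):
    """Insert n rows, committing either per row or once at the end."""
    for i in range(n):
        con.execute("INSERT INTO t (v) VALUES (?)", (i,))
        if not batch_commit:
            con.commit()  # a commit (and, on real storage, a flush) per row
    if batch_commit:
        con.commit()      # one commit for the whole batch

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (v INTEGER)")

start = time.perf_counter()
insert_rows(con, 1000, batch_commit=False)
per_row_secs = time.perf_counter() - start

start = time.perf_counter()
insert_rows(con, 1000, batch_commit=True)
batched_secs = time.perf_counter() - start
```

Will's fix in the thread was exactly the `batch_commit=True` shape: one commit outside the main loop instead of one per update.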
I see. Perhaps earlier I alienated some people by talking about transactions per second, which is not necessarily an appropriate unit.

Thanks for the tip.

-Will
On Fri, Mar 27, 2009 at 2:27 PM, Will Rutherdale (rutherw) <rutherw@cisco.com> wrote:
> I see. Perhaps earlier I alienated some people by talking about
> transactions per second, which is not necessarily an appropriate unit.

Hard to say; tps is still important. A lot of loads need to change one thing in a transaction and change it fairly fast. A lot of other loads update / insert 10,000 rows, and tps doesn't mean as much as rows inserted per second. I doubt anybody took it personally, though. More likely they just got confused over what you and they were measuring.

Storytime! One of my best friends came from an MSSQL and MySQL shop, and when I introduced him to pgsql, he was bitching about how slow his file full of inserts was inserting. This was in the pg 7.0 days, when pg was not super fast, but quite fast enough if you knew a few basic tricks of the trade. I looked at his load file, which was basically one insert after another, 10k of them. I pushed him aside, typed in begin; on top and commit; on the bottom, and told him to run it now. It finished in about 10 seconds. It had taken > 10 minutes before that.

He's now my boss at another company, and a pretty hard-core pgsql fan.
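The fix in that story ("begin; on top and commit; on the bottom") is mechanical enough to script. A sketch, with made-up file and table names; the psql invocation at the end is the obvious way to load the result:

```shell
# A tiny stand-in for the 10k-line file of bare INSERT statements
cat > inserts.sql <<'EOF'
INSERT INTO t VALUES (1);
INSERT INTO t VALUES (2);
EOF

# Wrap the whole file in a single transaction
{ printf 'BEGIN;\n'; cat inserts.sql; printf 'COMMIT;\n'; } > batched.sql

# Load it in one shot, e.g.:  psql -d mydb -f batched.sql
```

With the wrapper, the server does one commit for the whole file instead of one per statement.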