Discussion: Performance problems???
I have observed some disturbing behavior with the latest (7.1.3) version of PostgreSQL.

In an application that I am working on with a team of folks, there is a specific need to execute a series of SQL statements similar to those used in the 'loaddb.pl' script included below. Without getting into better ways to increment rowid's (this code is part of another tool that we are using), I'd like to know why I get the following results with PostgreSQL and MySQL.

In 3 separate runs I get the following PostgreSQL results:

o 1 - 2000 records inserted in 12 seconds.
o 2001 - 4000 records inserted in 16 seconds.
o 4001 - 6000 records inserted in 20 seconds.

You see, there is a clear performance degradation here that is associated with the number of records in the database. It appears that the main culprit is the update statement that is issued (see 'loaddb.pl' script below). This performance behavior is not expected, especially with so few rows in such a small table.

In 3 separate runs I get the following MySQL results:

o 1 - 2000 records inserted in 6 seconds.
o 2001 - 4000 records inserted in 5 seconds.
o 4001 - 6000 records inserted in 6 seconds.

You see, MySQL performs as expected. There is no performance degradation here that is related to the number of records in the database tables.

I have been a huge fan and advocate of PostgreSQL. I was stunned to see this behavior. I am hoping that it is either a bug that has been fixed, or that I can alter my PostgreSQL configuration to eliminate this behavior.

I have an urgent need to resolve this situation. If I cannot solve the problem soon, I will be forced to drop PostgreSQL in favor of MySQL. This is not something that I wish to do.

Please help. Thanks in advance.

- Jim

########################################################################
#!/usr/bin/perl -w
#
## setupdb.pl
#
## Simple perl script that creates the 'problemtest' db.
#
## Usage: ./setupdb.pl <db-type>
#
## Assumes that the 'problemtest' PostgreSQL and MySQL databases exist
## and that there is a user 'problemtest' with proper privileges.
#
########################################################################
use strict;
use DBI;

my $dbd;
if (@ARGV) {
    if (uc($ARGV[0]) eq 'POSTGRESQL') {
        $dbd = 'Pg';
    } elsif (uc($ARGV[0]) eq 'MYSQL') {
        $dbd = 'mysql';
    } else {
        &DoUsage();
    }
} else {
    &DoUsage();
}

my $dsn = "DBI:$dbd:dbname=problemtest";
my $usr = 'problemtest';
my $pwd = 'problemtest';
my $dbh = DBI->connect($dsn, $usr, $pwd,
                       { AutoCommit => 1, RaiseError => 1 });

$dbh->do(<<END);
drop table foo
END

$dbh->do(<<END);
drop table control
END

$dbh->do(<<END);
create table foo (
    id integer not null,
    primary key (id),
    name varchar(100))
END

$dbh->do(<<END);
create table control (
    next_id integer not null)
END

$dbh->do(<<END);
insert into control (next_id) values(1)
END

$dbh->disconnect();

sub DoUsage {
    print "\n\tUsage: ./setupdb.pl <db-type>\n";
    print "\tWhere db-type is 'PostgreSQL' or 'MySQL'\n\n";
    exit 0;
}

########################################################################
#!/usr/bin/perl -w
#
## loaddb.pl
#
## Simple perl script to illustrate the performance degradation
## of the update statement with PostgreSQL as compared to MySQL.
#
## Usage: ./loaddb.pl <db-type> <range-start> <range-end>
#
########################################################################
use strict;
use DBI;

my $dbd;
if (@ARGV == 3) {
    if (uc($ARGV[0]) eq 'POSTGRESQL') {
        $dbd = 'Pg';
    } elsif (uc($ARGV[0]) eq 'MYSQL') {
        $dbd = 'mysql';
    } else {
        &DoUsage();
    }
} else {
    &DoUsage();
}

my $dsn = "DBI:$dbd:dbname=problemtest";
my $usr = 'problemtest';
my $pwd = 'problemtest';
my $dbh = DBI->connect($dsn, $usr, $pwd,
                       { AutoCommit => 1, RaiseError => 1 });

my $inc_id = $dbh->prepare("update control set next_id = next_id + 1");
my $get_id = $dbh->prepare("select next_id from control");
my $insert = $dbh->prepare("insert into foo (id,name) values(?,?)");

my $start = time;
foreach ($ARGV[1] .. $ARGV[2]) {
    $inc_id->execute();
    $get_id->execute();
    my $id = $get_id->fetchrow_array();
    $insert->execute($id, "name$id");
}
my $duration = time - $start;
print "duration = $duration\n";

$inc_id->finish();
$get_id->finish();
$insert->finish();
$dbh->disconnect();

sub DoUsage {
    print "\n\tUsage: ./loaddb.pl <db-type> <range-start> <range-end>\n";
    print "\tWhere db-type is 'PostgreSQL' or 'MySQL'\n\n";
    exit 0;
}
James Patterson <jpatterson@amsite.com> writes:

> I have observed some disturbing behavior with the latest (7.1.3) version of PostgreSQL.
>
> In an application that I am working on with a team of folks, there is a specific need to execute a series of SQL statements similar to those used in the 'loaddb.pl' script included below. Without getting into better ways to increment rowid's (this code is part of another tool that we are using), I'd like to know why I get the following results with PostgreSQL and MySQL.
>
> In 3 separate runs I get the following PostgreSQL results:
>
> o 1 - 2000 records inserted in 12 seconds.
> o 2001 - 4000 records inserted in 16 seconds.
> o 4001 - 6000 records inserted in 20 seconds.
>
> You see, there is a clear performance degradation here that is associated with the number of records in the database. It appears that the main culprit is the update statement that is issued (see 'loaddb.pl' script below). This performance behavior is not expected, especially with so few rows in such a small table.

One thing you should definitely do is wrap the entire load loop ((update/select/insert) * N) in a transaction. This will give you a huge speedup. Otherwise you are forcing a disk sync after every SQL statement. You may still see some degradation as the table size grows, but actual times should be more comparable to MySQL.

> In 3 separate runs I get the following MySQL results:
>
> o 1 - 2000 records inserted in 6 seconds.
> o 2001 - 4000 records inserted in 5 seconds.
> o 4001 - 6000 records inserted in 6 seconds.
>
> You see, MySQL performs as expected. There is no performance degradation here that is related to the number of records in the database tables.
>
> I have been a huge fan and advocate of PostgreSQL. I was stunned to see this behavior. I am hoping that it is either a bug that has been fixed, or that I can alter my PostgreSQL configuration to eliminate this behavior.
> I have an urgent need to resolve this situation. If I cannot solve the problem soon, I will be forced to drop PostgreSQL in favor of MySQL. This is not something that I wish to do.

I think the main problem, or one of them, is that you're not using the proper mechanism for generating sequential numbers. If you used a real SEQUENCE instead of a one-row table, you wouldn't get the MVCC penalty from updating that table thousands of times, which I think is part of your problem.

I understand your issue with not wanting to change existing code, but the fact is that a sequence is the right way to do this in PostgreSQL. Because of MVCC, updating a one-row table as you're doing requires a new copy of the row to be created on each update, which slows things down until VACUUM is run.

Try using a sequence along with wrapping everything in a transaction (turn off autocommit and use BEGIN and COMMIT), and I think you'll be pleasantly surprised.

-Doug

--
Let us cross over the river, and rest under the shade of the trees.
   --T. J. Jackson, 1863
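Doug's two suggestions can be sketched as a rewrite of the loaddb.pl loop. This is only a sketch, assuming the same 'problemtest' database and user from setupdb.pl; the sequence name foo_id_seq is illustrative and would need to be created first with CREATE SEQUENCE foo_id_seq.

```perl
#!/usr/bin/perl -w
# Sketch: loaddb.pl's loop rewritten to use a sequence and a single
# transaction. Assumes a PostgreSQL database 'problemtest' and a
# sequence created with:  create sequence foo_id_seq;
use strict;
use DBI;

# AutoCommit => 0 makes DBI run everything in one transaction,
# so there is one disk sync at commit instead of one per statement.
my $dbh = DBI->connect('DBI:Pg:dbname=problemtest',
                       'problemtest', 'problemtest',
                       { AutoCommit => 0, RaiseError => 1 });

# nextval() atomically increments and returns the next id, replacing
# the update/select pair against the one-row control table.
my $get_id = $dbh->prepare("select nextval('foo_id_seq')");
my $insert = $dbh->prepare("insert into foo (id,name) values(?,?)");

foreach (1 .. 2000) {
    $get_id->execute();
    my ($id) = $get_id->fetchrow_array();
    $insert->execute($id, "name$id");
}
$dbh->commit();    # one sync for the whole batch
$dbh->disconnect();
```

With AutoCommit off, the DBD::Pg driver issues the BEGIN implicitly, so no explicit BEGIN statement is needed; only the final commit() is.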
James Patterson <jpatterson@amsite.com> writes:

> I have observed some disturbing behavior with the latest (7.1.3) version of PostgreSQL.

Try vacuuming the "control" table every so often --- you're accumulating huge numbers of dead rows in it. Even better, replace "control" with a sequence object.

regards, tom lane
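If the one-row control table cannot be changed, the periodic-vacuum suggestion above could look roughly like this inside the loaddb.pl loop. This is a sketch under the same setupdb.pl schema; the interval of 1000 is an arbitrary choice, and note that VACUUM cannot run inside a transaction block, so this variant keeps autocommit on.

```perl
#!/usr/bin/perl -w
# Sketch: loaddb.pl's loop with a periodic VACUUM of the control
# table so dead versions of its single row get reclaimed as we go.
# Requires AutoCommit on: VACUUM cannot run inside a transaction.
use strict;
use DBI;

my $dbh = DBI->connect('DBI:Pg:dbname=problemtest',
                       'problemtest', 'problemtest',
                       { AutoCommit => 1, RaiseError => 1 });

my $inc_id = $dbh->prepare("update control set next_id = next_id + 1");
my $get_id = $dbh->prepare("select next_id from control");
my $insert = $dbh->prepare("insert into foo (id,name) values(?,?)");

my $n = 0;
foreach (1 .. 6000) {
    $inc_id->execute();
    $get_id->execute();
    my ($id) = $get_id->fetchrow_array();
    $insert->execute($id, "name$id");
    # Every 1000 rows (interval is arbitrary), reclaim the dead
    # row versions accumulated in the one-row control table.
    $dbh->do("vacuum control") if (++$n % 1000 == 0);
}
$dbh->disconnect();
```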
James Patterson wrote:

> I have observed some disturbing behavior with the latest (7.1.3) version of PostgreSQL.
>
> In an application that I am working on with a team of folks, there is a specific need to execute a series of SQL statements similar to those used in the 'loaddb.pl' script included below. Without getting into better ways to increment rowid's (this code is part of another tool that we are using), I'd like to know why I get the following results with PostgreSQL and MySQL.
>
> In 3 separate runs I get the following PostgreSQL results:
>
> o 1 - 2000 records inserted in 12 seconds.
> o 2001 - 4000 records inserted in 16 seconds.
> o 4001 - 6000 records inserted in 20 seconds.
>
> You see, there is a clear performance degradation here that is associated with the number of records in the database. It appears that the main culprit is the update statement that is issued (see 'loaddb.pl' script below). This performance behavior is not expected, especially with so few rows in such a small table.
>
> In 3 separate runs I get the following MySQL results:
>
> o 1 - 2000 records inserted in 6 seconds.
> o 2001 - 4000 records inserted in 5 seconds.
> o 4001 - 6000 records inserted in 6 seconds.
>
> You see, MySQL performs as expected. There is no performance degradation here that is related to the number of records in the database tables.
>
> I have been a huge fan and advocate of PostgreSQL. I was stunned to see this behavior. I am hoping that it is either a bug that has been fixed, or that I can alter my PostgreSQL configuration to eliminate this behavior.
>
> I have an urgent need to resolve this situation. If I cannot solve the problem soon, I will be forced to drop PostgreSQL in favor of MySQL. This is not something that I wish to do.

You really should use a sequence. You will most likely need to change the way you create sequence numbers even for MySQL, as the following is not safe on a non-transactional DB.
> my $inc_id = $dbh->prepare("update control set next_id = next_id + 1");
> my $get_id = $dbh->prepare("select next_id from control");

If two backends happen to interleave the statements these execute:

1> update control set next_id = next_id + 1
2> update control set next_id = next_id + 1
1> select next_id from control
2> select next_id from control

then both will get the same next_id, which is probably not what you want.

-------------
Hannu
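The interleaving above goes away with a sequence, since nextval() never hands the same value to two backends. A minimal sketch of Hannu's two-backend scenario, assuming the 'problemtest' database and an illustrative sequence foo_id_seq:

```perl
#!/usr/bin/perl -w
# Sketch: two connections standing in for two concurrent backends.
# nextval() is atomic, so each call returns a distinct value even
# when the calls interleave. (Sequence name foo_id_seq is illustrative.)
use strict;
use DBI;

my @dbh = map {
    DBI->connect('DBI:Pg:dbname=problemtest',
                 'problemtest', 'problemtest',
                 { AutoCommit => 1, RaiseError => 1 })
} (1, 2);

my ($id1) = $dbh[0]->selectrow_array("select nextval('foo_id_seq')");
my ($id2) = $dbh[1]->selectrow_array("select nextval('foo_id_seq')");
print "backend 1 got $id1, backend 2 got $id2\n";  # distinct values

$_->disconnect() for @dbh;
```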