Thread: Re: MySQL is faster than PgSQL by a large margin in my program... any ideas why?
What version of postgres?
Copy has been substantially improved in bizgres and also in 8.1.
- Luke
--------------------------
Sent from my BlackBerry Wireless Device
-----Original Message-----
From: pgsql-performance-owner@postgresql.org <pgsql-performance-owner@postgresql.org>
To: pgsql-performance@postgresql.org <pgsql-performance@postgresql.org>
Sent: Wed Dec 21 21:03:18 2005
Subject: [PERFORM] MySQL is faster than PgSQL by a large margin in my program... any ideas why?
Hi all,
On a user's request, I recently added MySQL support to my backup
program which had been written for PostgreSQL exclusively until now.
What surprises me is that MySQL is about 20%(ish) faster than PostgreSQL.
Now, I love PostgreSQL and I want to continue recommending it as the
database engine of choice but it is hard to ignore a performance
difference like that.
My program is a perl backup app that scans the content of a given
mounted partition, 'stat's each file and then stores that data in the
database. To maintain certain data (the backup, restore and display
values for each file) I first read in all the data from a given table
(one table per partition) into a hash, drop and re-create the table,
then start (in PostgreSQL) a bulk 'COPY..' call through the 'psql' shell
app.
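Madison's Perl code isn't shown, but the COPY path he describes boils down to emitting tab-delimited rows that `psql` pipes into `COPY ... FROM STDIN`. A minimal sketch in Python (the function names are made up for illustration; the escaping rules are those of PostgreSQL's COPY text format, where backslash, tab, newline, and carriage return must be backslash-escaped and NULL is spelled `\N`):

```python
def copy_escape(value):
    """Escape one field for PostgreSQL's COPY text format.
    NULL is spelled \\N; backslash, tab, newline and carriage
    return inside a field are backslash-escaped."""
    if value is None:
        return r"\N"
    return (str(value)
            .replace("\\", "\\\\")
            .replace("\t", "\\t")
            .replace("\n", "\\n")
            .replace("\r", "\\r"))

def copy_row(values):
    """One tab-delimited line ready to feed to 'COPY ... FROM STDIN'."""
    return "\t".join(copy_escape(v) for v in values) + "\n"
```

Because COPY parses this format directly, per-row quoting overhead is much lower than building and parsing one enormous SQL statement.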
In MySQL there is no 'COPY...' equivalent so instead I generate a
large 'INSERT INTO file_info_X (col1, col2, ... coln) VALUES (...),
(blah) ... (blah);'. This doesn't support automatic quoting, obviously,
so I manually quote my values before adding the value to the INSERT
statement. I suspect this might be part of the performance difference?
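The multi-row INSERT with manual quoting that the post describes can be sketched as follows (Python rather than the original Perl; the table and column names are hypothetical, and real code would leave numeric columns unquoted and prefer the driver's own quoting over hand-rolled escaping):

```python
def sql_quote(value):
    """Quote one value for a MySQL string literal: NULL stays
    unquoted; backslashes and single quotes are escaped, since
    MySQL treats both specially inside literals by default."""
    if value is None:
        return "NULL"
    s = str(value).replace("\\", "\\\\").replace("'", "''")
    return "'" + s + "'"

def multi_row_insert(table, columns, rows):
    """Build one large multi-row INSERT statement."""
    cols = ", ".join(columns)
    tuples = ", ".join(
        "(" + ", ".join(sql_quote(v) for v in row) + ")" for row in rows)
    return f"INSERT INTO {table} ({cols}) VALUES {tuples};"
```

Building one giant statement this way means the server must parse megabytes of SQL, which is part of why it behaves differently from a COPY stream.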
I take the total time needed to update a partition (load old data
into hash + scan all files and prepare COPY/INSERT + commit new data)
and divide by the number of seconds needed to get a score I call a
'U.Rate'. On average on my Pentium3 1GHz laptop I get a U.Rate of ~400-500.
On MySQL though I usually get a U.Rate of ~700-800.
If the performance difference comes from the 'COPY...' command being
slower because of the automatic quoting, can I somehow tell PostgreSQL
that the data is pre-quoted? Could the performance difference be
something else?
If it would help I can provide code samples. I haven't done so yet
because it's a little convoluted. ^_^;
Thanks as always!
Madison
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Madison Kelly (Digimer)
TLE-BU; The Linux Experience, Back Up
Main Project Page: http://tle-bu.org
Community Forum: http://forum.tle-bu.org
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Luke Lonergan wrote:
> What version of postgres?
>
> Copy has been substantially improved in bizgres and also in 8.1.
> - Luke
Currently 7.4 (what comes with Debian Sarge). I have run my program on
8.0 but not since I have added MySQL support. I should run the tests on
the newer versions of both DBs (using v4.1 for MySQL which is also
mature at this point).
As others mentioned though, so far the most likely explanation is the
'fsync' being enabled on PostgreSQL.
Thanks for the reply!
Madison
Madison,

On 12/21/05 11:02 PM, "Madison Kelly" <linux@alteeve.com> wrote:
> Currently 7.4 (what comes with Debian Sarge). I have run my program on
> 8.0 but not since I have added MySQL support. I should run the tests on
> the newer versions of both DBs (using v4.1 for MySQL which is also
> mature at this point).

Yes, this is *definitely* your problem. Upgrade to Postgres 8.1.1 or
Bizgres 0_8_1 and your COPY speed could double without even changing
fsync (depending on your disk speed).

We typically get 12-14MB/s from Bizgres on Opteron CPUs and disk
subsystems that can write at least 60MB/s. This means you can load
100GB in 2 hours.

Note that indexes will also slow down loading.

- Luke
Hi, Madison,
Hi, Luke,

Luke Lonergan wrote:
> Note that indexes will also slow down loading.

For large load batches, it often makes sense to temporarily drop the
indices before the load and recreate them afterwards, at least if you
don't have normal users accessing the database concurrently.

Markus

--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf.     | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org
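The drop-load-recreate sequence Markus suggests can be sketched as an ordered list of statements (Python; the table, index name, and exact SQL below are hypothetical, and in practice they would be executed through psql or a database driver):

```python
def bulk_load_plan(indexes, load_stmt):
    """Statements for a big load, in order: drop every index,
    run the bulk load, then recreate each index in one pass.
    'indexes' maps index name -> its CREATE INDEX statement."""
    plan = [f"DROP INDEX {name};" for name in indexes]
    plan.append(load_stmt)
    plan.extend(indexes.values())
    return plan
```

Rebuilding an index once over the full table is generally far cheaper than maintaining it incrementally for every inserted row.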
Agreed. I have a 13 million row table that gets 100,000 new records every
week. There are six indexes on this table. Right about the time it reached
the 10 million row mark, updating the table with new records started to take
many hours if I left the indexes in place during the update. Indeed, there
was even some suspicion that the indexes were getting corrupted during the
load. So I decided to first drop the indexes when I need to update the
table. Now inserting 100,000 records into the table is nearly instantaneous,
although it does take me a couple of hours to rebuild the indexes. This is
still a big improvement, since at one time it was taking almost 12 hours to
update the table with the indexes in place.

Juan

On Thursday 22 December 2005 08:34, Markus Schaber wrote:
> Hi, Madison,
> Hi, Luke,
>
> Luke Lonergan wrote:
> > Note that indexes will also slow down loading.
>
> For large loading bunches, it often makes sense to temporarily drop the
> indices before the load, and recreate them afterwards, at least, if you
> don't have normal users accessing the database concurrently.
>
> Markus
On Dec 22, 2005, at 9:44 PM, Juan Casero wrote:
> Agreed. I have a 13 million row table that gets a 100,000 new records every
> week. There are six indexes on this table. Right about the time when it

I have some rather large tables that grow much faster than this (~1 million
rows per day on a table with > 200m rows) and a few indexes, and I don't see
any such slowness. Do you really need all those indexes?