Thread: MySQL is faster than PgSQL but a large margin in my program... any ideas why?

MySQL is faster than PgSQL but a large margin in my program... any ideas why?

From: Madison Kelly
Date:
Hi all,

   On a user's request, I recently added MySQL support to my backup
program which had been written for PostgreSQL exclusively until now.
What surprises me is that MySQL is about 20%(ish) faster than PostgreSQL.

   Now, I love PostgreSQL and I want to continue recommending it as the
database engine of choice but it is hard to ignore a performance
difference like that.

   My program is a perl backup app that scans the content of a given
mounted partition, 'stat's each file and then stores that data in the
database. To maintain certain data (the backup, restore and display
values for each file) I first read in all the data from a given table
(one table per partition) into a hash, drop and re-create the table,
then start (in PostgreSQL) a bulk 'COPY..' call through the 'psql' shell
app.
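
For readers following along, here is a minimal sketch of what preparing rows for a bulk COPY can look like. It is written in Python rather than the Perl the program uses, and it is an illustration under assumptions, not the program's actual code: the helper names (`copy_escape`, `copy_lines`, `build_copy_payload`) and the sample rows are hypothetical. It relies on COPY's default text format, where columns are tab-delimited, NULL is written as `\N`, and backslash, tab, newline and carriage return inside a value are backslash-escaped.

```python
import io

# Characters that must be backslash-escaped in COPY's default text format.
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}


def copy_escape(value):
    """Render one value for COPY text format (hypothetical helper)."""
    if value is None:
        return "\\N"  # default NULL marker in COPY text format
    return "".join(_ESCAPES.get(ch, ch) for ch in str(value))


def copy_lines(rows):
    """Yield one tab-delimited, newline-terminated line per row."""
    for row in rows:
        yield "\t".join(copy_escape(v) for v in row) + "\n"


def build_copy_payload(table, columns, rows):
    """Build text that could be piped to psql: COPY header, data lines, terminator."""
    out = io.StringIO()
    out.write("COPY %s (%s) FROM stdin;\n" % (table, ", ".join(columns)))
    out.writelines(copy_lines(rows))
    out.write("\\.\n")  # end-of-data marker
    return out.getvalue()


if __name__ == "__main__":
    # Sample data only; the first file name contains a literal tab.
    sample = [("tab\tname.txt", "/home/user", 1024), ("plain.txt", None, 0)]
    columns = ["file_name", "file_parent_dir", "file_size"]
    print(build_copy_payload("file_info_1", columns, sample), end="")
```

The escaping is a single pass over each value, so it is unlikely to dominate the load time compared with disk I/O on the server side.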

   In MySQL there is no 'COPY...' equivalent so instead I generate a
large 'INSERT INTO file_info_X (col1, col2, ... coln) VALUES (...),
(blah) ... (blah);'. This doesn't support automatic quoting, obviously,
so I manually quote my values before adding the value to the INSERT
statement. I suspect this might be part of the performance difference?
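
As an illustration of the multi-row INSERT approach described above, here is a small Python sketch (the original program is Perl, so this is not its code). The function names `mysql_quote` and `build_multirow_insert` are hypothetical, and the escaping assumes MySQL's default SQL mode, where backslash escapes are enabled, and a character set in which these single-byte escapes are safe. Where a driver offers placeholders, letting it do the quoting is generally the safer choice.

```python
def mysql_quote(value):
    """Quote one value as a MySQL literal (hypothetical helper, default sql_mode assumed)."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    # Escape the backslash first so the escapes added afterwards are not doubled.
    for raw, escaped in (
        ("\\", "\\\\"),
        ("'", "\\'"),
        ("\0", "\\0"),
        ("\n", "\\n"),
        ("\r", "\\r"),
        ("\x1a", "\\Z"),
    ):
        text = text.replace(raw, escaped)
    return "'" + text + "'"


def build_multirow_insert(table, columns, rows):
    """Build one INSERT statement carrying every row in a single VALUES list."""
    if not rows:
        raise ValueError("at least one row is required")
    tuples = ("(" + ", ".join(mysql_quote(v) for v in row) + ")" for row in rows)
    return "INSERT INTO %s (%s) VALUES %s;" % (
        table,
        ", ".join(columns),
        ", ".join(tuples),
    )


if __name__ == "__main__":
    # Sample data only.
    sample = [("it's.txt", "/home/user", 10), ("b.txt", None, 0)]
    columns = ["file_name", "file_parent_dir", "file_size"]
    print(build_multirow_insert("file_info_1", columns, sample))
```

One statement with many value tuples saves a round trip per row, which is one plausible reason this path performs well.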

   I take the total time needed to update a partition (load old data
into hash + scan all files and prepare COPY/INSERT + commit new data)
and divide by the number of seconds needed to get a score I call a
'U.Rate'. On average on my Pentium3 1GHz laptop I get a U.Rate of ~400-500
with PostgreSQL. On MySQL though I usually get a U.Rate of ~700-800.

   If the performance difference comes from the 'COPY...' command being
slower because of the automatic quoting can I somehow tell PostgreSQL
that the data is pre-quoted? Could the performance difference be
something else?

   If it would help I can provide code samples. I haven't done so yet
because it's a little convoluted. ^_^;

   Thanks as always!

Madison


Where the big performance concern is when

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
           Madison Kelly (Digimer)
    TLE-BU; The Linux Experience, Back Up
Main Project Page:  http://tle-bu.org
Community Forum:    http://forum.tle-bu.org
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Re: MySQL is faster than PgSQL but a large margin in my program... any ideas why?

From: Stephen Frost
Date:
* Madison Kelly (linux@alteeve.com) wrote:
>   If the performance difference comes from the 'COPY...' command being
> slower because of the automatic quoting can I somehow tell PostgreSQL
> that the data is pre-quoted? Could the performance difference be
> something else?

I doubt the issue is with the COPY command being slower than INSERTs
(I'd expect the opposite generally, actually...).  What's the table type
of the MySQL tables?  Is it MyISAM or InnoDB (I think those are the main
alternatives)?  IIRC, MyISAM doesn't do ACID and isn't transaction safe,
and has problems with data reliability (aiui, equivalent to doing 'fsync
= false' for Postgres).  InnoDB, again iirc, is transaction safe and
whatnot, and more akin to the default PostgreSQL setup.

I expect some others will comment along these lines too, if my response
isn't entirely clear. :)

    Stephen

Re: MySQL is faster than PgSQL but a large margin in my program... any ideas why?

From: Kevin Brown
Date:
On Wednesday 21 December 2005 20:14, Stephen Frost wrote:
> * Madison Kelly (linux@alteeve.com) wrote:
> >   If the performance difference comes from the 'COPY...' command being
> > slower because of the automatic quoting can I somehow tell PostgreSQL
> > that the data is pre-quoted? Could the performance difference be
> > something else?
>
> I doubt the issue is with the COPY command being slower than INSERTs
> (I'd expect the opposite generally, actually...).  What's the table type
> of the MySQL tables?  Is it MyISAM or InnoDB (I think those are the main
> alternatives)?  IIRC, MyISAM doesn't do ACID and isn't transaction safe,
> and has problems with data reliability (aiui, equivalent to doing 'fsync
> = false' for Postgres).  InnoDB, again iirc, is transaction safe and
> whatnot, and more akin to the default PostgreSQL setup.
>
> I expect some others will comment along these lines too, if my response
> isn't entirely clear. :)

Is fsync on in your postgres config?  If so, that's likely why you're slower.
The default is to have it on for safety (writes are forced to disk).  It is
quite a bit slower than just allowing the write caches to do their job, but
your data survives a crash.  MySQL with MyISAM tables does not force writes
to disk.
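
To get a feel for what forcing writes to disk costs in general, here is a rough Python sketch that writes the same records with and without a sync after each one. It is only an illustration of the operating-system call involved: the function name `timed_write` and the record sizes are made up, and PostgreSQL itself syncs its write-ahead log at commit rather than after every row, so the numbers will not map directly onto COPY timings.

```python
import os
import tempfile
import time


def timed_write(records, sync_each):
    """Write records to a temp file, optionally forcing each to disk; return elapsed seconds."""
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as fh:
            for rec in records:
                fh.write(rec)
                if sync_each:
                    fh.flush()             # hand Python's buffer to the OS
                    os.fsync(fh.fileno())  # ask the OS to push it to the device
        return time.perf_counter() - start
    finally:
        os.remove(path)  # clean up the scratch file


if __name__ == "__main__":
    # Arbitrary sample workload: 200 records of about 100 bytes each.
    records = [b"x" * 100 + b"\n"] * 200
    print("cached:  %.4fs" % timed_write(records, sync_each=False))
    print("fsynced: %.4fs" % timed_write(records, sync_each=True))
```

The gap between the two timings depends heavily on the disk and on whether the drive's own write cache is enabled.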


Re: MySQL is faster than PgSQL but a large margin in my

From: Madison Kelly
Date:
Stephen Frost wrote:
> * Madison Kelly (linux@alteeve.com) wrote:
>
>>  If the performance difference comes from the 'COPY...' command being
>> slower because of the automatic quoting can I somehow tell PostgreSQL
>> that the data is pre-quoted? Could the performance difference be
>> something else?
>
>
> I doubt the issue is with the COPY command being slower than INSERTs
> (I'd expect the opposite generally, actually...).  What's the table type
> of the MySQL tables?  Is it MyISAM or InnoDB (I think those are the main
> alternatives)?  IIRC, MyISAM doesn't do ACID and isn't transaction safe,
> and has problems with data reliability (aiui, equivalent to doing 'fsync
> = false' for Postgres).  InnoDB, again iirc, is transaction safe and
> whatnot, and more akin to the default PostgreSQL setup.
>
> I expect some others will comment along these lines too, if my response
> isn't entirely clear. :)
>
>     Stephen

Ah, that makes a lot of sense (I read about the 'fsync' issue before,
now that you mention it). I am not too familiar with MySQL but IIRC
MyISAM is their open-source DB and InnoDB is their commercial one, ne?
If so, then I am running MyISAM.

   Here is the MySQL table. The main difference from the PostgreSQL
table is that the 'varchar(255)' columns are 'text' columns in PostgreSQL.

mysql> DESCRIBE file_info_1;
+-----------------+--------------+------+-----+---------+-------+
| Field           | Type         | Null | Key | Default | Extra |
+-----------------+--------------+------+-----+---------+-------+
| file_group_name | varchar(255) | YES  |     | NULL    |       |
| file_group_uid  | int(11)      |      |     | 0       |       |
| file_mod_time   | bigint(20)   |      |     | 0       |       |
| file_name       | varchar(255) |      |     |         |       |
| file_parent_dir | varchar(255) |      | MUL |         |       |
| file_perm       | int(11)      |      |     | 0       |       |
| file_size       | bigint(20)   |      |     | 0       |       |
| file_type       | char(1)      |      |     |         |       |
| file_user_name  | varchar(255) | YES  |     | NULL    |       |
| file_user_uid   | int(11)      |      |     | 0       |       |
| file_backup     | char(1)      |      | MUL | i       |       |
| file_display    | char(1)      |      |     | i       |       |
| file_restore    | char(1)      |      |     | i       |       |
+-----------------+--------------+------+-----+---------+-------+

   I will try turning off 'fsync' on my test box to see how much of a
performance gain I get and to see if it is close to what I am getting
out of MySQL. If that does turn out to be the case though I will be able
to comfortably continue recommending PostgreSQL from a stability point
of view.

Thanks!!

Madison


Re: MySQL is faster than PgSQL but a large margin in

From: Luke Lonergan
Date:
Madison,

On 12/21/05 10:58 PM, "Madison Kelly" <linux@alteeve.com> wrote:

> Ah, that makes a lot of sense (I read about the 'fsync' issue before,
> now that you mention it). I am not too familiar with MySQL but IIRC
> MyISAM is their open-source DB and InnoDB is their commercial one, ne?
> If so, then I am running MyISAM.

You can run either storage engine with MySQL; I expect the default is
MyISAM.

COPY speed, with or without fsync, was recently nearly doubled in
PostgreSQL.  The Bizgres version (www.bizgres.org, www.greenplum.com) is the
fastest; Postgres 8.1.1 is close, depending on how fast your disk I/O is (as
I/O speed increases, Bizgres gets faster).

fsync isn't really an "issue" and I'd suggest you not run without it! We've
found that setting wal_sync_method to "fdatasync" is actually a bit faster
than "fsync" if you want a bit better speed.

So, I'd recommend you upgrade to either bizgres or Postgres 8.1.1 to get the
maximum COPY speed.

>    Here is the MySQL table. The main difference from the PostgreSQL
> table is that the 'varchar(255)' columns are 'text' columns in PostgreSQL.

Shouldn't matter.

> mysql> DESCRIBE file_info_1;
> +-----------------+--------------+------+-----+---------+-------+
> | Field           | Type         | Null | Key | Default | Extra |
> +-----------------+--------------+------+-----+---------+-------+
> | file_group_name | varchar(255) | YES  |     | NULL    |       |
> | file_group_uid  | int(11)      |      |     | 0       |       |
> | file_mod_time   | bigint(20)   |      |     | 0       |       |
> | file_name       | varchar(255) |      |     |         |       |
> | file_parent_dir | varchar(255) |      | MUL |         |       |
> | file_perm       | int(11)      |      |     | 0       |       |
> | file_size       | bigint(20)   |      |     | 0       |       |
> | file_type       | char(1)      |      |     |         |       |
> | file_user_name  | varchar(255) | YES  |     | NULL    |       |
> | file_user_uid   | int(11)      |      |     | 0       |       |
> | file_backup     | char(1)      |      | MUL | i       |       |
> | file_display    | char(1)      |      |     | i       |       |
> | file_restore    | char(1)      |      |     | i       |       |
> +-----------------+--------------+------+-----+---------+-------+

What's a bigint(20)?  Are you using "numeric" in Postgresql?

>    I will try turning off 'fsync' on my test box to see how much of a
> performance gain I get and to see if it is close to what I am getting
> out of MySQL. If that does turn out to be the case though I will be able
> to comfortably continue recommending PostgreSQL from a stability point
> of view.

Again - fsync is a small part of the performance - you will need to run
either Postgres 8.1.1 or Bizgres to get good COPY speed.

- Luke