Обсуждение: Re: MySQL or Postgres ?

Поиск
Список
Период
Сортировка

Re: MySQL or Postgres ?

От
antti@exadata.fi (Antti Halonen)
Дата:
Fabiàn R. Breschi <fabian.breschi@acm.org> wrote in message news:<3D48185C.B6663ED7@acm.org>...
> Hi there,
>
> If you'd have to suggest which type of database one migh use, how would
> you reflect the differences between them to show up pros and cons ?

Some thoughts I posted last night to another group under basically same subject:

MySQL has pretty robust replication and multiple ways to handle
backups. Actually you now got hotbackup for MySQL if you run on innodb
tables (www.innodb.com), which you probably should, as it offers
transactions, row-level locking and rough foreign key implementation.
Hotbackup means that you can backup your live database on the fly.

AFAIK, Postgre don't have replication. Great Bridge gang was doing
something but it never happend I think.

For basic backups both of them should be fine. Note that with Postgre,
if you get lot's of updates you have to run vacuum to clean up the
transaction mess, and this basically halts the db.

On some situations MySQL is slower, like some multi-join
queries and particularly when running on MyISAM tables, in situations when
select's and updates are coming in on a same table in a steady flow.
This is because MyISAM handles locking on a _table_ level, which is
not that nice and makes MySQL not so scalable on those situations. Now
again if you instead use Inno tables you have no problems because of
the row locking.

Personally, I have not used Postgre in really high-traffic situations,
but on some situations it simply is better because of it's feature
set. Data integrity specifically being one of them. In many cases the
most important thing is INTEGRITY on a server level. Performance comes
next. And no matter what some MySQL idealists say; in some cases you
simply cannot impliment integrity checks on a client-level (like
making your client do say, foreign key checks). What if you have many
different type of client applications modifying the same data? What if
you have to use some GUI client where nothing can be implimented? Whatever.
Besides, that's inheritently wrong thinking anyways.

It all boils down what is important to _you_. Most of the database
implementations are unique so making comparisons is pointless. Do you
need foreign keys? Do you need sub-selects? No? Stored procedures? Do
you need speed on bulk inserts? Speed on certain select's? If you are
running website and clients are doing the same queries most of the
time, then maybe query cache might help?

Bottom line:
MySQL
- Fast & reliable - has proven this in real-world situations
- Capable of handling big tables efficiently. And this is so no matter
what  some whiners say. I've seen 200G _single_ table. Many
warehousers running with 10(ns) of gig tables without any problems
- Lacking many major features: stored procs, foreign keys,
sub-selects...
- Some neat features like: fulltext indexing, replication, query
cache..
- Library version of the server! You can actually link it inside your
app (no admin worries, no client-server overhead).

Postgre
- Tons of features
- Even more features
- And then some...
- Seems to be stable and scale nicely
- More that Oracle like industry db feeling :)
- If you want to extend / modify the server code, it's very well
commented = easy to read :)
- No replication, no fulltext index, no query cache

Hopefully this is for any help!

Best,
Antti Halonen          # Tietokanta & Tietoverkko Konsultointia
Exadata              # Database & Network Consulting
                                  http://www.exadata.fi

Re: MySQL or Postgres ?

От
Andrew Sullivan
Дата:
You may want to re-post your info for the benefit of your readers.
For specifics, see below.

On Thu, Aug 01, 2002 at 09:39:10AM -0700, Antti Halonen wrote:

> AFAIK, Postgre don't have replication. Great Bridge gang was doing
> something but it never happend I think.

There are a couple of answers for replication.  The rserv code in
contrib/ is a bother to set up, but it works.  It's slomewhat slow
and not bulletproof, however.

PostgreSQL, Inc has an improved version of this code.  It works very
well.  It is currently under commercial license.  (By the way, I have
mentioned on here more than once that we re-implemented some of the
Perl code from PostgreSQL, Inc., in Java.  I was having a litte
trouble with our re-implementation; but the problem was a
badly-documented config file.  Anyway, if you're interested in info
about either the Java or Perl versions, you should talk to
PostgresSQL, Inc.)

I gather that the dbmirror code works, too.

> For basic backups both of them should be fine. Note that with Postgre,
> if you get lot's of updates you have to run vacuum to clean up the
> transaction mess, and this basically halts the db.

Not as of 7.2.

> - No replication, no fulltext index, no query cache

The fulltext index is also false.  There are modules in contrib/ to
handle that.  Others here have used them, but I haven't, so I can't
speak to how good they are.

A

--
----
Andrew Sullivan                               87 Mowat Avenue
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M6K 3E3
                                         +1 416 646 3304 x110


Re: MySQL or Postgres ?

От
"scott.marlowe"
Дата:
Couple of comments.  Not flamage, just more info.

On 1 Aug 2002, Antti Halonen wrote:

> Fabiàn R. Breschi <fabian.breschi@acm.org> wrote in message news:<3D48185C.B6663ED7@acm.org>...
> > Hi there,
> >
> > If you'd have to suggest which type of database one migh use, how would
> > you reflect the differences between them to show up pros and cons ?
>
> Some thoughts I posted last night to another group under basically same subject:
>
> MySQL has pretty robust replication and multiple ways to handle
> backups. Actually you now got hotbackup for MySQL if you run on innodb
> tables (www.innodb.com), which you probably should, as it offers
> transactions, row-level locking and rough foreign key implementation.
> Hotbackup means that you can backup your live database on the fly.

Note that hot backups of innodb are not free, and are not done with free
software.  http://www.innodb.com/hotbackup.html says that it's 400 euros a
year, or 1000 euros in perpetuity for the backup tool.

> AFAIK, Postgre don't have replication. Great Bridge gang was doing
> something but it never happend I think.

There are (at least) two working means of replication available.
One ships with the postgresql-7.x.x.tar.gz file in the contrib/rserv
directory.  The others are available on the the web site.

I've heard the symantic argument before that "if it's not built in it
doesn't count" which is really silly.  Being built-in does not assure
proper operation and integration any more than not being built in assures
improper operation.  I'll take a replication system (for any database, not
just postgresql) that works, and try it out and test it.  And if there's
more than one, there's probably a good reason for it, i.e different design
goals.  P.s. this isn't posted against your message, but it a pre-emptive
answer to that argument should someone feel the need to bring it up.

> For basic backups both of them should be fine. Note that with Postgre,
> if you get lot's of updates you have to run vacuum to clean up the
> transaction mess, and this basically halts the db.

Note that ALL postgresql backups are hot backups with transactional
integrity.  There is no other way to really do them with the delivered
tool set, and no need to do them any other way.  I.e. you ALWAYS get a
transaction safe dataset from a backup, no matter how many connections are
open and running transactions when you do the backup.  Did I mention that
tool is free?

Bzzzzzt.  Wrong.  But thanks for playing.  :-) Since 7.2 plain vacuums do
not block much of anything.  On my little test box, pgbench goes from ~70
tps to ~65 tps when I run a script that runs a vacuum in the background
continuously with a 1 second break between runs.

Note that 7.1.x had some truly heinous behavior in a heavily transactional
environment running background vacuums, so until 7.2.x came out, your
statement was very much true.

> On some situations MySQL is slower, like some multi-join
> queries and particularly when running on MyISAM tables, in situations when
> select's and updates are coming in on a same table in a steady flow.
> This is because MyISAM handles locking on a _table_ level, which is
> not that nice and makes MySQL not so scalable on those situations. Now
> again if you instead use Inno tables you have no problems because of
> the row locking.

Keep in mind that a table lock may be no big deal on a system with only a
few connections running at once, but when the number of connections runs
into the hundreds or thousands, the response time on the database will
increase in a rather non-linear fashion.  Note that ISAM table locks block
readers as well as writers, so that is why heavy parallel access on a
table locking database is usually a bad thing (TM).  I.e. even a tiny
percentage of your queries need be updates/inserts/deletes to slow down
the server to a crawl.

Innodb would appear to be the answer to this problem.

> Personally, I have not used Postgre in really high-traffic situations,

I've tested it under high traffic.  It's now finally, at 7.2.x capable of
true 24/7 operation with background vacuuming via a cron job.  My
company's intranet site runs on it, but we only average maybe 5 to 10
backends open at a time max.  We can handle about 200 comfortably on our
jack of all trades web server / database machine with 512 Meg ram.  We can
handle, slowly, about 500 to 700 connections.  Under heavy parallel
testing (>200 concurrent users) we have had no problems with Postgresql.

> but on some situations it simply is better because of it's feature
> set. Data integrity specifically being one of them. In many cases the
> most important thing is INTEGRITY on a server level. Performance comes
> next. And no matter what some MySQL idealists say; in some cases you
> simply cannot impliment integrity checks on a client-level (like
> making your client do say, foreign key checks). What if you have many
> different type of client applications modifying the same data? What if
> you have to use some GUI client where nothing can be implimented? Whatever.
> Besides, that's inheritently wrong thinking anyways.

Agreed.  I wish more people understood the issues caused by concurrent
upates well enough to prevent the problems they run into trying to do
database stuff in their client code.  Note that if you are willing to
apply table locks willy nilly, you could theoretically write transactions
in client code on a MySQL ISAM table.  But your performance would be
horrible under any kind of parallel load due to excessive table level
locking.  And you'd still probably have to look out for odd case
concurrent updates that could bite you in the butt about 1 in every 1*10^6
accesses or so.

> It all boils down what is important to _you_. Most of the database
> implementations are unique so making comparisons is pointless. Do you
> need foreign keys? Do you need sub-selects? No? Stored procedures? Do
> you need speed on bulk inserts? Speed on certain select's? If you are
> running website and clients are doing the same queries most of the
> time, then maybe query cache might help?

Agreed again.  This is another way of saying, you decide what data points
are important in your design phase, then look for a database that has the
most datapoints that match up with your requirements, then test it for
applicability and performance, and accept or reject it based on your
own testing.  The best all these MySQL versus Postgresql versus brand X
database comparisons can ever hope to do is provide a kind of jumping off
point for deciding which package to evaluate.

My final point would be that if you benchmarked postgresql before version
7.2.x, then your data is out of date.  So far, I've found 7.2.x to be on
par with MSSQL, MySQL, and most other databases for speed nowadays, if not
faster for a lot of things.  And that's strictly testing single user
access.  In a more concurrent environment, it really shines.

Scott Marlowe


Re: MySQL or Postgres ?

От
Daniel Wickstrom
Дата:
>>>>> "Andrew" == Andrew Sullivan <andrew@libertyrms.info> writes:


    Andrew> The fulltext index is also false.  There are modules in
    Andrew> contrib/ to handle that.  Others here have used them, but
    Andrew> I haven't, so I can't speak to how good they are.

Yes, postgresql support for FTI is quite good.  Check out tsearch and
openfts for starters.

Regards,

Dan