Thread: Tuning/performance issue...


Tuning/performance issue...

From
David Griffiths
Date:
We're having a problem with a query during our investigation into Postgres (as an Oracle replacement). This query takes Postgres 20-40 seconds (multiple runs). Tom Lane recommended I post it here, with an EXPLAIN ANALYZE.
 
Here's the query:
 
EXPLAIN ANALYZE SELECT company_name, address_1, address_2, address_3, city,
address_list.state_province_id, state_province_short_desc, country_desc, zip_code, address_list.country_id,
contact_info.email, commercial_entity.user_account_id, phone_num_1, phone_num_fax, website, boats_website
FROM commercial_entity, country, user_account,
address_list LEFT JOIN state_province ON address_list.state_province_id = state_province.state_province_id
LEFT JOIN contact_info ON address_list.contact_info_id = contact_info.contact_info_id
WHERE address_list.address_type_id = 101
AND commercial_entity.commercial_entity_id=225528
AND commercial_entity.commercial_entity_id = address_list.commercial_entity_id
AND address_list.country_id = country.country_id
AND commercial_entity.user_account_id = user_account.user_account_id
AND user_account.user_role_id IN (101, 101);
 
Here's the explain:
 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..64570.33 rows=1 width=385) (actual time=42141.08..42152.06 rows=1 loops=1)
   ->  Nested Loop  (cost=0.00..64567.30 rows=1 width=361) (actual time=42140.80..42151.77 rows=1 loops=1)
         ->  Nested Loop  (cost=0.00..64563.97 rows=1 width=349) (actual time=42140.31..42151.27 rows=1 loops=1)
               Join Filter: ("outer".commercial_entity_id = "inner".commercial_entity_id)
               ->  Index Scan using commercial_entity_pkey on commercial_entity  (cost=0.00..5.05 rows=1 width=94) (actual time=0.57..0.58 rows=1 loops=1)
                     Index Cond: (commercial_entity_id = 225528::numeric)
               ->  Materialize  (cost=63343.66..63343.66 rows=97221 width=255) (actual time=41741.96..41901.17 rows=90527 loops=1)
                     ->  Merge Join  (cost=0.00..63343.66 rows=97221 width=255) (actual time=1.44..41387.68 rows=90527 loops=1)
                           Merge Cond: ("outer".contact_info_id = "inner".contact_info_id)
                           ->  Nested Loop  (cost=0.00..830457.52 rows=97221 width=222) (actual time=0.95..39178.32 rows=90527 loops=1)
                                 Join Filter: ("outer".state_province_id = "inner".state_province_id)
                                 ->  Index Scan using addr_list_ci_id_i on address_list  (cost=0.00..586676.65 rows=97221 width=205) (actual time=0.49..2159.90 rows=90527 loops=1)
                                       Filter: (address_type_id = 101::numeric)
                                 ->  Seq Scan on state_province  (cost=0.00..1.67 rows=67 width=17) (actual time=0.00..0.21 rows=67 loops=90527)
                           ->  Index Scan using contact_info_pkey on contact_info  (cost=0.00..3366.76 rows=56435 width=33) (actual time=0.44..395.75 rows=55916 loops=1)
         ->  Index Scan using user_account_pkey on user_account  (cost=0.00..3.32 rows=1 width=12) (actual time=0.46..0.46 rows=1 loops=1)
               Index Cond: ("outer".user_account_id = user_account.user_account_id)
               Filter: (user_role_id = 101::numeric)
   ->  Index Scan using country_pkey on country  (cost=0.00..3.01 rows=1 width=24) (actual time=0.25..0.25 rows=1 loops=1)
         Index Cond: ("outer".country_id = country.country_id)
 Total runtime: 42165.44 msec
(21 rows)
 
 
I will post the schema in a separate email - the list has rejected one big email 3 times now.
 
David

Re: Tuning/performance issue...

From
Tom Lane
Date:
David Griffiths <dgriffiths@boats.com> writes:
> ... FROM commercial_entity, country, user_account,
> address_list LEFT JOIN state_province ON address_list.state_province_id
> = state_province.state_province_id
> LEFT JOIN contact_info ON address_list.contact_info_id =
> contact_info.contact_info_id
> WHERE ...

I believe what you're getting burnt by is that PG's planner interprets
this as forcing the address_list * state_province * contact_info join
to be done before it joins those tables to commercial_entity, country,
and user_account --- for discussion see
http://www.postgresql.org/docs/7.3/static/explicit-joins.html

Unfortunately your WHERE-clause restriction conditions are on
address_list, commercial_entity, and user_account; and it seems the
address_list constraint is very weak.  So the plan ends up forming a
large fraction of the address_list * state_province * contact_info join,
only to throw it away again when there's no matching rows selected from
commercial_entity and user_account.  The actual runtime and actual row
counts from the EXPLAIN ANALYZE output show that this is what's
happening.

The most efficient way to handle this query would probably be to join
the three tables with restrictions first, and then join the other tables
to those.  You could force this with not too much rewriting using
something like (untested, but I think it's right)

... FROM commercial_entity CROSS JOIN user_account CROSS JOIN
address_list LEFT JOIN state_province ON address_list.state_province_id
= state_province.state_province_id
LEFT JOIN contact_info ON address_list.contact_info_id =
contact_info.contact_info_id
CROSS JOIN country
WHERE ...

The explicit JOINs associate left-to-right, so this gives the intended
join order.  (In your original query, explicit JOIN binds more tightly
than commas do.)

The reason PG's planner doesn't discover this join order for itself
is that it's written to not attempt to re-order outer joins from the
syntactically defined ordering.  In general, such reordering would
change the results.  It is possible to analyze the query and prove that
certain reorderings are valid (don't change the results), but we don't
currently have code to do that.

> As a reference, our production Oracle database (exactly the same
> hardware, but RAID-mirroring) with way more load can handle the query in
> 1-2 seconds. I have MySQL 4.0.14 with InnoDB on the same machine
> (shut down when I am testing Postgres, and vice versa) and it does the
> query in 0.20 seconds.

I'm prepared to believe that Oracle contains code that actually does the
analysis about which outer-join reorderings are valid, and is then able
to find the right join order by deduction.  The last I heard about
MySQL, they have no join-order analysis at all; they unconditionally
interpret this type of query left-to-right, ie as

... FROM ((((commercial_entity CROSS JOIN country) CROSS JOIN
         user_account) CROSS JOIN address_list)
         LEFT JOIN state_province ON ...)
         LEFT JOIN contact_info ON ...
WHERE ...

This is clearly at odds with the SQL spec's syntactically defined join
order semantics.  It's possible that it always yields the same results
as the spec requires, but I'm not at all sure about that.  In any case
this strategy is certainly not "better" than ours, it just performs
poorly on a different set of queries.  Would I be out of line to
speculate that your query was previously tuned to work well in MySQL?

            regards, tom lane

Re: Tuning/performance issue...

From
David Griffiths
Date:
> The most efficient way to handle this query would probably be to join
> the three tables with restrictions first, and then join the other tables
> to those.  You could force this with not too much rewriting using
> something like (untested, but I think it's right)
>
> ... FROM commercial_entity CROSS JOIN user_account CROSS JOIN
> address_list LEFT JOIN state_province ON address_list.state_province_id
> = state_province.state_province_id
> LEFT JOIN contact_info ON address_list.contact_info_id =
> contact_info.contact_info_id
> CROSS JOIN country
> WHERE ...
>
> The explicit JOINs associate left-to-right, so this gives the intended
> join order.  (In your original query, explicit JOIN binds more tightly
> than commas do.)

Ok - that's interesting - I'll have to do some reading and more testing.

> The reason PG's planner doesn't discover this join order for itself
> is that it's written to not attempt to re-order outer joins from the
> syntactically defined ordering.  In general, such reordering would
> change the results.  It is possible to analyze the query and prove that
> certain reorderings are valid (don't change the results), but we don't
> currently have code to do that.

Not sure I follow. Are you saying that, depending on when the outer-join is
applied to the rows found at the time, you may end up with a different set
of rows? I would have expected the optimizer to do the outer-joins last, as
the extra data received by the outer-joins is not mandatory, and won't
affect the rows that were retrieved by joining user_account, address_list,
and commercial_entity.

An outer join would *never* be the most restrictive join in a query. I
thought (from my readings on Oracle query tuning) that finding the most
restrictive table/index was the first task of an optimizer: reduce the
result set as quickly as possible. That query has the line,

"AND commercial_entity.commercial_entity_id=225528",

which uses an index (primary key) and uses an "=". I would have expected
that to be done first, then joined with the other inner-join tables, and
finally have the outer-joins applied to the final result set to fill in
the "might be there" data.

Anyway, if the optimizer does the outer-joins first (address_list with
state_province and contact_info), then it's picking the table with the
most rows (address_list has 200K+ rows, where the other 3 big tables have
70K-90K). Would re-ordering the FROM clause (and LEFT JOIN portions) help?

Could you give an example where applying an outer-join at a different time
could produce different results? I think I can see a situation where you
use part of the outer-join's results in the WHERE clause, but I am not sure.

> I'm prepared to believe that Oracle contains code that actually does the
> analysis about which outer-join reorderings are valid, and is then able
> to find the right join order by deduction.

I'm not sure about Oracle (other than what I stated above). In fact, about
half the time, updating table stats to try to get the Oracle optimizer to
do a better job on a query results in even worse performance.

> ... FROM ((((commercial_entity CROSS JOIN country) CROSS JOIN
>          user_account) CROSS JOIN address_list)
>          LEFT JOIN state_province ON ...)
>          LEFT JOIN contact_info ON ...
> WHERE ...
>
> This is clearly at odds with the SQL spec's syntactically defined join
> order semantics.  It's possible that it always yields the same results
> as the spec requires, but I'm not at all sure about that.

Again, I don't know. On the 3 queries based on these tables, Postgres
and MySQL return the exact same data (they use the same data set).

Do you have a link to the SQL spec's join-order requirements?

> In any case
> this strategy is certainly not "better" than ours, it just performs
> poorly on a different set of queries.  Would I be out of line to
> speculate that your query was previously tuned to work well in MySQL?

The query was pulled from our codebase (written for Oracle). I added a bit
to it to make it slower, and then ported it to MySQL and tested there first
(I just re-wrote the outer-join syntax). I found that re-ordering the
tables in the FROM clause on MySQL changed the time by about 45% (0.36
seconds to 0.20 seconds), but that's because I had forgotten to re-analyze
the tables after refreshing the dataset. Now, table order doesn't make a
difference in speed (or results).

If anything, I've done more tuning for Postgres - added some extra indexes
to try to help (country.country_id had a composite index with another
column, but no index on just that column), etc.

The dataset and schema are pure Oracle. I extracted them out of the
database, removed all Oracle-specific extensions, changed the column
types, and migrated the indexes and foreign keys to MySQL and Postgres.
Nothing more (other than an extra index or two for Postgres - nada for
MySQL).

This is all part of a "migrate away from Oracle" project. We are looking
at 3 databases - MySQL (InnoDB), Postgres, and Matisse (object-oriented).
We have a lot of queries like this or worse, and I'm worried that many of
them would need to be re-written. The developers know SQL, but nothing
about tuning, etc.

Thanks for the quick response - I will try explicit joining, and I'm
looking forward to your comments on outer-joins and the optimizer (and
anything else I've written).

David.

Re: Tuning/performance issue...

From
Jeff
Date:
On Tue, 30 Sep 2003, David Griffiths wrote:

>
> This is all part of a "migrate away from Oracle" project. We are looking at
> 3 databases -
> MySQL (InnoDB), Postgres and Matisse (object oriented). We have alot of
> queries like this
> or worse, and I'm worried that many of them would need to be re-written. The
> developers
> know SQL, but nothing about tuning, etc.
>

There's a movement at my company to ditch several commercial db's in favor
of a free one.  I'm currently the big pg fan around here and I've actually
written a rather lengthy presentation about pg features, why, tuning, etc.,
but another part was some comparisons to other db's.

So I wouldn't be blindly flaming MySQL, I decided to give it a whirl and
loaded it up with the same dataset as pg.  First thing I hit was the lack
of stored procedures.  But I decided to code around that, giving MySQL the
benefit of the doubt.  What I found was interesting.

For 1-2 concurrent 'beaters' it screamed - ultra-fast.  But if you
increase the concurrent beaters up to, say, 20, MySQL comes to a grinding
halt; MySQL and the machine itself become fairly unresponsive.  And if you
do cache-unfriendly queries it becomes even worse.  On PG - no problems at
all.  It scaled fine and dandy, and with 40 concurrent beaters the machine
was still responsive.  (The numbers for 20 clients were 220 seconds (pg)
and 650 seconds (mysql).)

So that is another test to try out - given your configuration, I expect
you have lots of concurrent activity.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/



Re: Tuning/performance issue...

From
Tom Lane
Date:
David Griffiths <dgriffiths@boats.com> writes:
>> The reason PG's planner doesn't discover this join order for itself
>> is that it's written to not attempt to re-order outer joins from the
>> syntactically defined ordering.  In general, such reordering would
>> change the results.  It is possible to analyze the query and prove that
>> certain reorderings are valid (don't change the results), but we don't
>> currently have code to do that.

> Not sure I follow. Are you saying that, depending on when the outer-join is
> applied to the rows found at the time, you may end up with a different set
> of rows?

Here's an example showing that it's not always safe to rearrange join
order in the presence of outer joins:

jtest=# create table a (f1 int);
CREATE TABLE
jtest=# create table b (f1 int, f2 int);
CREATE TABLE
jtest=# create table c(f1 int, f2 int);
CREATE TABLE
jtest=# insert into a values (1);
INSERT 431307 1
jtest=# insert into b values (10,10);
INSERT 431308 1
jtest=# insert into b values (11,11);
INSERT 431309 1
jtest=# insert into c values (1,10);
INSERT 431310 1
jtest=# insert into c values (2,11);
INSERT 431311 1

jtest=# SELECT * FROM a, b LEFT JOIN c ON b.f2 = c.f2 WHERE a.f1 = c.f1;
 f1 | f1 | f2 | f1 | f2
----+----+----+----+----
  1 | 10 | 10 |  1 | 10
(1 row)

Per spec the JOIN operator binds more tightly than comma, so this is
equivalent to:

jtest=# SELECT * FROM a JOIN (b LEFT JOIN c ON b.f2 = c.f2) ON a.f1 = c.f1;
 f1 | f1 | f2 | f1 | f2
----+----+----+----+----
  1 | 10 | 10 |  1 | 10
(1 row)

Now suppose we try to join A and C before joining to B:

jtest=# SELECT * FROM b LEFT JOIN (a join c ON a.f1 = c.f1) ON b.f2 = c.f2;
 f1 | f2 | f1 | f1 | f2
----+----+----+----+----
 10 | 10 |  1 |  1 | 10
 11 | 11 |    |    |
(2 rows)

We get a different answer, because some C rows are eliminated before
reaching the left join, causing null-extended B rows to be added.

(I don't have a MySQL installation here to try, but if they still work
the way they used to, they get the wrong answer on the first query.)

The point of this example is just that there are cases where it'd be
incorrect for the planner to change the ordering of joins from what
is implied by the query syntax.  It is always safe to change the join
order when only inner joins are involved.  There are cases where outer
join order is safe to change too, but you need analysis code that checks
the query conditions to prove that a particular rearrangement is safe.
Right now, we don't have such code, and so we just follow the simple
rule "never rearrange any outer joins".
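
The same two groupings can be replayed outside PostgreSQL as well; here's a minimal sketch using Python's built-in sqlite3 module (same tables and data as above, with explicit parentheses so the grouping doesn't depend on comma-join precedence):

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
    CREATE TABLE a (f1 INT);
    CREATE TABLE b (f1 INT, f2 INT);
    CREATE TABLE c (f1 INT, f2 INT);
    INSERT INTO a VALUES (1);
    INSERT INTO b VALUES (10, 10), (11, 11);
    INSERT INTO c VALUES (1, 10), (2, 11);
""")

# Spec-implied order: form b LEFT JOIN c first, then join the result to a.
spec = cur.execute(
    "SELECT * FROM a JOIN (b LEFT JOIN c ON b.f2 = c.f2) ON a.f1 = c.f1"
).fetchall()

# Rearranged order: join a to c first, then LEFT JOIN from b.
rearranged = cur.execute(
    "SELECT * FROM b LEFT JOIN (a JOIN c ON a.f1 = c.f1) ON b.f2 = c.f2"
).fetchall()

print(spec)        # [(1, 10, 10, 1, 10)]
print(rearranged)  # [(10, 10, 1, 1, 10), (11, 11, None, None, None)]
```

Any engine that follows the spec's join grouping should show the same divergence: the rearranged form eliminates the C row for f2 = 11 before the left join runs, so a null-extended row for b = (11, 11) appears.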

> I would have expected the optimizer to do the outer-joins last, as the
> extra data received by the outer-joins is not mandatory, and won't
> affect the rows that were retrieved by joining user_account,
> address_list, and commercial_entity.

I think your example falls into the category of provably-safe
rearrangements ... but as I said, the planner doesn't know that.

> An outer join would *never* be the most restrictive
> join in a query.

Sure it can, if the restriction conditions are mainly on the outer
join's tables.  But that's not really the issue here.  As best I can
tell without seeing your data statistics, the most restrictive
conditions in your query are the ones on
commercial_entity.commercial_entity_id and user_account.user_role_id.
The trick is to apply those before joining any other tables.

            regards, tom lane

Re: Tuning/performance issue...

From
Oleg Lebedev
Date:
Jeff,
I would really appreciate if you could send me that lengthy presentation
that you've written on pg/other dbs comparison.
Thanks.

Oleg

-----Original Message-----
From: Jeff [mailto:threshar@torgo.978.org]
Sent: Wednesday, October 01, 2003 6:23 AM
To: David Griffiths
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Tuning/performance issue...
Importance: Low



*************************************

This e-mail may contain privileged or confidential material intended for the named recipient only.
If you are not the named recipient, delete this message and all attachments.
Unauthorized reviewing, copying, printing, disclosing, or otherwise using information in this e-mail is prohibited.
We reserve the right to monitor e-mail sent through our network.

*************************************

Re: Tuning/performance issue...

From
Jeff
Date:
On Wed, 1 Oct 2003, Oleg Lebedev wrote:

> Jeff,
> I would really appreciate if you could send me that lengthy presentation
> that you've written on pg/other dbs comparison.
> Thanks.
>

After I give the presentation at work and collect comments from my
coworkers (and remove some information you folks don't need to know :) I
will be very willing to post it for people to see.


--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/



Re: Tuning/performance issue...

From
Oleg Lebedev
Date:
That would be great! When do you think this would be ready for us to see? ;)

-----Original Message-----
From: Jeff [mailto:threshar@torgo.978.org]
Sent: Wednesday, October 01, 2003 11:42 AM
To: Oleg Lebedev
Cc: pgsql-performance@postgresql.org
Subject: RE: [PERFORM] Tuning/performance issue...


On Wed, 1 Oct 2003, Oleg Lebedev wrote:

> Jeff,
> I would really appreciate if you could send me that lengthy
> presentation that you've written on pg/other dbs comparison. Thanks.
>

After I give the presentation at work and collect comments from my
coworkers (and remove some information you folks don't need to know :) I
will be very willing to post it for people to see.


--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/


Re: Tuning/performance issue...

From
Bruce Momjian
Date:
I have updated the FAQ to be:

      In comparison to MySQL or leaner database systems, we are
      faster for multiple users, complex queries, and a read/write query
      load.  MySQL is faster for SELECT queries done by a few users.

Is this accurate?  It seems so.

---------------------------------------------------------------------------

Oleg Lebedev wrote:
> Jeff,
> I would really appreciate if you could send me that lengthy presentation
> that you've written on pg/other dbs comparison.
> Thanks.
>
> Oleg

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Tuning/performance issue...

From
Rod Taylor
Date:
On Fri, 2003-10-03 at 21:39, Bruce Momjian wrote:
> I have updated the FAQ to be:
>
>       In comparison to MySQL or leaner database systems, we are
>       faster for multiple users, complex queries, and a read/write query
>       load.  MySQL is faster for SELECT queries done by a few users.
>
> Is this accurate?  It seems so.

May wish to say ... for simple SELECT queries ...

Several left outer joins, subselects and a large number of joins are
regularly performed faster in PostgreSQL due to a more mature optimizer.

But MySQL can pump out SELECT * FROM table WHERE key = value; queries in
a hurry.


I've often wondered if they win on those because they have a lighter-weight
parser / optimizer with fewer "let's try simplifying this query" steps, or
if the MyISAM storage mechanism is simply quicker at pulling data off the
disk.


Attachments

Re: Tuning/performance issue...

From
Bruce Momjian
Date:
Rod Taylor wrote:
-- Start of PGP signed section.
> On Fri, 2003-10-03 at 21:39, Bruce Momjian wrote:
> > I have updated the FAQ to be:
> >
> >       In comparison to MySQL or leaner database systems, we are
> >       faster for multiple users, complex queries, and a read/write query
> >       load.  MySQL is faster for SELECT queries done by a few users.
> >
> > Is this accurate?  It seems so.
>
> May wish to say ... for simple SELECT queries ...

Updated.

> Several left outer joins, subselects and a large number of joins are
> regularly performed faster in PostgreSQL due to a more mature optimizer.
>
> But MySQL can pump out SELECT * FROM table WHERE key = value; queries in
> a hurry.
>
>
> I've often wondered if they win on those because they have a lighter
> weight parser / optimizer with less "lets try simplifying this query"

I think that is part of it.

> steps or if the MYISAM storage mechanism is simply quicker at pulling
> data off the disk.

And their heap is indexed by MyISAM, right? I know with Ingres that ISAM
was usually faster than btree because you didn't have all those leaves
to traverse to get to the data.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Tuning/performance issue...

From
Christopher Browne
Date:
Centuries ago, Nostradamus foresaw when pgman@candle.pha.pa.us (Bruce Momjian) would write:
> I have updated the FAQ to be:
>
>       In comparison to MySQL or leaner database systems, we are
>       faster for multiple users, complex queries, and a read/write query
>       load.  MySQL is faster for SELECT queries done by a few users.
>
> Is this accurate?  It seems so.

I would think it more accurate if you use the phrase "faster for
simple SELECT queries."

MySQL uses a rule-based optimizer which, when the data fits the rules
well, can pump queries through lickety-split without any appreciable
pause for evaluation (or reflection :-).  That's _quite_ a successful
strategy when users are doing what loosely amounts to evaluating
association tables.

select * from table where key = value;

Which is just like tying a Perl variable to a hash table, and doing
   $value = $TABLE{$key};

In web applications where they wanted something a _little_ more
structured than hash tables, that may 'hit the spot.'

Anything hairier than that gets, of course, hairier.  If you want
something that's TRULY more structured, you may lose a lot of hair
:-).
--
output = reverse("gro.gultn" "@" "enworbbc")
http://www.ntlug.org/~cbbrowne/oses.html
"If you want to talk with some experts about something, go to the bar
where they hang out, buy a round of beers, and they'll surely talk
your ear off, leaving you wiser than before.

If you, a stranger, show up at the bar, walk up to the table, and ask
them to fax you a position paper, they'll tell you to call their
office in the morning and ask for a rate sheet." -- Miguel Cruz

Re: Tuning/performance issue...

From
Tom Lane
Date:
Rod Taylor <rbt@rbt.ca> writes:
> I've often wondered if they win on those because they have a lighter
> weight parser / optimizer with less "lets try simplifying this query"
> steps or if the MYISAM storage mechanism is simply quicker at pulling
> data off the disk.

Comparing pre-PREPAREd queries would probably tell something about that.

            regards, tom lane

Re: Tuning/performance issue...

From
Jeff
Date:
On Fri, 3 Oct 2003, Bruce Momjian wrote:

>
> I have updated the FAQ to be:
>
>       In comparison to MySQL or leaner database systems, we are
>       faster for multiple users, complex queries, and a read/write query
>       load.  MySQL is faster for SELECT queries done by a few users.
>
> Is this accurate?  It seems so.
>
>

Another thing I noticed - if you use a dataset that can live in MySQL's
query cache / OS cache it screams, until it has to hit the disk. Then:
GRINDING HALT.

It would be nice if someone (I don't have the time now) did a comparison
of, say:

select value where name = XXX; [where XXX varies] with 1, 10, 20, 50
connections

then made progressively more complex queries. And occasionally pointed out
MySQL's silly omissions:

select * from myview where id = 1234

[Oh wait! MySQL doesn't have views. Oopsy!]

Wrapping up - PG is not that slow for simple queries either.  It can be
rather zippy - and PREPARE can give HUGE gains, even for simple
statements.  I've often wondered if yacc, etc. is a bottleneck (you can
only go as fast as your parser can go).

Hurray for PG!

And I'm giving my PG presentation Monday.  I hope to post it Tuesday after
I update it with comments I receive and remove confidential information.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/