Обсуждение: Seemingly identical queries run at different speeds
Hello,
I have three queries that are essentially identical. Two of them run in
fractions of a second; one of them takes longer than 15 minutes to
return. (Not sure how long it totally takes, that has been the limit of
my patience.)
The only difference between these queries is the use of table aliases in
the sql query. What is happening in postgres that this makes such a
difference?
SQL queries below.
Thanks,
VanL
FAST (0.017 second):
select
mm_batch.name as batch_name,
mm_domain.name as domain_name,
mm_management_unit.name as management_unit_name,
mm_customer.firstname as customer_name,
mm_legacy_account.username as old_username,
mm_target_account.username as new_username
from
mm_batch,
mm_domain,
mm_management_unit,
mm_customer,
mm_legacy_account,
mm_target_account
where
mm_domain.bid = mm_batch.id
and mm_domain.mid = mm_management_unit.id
and mm_domain.cid = mm_customer.id
and mm_domain.lid = mm_legacy_account.id
and mm_domain.tid = mm_target_account.id
and mm_domain.name = 'example.com';
========================================================
FAST (0.016 second):
select
B.name as batch_name,
D.name as domain_name,
M.name as management_unit_name,
C.firstname as customer_name,
LA.username as old_username,
TA.username as new_username
from
mm_batch B,
mm_domain D,
mm_management_unit M,
mm_customer C,
mm_legacy_account LA,
mm_target_account TA
where
D.bid = B.id
and D.mid = M.id
and D.cid = C.id
and D.lid = LA.id
and D.tid = TA.id
and D.name = 'example.com';
=========================================================
SLOW ( > 15 minutes):
select
mm_batch.name as batch_name,
mm_domain.name as domain_name,
mm_management_unit.name as management_unit_name,
mm_customer.firstname as customer_name,
mm_legacy_account.username as old_username,
mm_target_account.username as new_username
from
mm_batch B,
mm_domain D,
mm_management_unit M,
mm_customer C,
mm_legacy_account LA,
mm_target_account TA
where
mm_domain.bid = mm_batch.id
and mm_domain.mid = mm_management_unit.id
and mm_domain.cid = mm_customer.id
and mm_domain.lid = mm_legacy_account.id
and mm_domain.tid = mm_target_account.id
and mm_domain.name = 'example.com';
On Tue, 2003-07-22 at 19:27, VanL wrote:
> I have three queries that are essentially identical. Two of them run in
> fractions of a second; one of them takes longer than 15 minutes to
> return. (Not sure how long it totally takes, that has been the limit of
> my patience.)
>
> The only difference between these queries is the use of table aliases in
> the sql query. What is happening in postgres that this makes such a
> difference?
You're getting bit by a 'feature' that is supposed to make life easier.
PostgreSQL automatically adds tables you reference to the FROM clause if
they're not listed there already, it then sends a NOTICE saying it did
so. This is supposed to be helpful, here it certainly is not.
What's happening in the below is you reference mm_batch, mm_domain, etc.
which are _different_ than B, D, etc. because aliasing changes what
query processor refers to them as (allowing stuff like self-joins).
So Pg is appending them to the FROM clause, which now looks like this:
FROM mm_batch B, mm_domain D, mm_management_unit M, mm_customer C,
mm_legacy_account LA, mm_target_account TA, mm_batch, mm_domain,
mm_management_unit, mm_customer, mm_legacy_account,
mm_target_account
Yikes. It's now generating an absolutely huge cartesian product between
the result set you want and all those extra tables.
I believe in the upcoming 7.4.x release this won't happen unless you
enable ADD_MISSING_FROM in your conf. Something I certainly won't be
enabling due to this type of situation.
> SLOW ( > 15 minutes):
> select
> mm_batch.name as batch_name,
> mm_domain.name as domain_name,
> mm_management_unit.name as management_unit_name,
> mm_customer.firstname as customer_name,
> mm_legacy_account.username as old_username,
> mm_target_account.username as new_username
> from
> mm_batch B,
> mm_domain D,
> mm_management_unit M,
> mm_customer C,
> mm_legacy_account LA,
> mm_target_account TA
> where
> mm_domain.bid = mm_batch.id
> and mm_domain.mid = mm_management_unit.id
> and mm_domain.cid = mm_customer.id
> and mm_domain.lid = mm_legacy_account.id
> and mm_domain.tid = mm_target_account.id
> and mm_domain.name = 'example.com';
On Tue, Jul 22, 2003 at 05:27:37PM -0600, VanL wrote: > I have three queries that are essentially identical. Two of them run in > fractions of a second; one of them takes longer than 15 minutes to > return. (Not sure how long it totally takes, that has been the limit of > my patience.) > > The only difference between these queries is the use of table aliases in > the sql query. What is happening in postgres that this makes such a > difference? I think the third query is adding items to the FROM clause that you probably don't want. The result is a cartesian product, which is going to take eons to materialize (i.e. give results back). See: http://www14.us.postgresql.org/docs/7.3/static/sql-select.html#R2-SQL-SELECT-4 -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Estoy de acuerdo contigo en que la verdad absoluta no existe... El problema es que la mentira sí existe y tu estás mintiendo" (G. Lama)