Обсуждение: Optimize SQL

Поиск
Список
Период
Сортировка

Optimize SQL

От
Pallav Kalva
Дата:
Hi,

   Is there anyway we can optimize this sql ? it is doing full table
scan on listing and address table . Postgres version 8.0.2

Thanks!
Pallav.


explain analyze
select listing0_.listingid as col_0_0_,
getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) as col_1_0_
from listing.listing listing0_
left outer join listing.address listingadd1_
on listing0_.fkbestaddressid=listingadd1_.addressid
left outer join listing.addressvaluation addressval2_
on listingadd1_.addressid=addressval2_.fkaddressid
where listing0_.lastupdate>'2006-09-15 08:31:26.927'
and listing0_.lastupdate<=current_timestamp
or addressval2_.createdate>'2006-09-15 08:31:26.927' and
addressval2_.createdate<=current_timestamp
group by listing0_.listingid , listing0_.lastupdate
order by getmaxdate(listing0_.lastupdate, max(addressval2_.createdate))
asc limit 10;


Limit  (cost=2399501.49..2399501.51 rows=10 width=20) (actual time=414298.076..414298.174 rows=10 loops=1)
   ->  Sort  (cost=2399501.49..2410707.32 rows=4482333 width=20) (actual time=414298.068..414298.098 rows=10 loops=1)
         Sort Key: getmaxdate(listing0_.lastupdate, max(addressval2_.createdate))
         ->  GroupAggregate  (cost=1784490.47..1851725.47 rows=4482333 width=20) (actual time=414212.926..414284.927
rows=2559loops=1) 
               ->  Sort  (cost=1784490.47..1795696.31 rows=4482333 width=20) (actual time=414174.678..414183.536
rows=2563loops=1) 
                     Sort Key: listing0_.listingid, listing0_.lastupdate
                     ->  Merge Right Join  (cost=1113947.32..1236714.45 rows=4482333 width=20) (actual
time=273257.256..414163.920rows=2563 loops=1) 
                           Merge Cond: ("outer".fkaddressid = "inner".addressid)
                           Filter: ((("inner".lastupdate > '2006-09-15 08:31:26.927'::timestamp without time zone) AND
("inner".lastupdate<= ('now'::text)::timestamp(6) with time zone)) OR (("outer".createdate > '2006-09-15
08:31:26.927'::timestampwithout time zone) AND ("outer".createdate <= ('now'::text)::timestamp(6) with time zone))) 
                           ->  Index Scan using idx_addressvaluation_fkaddressid on addressvaluation addressval2_
(cost=0.00..79769.55rows=947056 width=12) (actual time=0.120..108240.633 rows=960834 loops=1) 
                           ->  Sort  (cost=1113947.32..1125153.15 rows=4482333 width=16) (actual
time=256884.646..275823.217rows=5669719 loops=1) 
                                 Sort Key: listingadd1_.addressid
                                 ->  Hash Left Join  (cost=228115.38..570557.39 rows=4482333 width=16) (actual
time=93874.356..205054.946rows=4490963 loops=1) 
                                       Hash Cond: ("outer".fkbestaddressid = "inner".addressid)
                                       ->  Seq Scan on listing listing0_  (cost=0.00..112111.33 rows=4482333 width=16)
(actualtime=0.026..25398.685 rows=4490963 loops=1) 
                                       ->  Hash  (cost=183333.70..183333.70 rows=6990270 width=4) (actual
time=93873.659..93873.659rows=0 loops=1) 
                                             ->  Seq Scan on address listingadd1_  (cost=0.00..183333.70 rows=6990270
width=4)(actual time=13.256..69441.056 rows=6990606 loops=1) 


Re: Optimize SQL

От
Tom Lane
Дата:
Pallav Kalva <pkalva@livedatagroup.com> writes:
> select listing0_.listingid as col_0_0_,
> getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) as col_1_0_
> from listing.listing listing0_
> left outer join listing.address listingadd1_
> on listing0_.fkbestaddressid=listingadd1_.addressid
> left outer join listing.addressvaluation addressval2_
> on listingadd1_.addressid=addressval2_.fkaddressid
> where listing0_.lastupdate>'2006-09-15 08:31:26.927'
> and listing0_.lastupdate<=current_timestamp
> or addressval2_.createdate>'2006-09-15 08:31:26.927' and
> addressval2_.createdate<=current_timestamp
> group by listing0_.listingid , listing0_.lastupdate
> order by getmaxdate(listing0_.lastupdate, max(addressval2_.createdate))
> asc limit 10;

If that WHERE logic is actually what you need, then getting this query
to run quickly seems pretty hopeless.  The database must form the full
outer join result: it cannot discard any listing0_ rows, even if they
have lastupdate outside the given range, because they might join to
addressval2_ rows within the given createdate range.  And conversely
it can't discard any addressval2_ rows early.  Is there any chance
that you wanted AND not OR there?

One thing that might help a bit is to change the join order:

from listing.listing listing0_
left outer join listing.addressvaluation addressval2_
on listing0_.fkbestaddressid=addressval2_.fkaddressid
left outer join listing.address listingadd1_
on listing0_.fkbestaddressid=listingadd1_.addressid

so that at least the WHERE clause can be applied before having joined to
listingadd1_.  The semantics of your ON clauses are probably wrong anyway
--- did you think twice about what happens if there's no matching
listingadd1_ entry?

            regards, tom lane

Re: Optimize SQL

От
Arjen van der Meijden
Дата:
On 15-9-2006 17:53 Tom Lane wrote:
> If that WHERE logic is actually what you need, then getting this query
> to run quickly seems pretty hopeless.  The database must form the full
> outer join result: it cannot discard any listing0_ rows, even if they
> have lastupdate outside the given range, because they might join to
> addressval2_ rows within the given createdate range.  And conversely
> it can't discard any addressval2_ rows early.  Is there any chance
> that you wanted AND not OR there?

Couldn't it also help to do something like this?

SELECT ..., (SELECT MAX(createdate) FROM addressval ...)
FROM listing l
   LEFT JOIN address ...
WHERE l.id IN (SELECT id FROM listing WHERE lastupdate ...
       UNION
         SELECT id FROM listing JOIN addressval a ON ... WHERE
a.createdate ...)


Its not pretty, but looking at the explain only a small amount of
records match both clauses. So this should allow the use of indexes for
both the createdate-clause and the lastupdate-clause.

Best regards,

Arjen

Re: Optimize SQL

От
"Mikael Carneholm"
Дата:
That query is generated by hibernate, right?


-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Pallav
Kalva
Sent: den 15 september 2006 17:10
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Optimize SQL

Hi,

   Is there anyway we can optimize this sql ? it is doing full table
scan on listing and address table . Postgres version 8.0.2

Thanks!
Pallav.


explain analyze
select listing0_.listingid as col_0_0_, getmaxdate(listing0_.lastupdate,
max(addressval2_.createdate)) as col_1_0_ from listing.listing listing0_
left outer join listing.address listingadd1_ on
listing0_.fkbestaddressid=listingadd1_.addressid
left outer join listing.addressvaluation addressval2_ on
listingadd1_.addressid=addressval2_.fkaddressid
where listing0_.lastupdate>'2006-09-15 08:31:26.927'
and listing0_.lastupdate<=current_timestamp
or addressval2_.createdate>'2006-09-15 08:31:26.927' and
addressval2_.createdate<=current_timestamp
group by listing0_.listingid , listing0_.lastupdate order by
getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) asc limit
10;


Limit  (cost=2399501.49..2399501.51 rows=10 width=20) (actual
time=414298.076..414298.174 rows=10 loops=1)
   ->  Sort  (cost=2399501.49..2410707.32 rows=4482333 width=20) (actual
time=414298.068..414298.098 rows=10 loops=1)
         Sort Key: getmaxdate(listing0_.lastupdate,
max(addressval2_.createdate))
         ->  GroupAggregate  (cost=1784490.47..1851725.47 rows=4482333
width=20) (actual time=414212.926..414284.927 rows=2559 loops=1)
               ->  Sort  (cost=1784490.47..1795696.31 rows=4482333
width=20) (actual time=414174.678..414183.536 rows=2563 loops=1)
                     Sort Key: listing0_.listingid, listing0_.lastupdate
                     ->  Merge Right Join  (cost=1113947.32..1236714.45
rows=4482333 width=20) (actual time=273257.256..414163.920 rows=2563
loops=1)
                           Merge Cond: ("outer".fkaddressid =
"inner".addressid)
                           Filter: ((("inner".lastupdate > '2006-09-15
08:31:26.927'::timestamp without time zone) AND ("inner".lastupdate <=
('now'::text)::timestamp(6) with time zone)) OR (("outer".createdate >
'2006-09-15 08:31:26.927'::timestamp without time zone) AND
("outer".createdate <= ('now'::text)::timestamp(6) with time zone)))
                           ->  Index Scan using
idx_addressvaluation_fkaddressid on addressvaluation addressval2_
(cost=0.00..79769.55 rows=947056 width=12) (actual
time=0.120..108240.633 rows=960834 loops=1)
                           ->  Sort  (cost=1113947.32..1125153.15
rows=4482333 width=16) (actual time=256884.646..275823.217 rows=5669719
loops=1)
                                 Sort Key: listingadd1_.addressid
                                 ->  Hash Left Join
(cost=228115.38..570557.39 rows=4482333 width=16) (actual
time=93874.356..205054.946 rows=4490963 loops=1)
                                       Hash Cond:
("outer".fkbestaddressid = "inner".addressid)
                                       ->  Seq Scan on listing listing0_
(cost=0.00..112111.33 rows=4482333 width=16) (actual
time=0.026..25398.685 rows=4490963 loops=1)
                                       ->  Hash
(cost=183333.70..183333.70 rows=6990270 width=4) (actual
time=93873.659..93873.659 rows=0 loops=1)
                                             ->  Seq Scan on address
listingadd1_  (cost=0.00..183333.70 rows=6990270 width=4) (actual
time=13.256..69441.056 rows=6990606 loops=1)


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to majordomo@postgresql.org so that your
       message can get through to the mailing list cleanly