Query running slower than same on Oracle

From: Sailer, Denis (YBUSA-CDR)
Subject: Query running slower than same on Oracle
Date:
Msg-id: 58D6EF5727036048BC9E08ECB711A42092ECD0@crmsmail.corp.ybusa.net
Replies: Re: Query running slower than same on Oracle
         Re: Query running slower than same on Oracle
List: pgsql-performance
We are evaluating PostgreSQL for a typical data warehouse application.  I
have 3 tables below that are part of a star schema design.  The query listed
below runs in 16 seconds on Oracle 9.2 and in 3+ minutes on PostgreSQL 7.3.3.
Here are the details.

I'm wondering what else can be done to tune this type of query.  Is 3
minutes reasonable given the amount of data loaded into these 3 tables?  Has
anyone else made comparisons between Oracle and PostgreSQL?

------------------------------------------------------------------------------
Oracle 9.2 is running on a Windows 2000 server, 600MHz PIII, 512MB RAM
    Shared Pool      48MB
    Buffer Cache     98MB
    Large Pool        8MB
    Java Pool        32MB
                 ========
    Total SGA       186MB

------------------------------------------------------------------------------
PostgreSQL is running on Red Hat Linux 7.2, 733MHz PIII processor, 383MB RAM.
    shared_buffers = 12384        (96 MB)
    sort_mem = 16384              (16 MB)
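
For experimentation, sort_mem can also be raised for just one session before
running the query; the value below is only an arbitrary figure I could test
with, not something I have benchmarked yet:

-- session-level override; 65536 KB = 64 MB per sort/group operation
SET sort_mem = 65536;
SHOW sort_mem;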

------------------------------------------------------------------------------

explain analyze
select fiscalyearquarter, description, sum(amount_quantity)
  from time t, revenue r, statistic s
 where t.fiscalyear = 2002
   and r.timekey = t.timekey
   and r.statisticskey = s.statisticskey
 group by fiscalyearquarter, description;

                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=124685.74..127078.87 rows=23931 width=48) (actual time=170682.53..189640.85 rows=8 loops=1)
   ->  Group  (cost=124685.74..126480.59 rows=239313 width=48) (actual time=169508.49..185478.90 rows=1082454 loops=1)
         ->  Sort  (cost=124685.74..125284.02 rows=239313 width=48) (actual time=169508.47..171853.03 rows=1082454 loops=1)
               Sort Key: t.fiscalyearquarter, s.description
               ->  Hash Join  (cost=6.46..94784.90 rows=239313 width=48) (actual time=140.20..47685.46 rows=1082454 loops=1)
                     Hash Cond: ("outer".statisticskey = "inner".statisticskey)
                     ->  Hash Join  (cost=5.43..90595.90 rows=239313 width=32) (actual time=139.96..39672.76 rows=1082454 loops=1)
                           Hash Cond: ("outer".timekey = "inner".timekey)
                           ->  Seq Scan on revenue r  (cost=0.00..68454.04 rows=3829004 width=17) (actual time=0.01..26336.95 rows=3829004 loops=1)
                           ->  Hash  (cost=5.40..5.40 rows=12 width=15) (actual time=0.79..0.79 rows=0 loops=1)
                                 ->  Seq Scan on "time" t  (cost=0.00..5.40 rows=12 width=15) (actual time=0.36..0.75 rows=12 loops=1)
                                       Filter: (fiscalyear = 2002::numeric)
                     ->  Hash  (cost=1.02..1.02 rows=2 width=16) (actual time=0.04..0.04 rows=0 loops=1)
                           ->  Seq Scan on statistic s  (cost=0.00..1.02 rows=2 width=16) (actual time=0.02..0.03 rows=2 loops=1)
 Total runtime: 195409.79 msec
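
One thing not tried yet is an index on revenue(timekey).  Only 12 of the 192
time rows match fiscalyear = 2002, so it might let the planner skip the full
scan of revenue, though roughly a quarter of the revenue rows still qualify,
so it may not win.  The index name below is just a placeholder:

-- speculative index, untested; ANALYZE refreshes statistics afterwards
CREATE INDEX revenue_timekey_idx ON revenue (timekey);
ANALYZE revenue;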


This gives you an idea of the size of each table in the query:
------------------------------------------------------------------------------

pubnet=# vacuum analyze verbose revenue;
INFO:  --Relation dw.revenue--
INFO:  Pages 30164: Changed 0, Empty 0; Tup 3829004: Vac 0, Keep 0, UnUsed 17.
        Total CPU 1.87s/0.73u sec elapsed 9.97 sec.
INFO:  Analyzing dw.revenue
VACUUM
pubnet=# vacuum analyze verbose statistic;
INFO:  --Relation dw.statistic--
INFO:  Pages 1: Changed 0, Empty 0; Tup 2: Vac 0, Keep 0, UnUsed 1.
        Total CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  Analyzing dw.statistic
VACUUM
pubnet=# vacuum analyze verbose time;
INFO:  --Relation dw.time--
INFO:  Pages 3: Changed 0, Empty 0; Tup 192: Vac 0, Keep 0, UnUsed 33.
        Total CPU 0.00s/0.00u sec elapsed 0.02 sec.
INFO:  Analyzing dw.time
VACUUM
pubnet=#
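
For reference, the same page and row counts can also be read straight from
pg_class:

select relname, relpages, reltuples
  from pg_class
 where relname in ('revenue', 'statistic', 'time');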


I tried disabling the use of hash joins to see what might happen.  This
causes the planner to use a merge join instead, and the timings are worse.
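
The hash joins were turned off with the usual planner switch, roughly like
this, and then the same EXPLAIN ANALYZE as above was re-run:

-- disable hash joins for this session only
SET enable_hashjoin = off;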

Here is the plan for that:


                                                         QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=665570.44..667963.57 rows=23931 width=48) (actual time=362121.97..381081.18 rows=8 loops=1)
   ->  Group  (cost=665570.44..667365.29 rows=239313 width=48) (actual time=360948.51..376904.14 rows=1082454 loops=1)
         ->  Sort  (cost=665570.44..666168.72 rows=239313 width=48) (actual time=360948.48..363285.85 rows=1082454 loops=1)
               Sort Key: t.fiscalyearquarter, s.description
               ->  Merge Join  (cost=631481.61..635669.60 rows=239313 width=48) (actual time=263257.77..276625.27 rows=1082454 loops=1)
                     Merge Cond: ("outer".statisticskey = "inner".statisticskey)
                     ->  Sort  (cost=631480.58..632078.86 rows=239313 width=32) (actual time=260561.38..264151.04 rows=1082454 loops=1)
                           Sort Key: r.statisticskey
                           ->  Merge Join  (cost=587963.25..610099.74 rows=239313 width=32) (actual time=217380.88..231958.36 rows=1082454 loops=1)
                                 Merge Cond: ("outer".timekey = "inner".timekey)
                                 ->  Sort  (cost=5.62..5.65 rows=12 width=15) (actual time=14.90..14.92 rows=12 loops=1)
                                       Sort Key: t.timekey
                                       ->  Seq Scan on "time" t  (cost=0.00..5.40 rows=12 width=15) (actual time=13.47..14.83 rows=12 loops=1)
                                             Filter: (fiscalyear = 2002::numeric)
                                 ->  Sort  (cost=587957.63..597530.14 rows=3829004 width=17) (actual time=214776.92..224634.94 rows=1455997 loops=1)
                                       Sort Key: r.timekey
                                       ->  Seq Scan on revenue r  (cost=0.00..68454.04 rows=3829004 width=17) (actual time=1.33..31014.95 rows=3829004 loops=1)
                     ->  Sort  (cost=1.03..1.03 rows=2 width=16) (actual time=2696.35..3765.93 rows=541228 loops=1)
                           Sort Key: s.statisticskey
                           ->  Seq Scan on statistic s  (cost=0.00..1.02 rows=2 width=16) (actual time=19.50..19.52 rows=2 loops=1)
 Total runtime: 385939.85 msec


The query plan in Oracle looks like this:
------------------------------------------------------------------------------
SORT GROUP BY
    HASH JOIN
        MERGE JOIN CARTESIAN
            TABLE ACCESS FULL            DBA_ADMIN.STATISTIC
            BUFFER SORT
                TABLE ACCESS FULL        DBA_ADMIN.TIME
        TABLE ACCESS FULL                DBA_ADMIN.REVENUE
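
For completeness: since Oracle starts with the tiny TIME x STATISTIC
cartesian product and only then joins REVENUE, one thing worth testing is
forcing the same shape in PostgreSQL.  In 7.3 explicit JOIN syntax constrains
the join order, so a rewrite along these lines (untested; same tables and
columns as above) might produce a comparable plan:

explain analyze
select fiscalyearquarter, description, sum(amount_quantity)
  from ("time" t cross join statistic s)
       join revenue r
         on r.timekey = t.timekey
        and r.statisticskey = s.statisticskey
 where t.fiscalyear = 2002
 group by fiscalyearquarter, description;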

