Horribly slow query/ sequential scan

Поиск
Список
Период
Сортировка
От Gregory S. Williamson
Тема Horribly slow query/ sequential scan
Дата
Msg-id 71E37EF6B7DCC1499CEA0316A256832802B3EB8A@loki.wc.globexplorer.net
обсуждение исходный текст
Ответы Re: Horribly slow query/ sequential scan  (db@zigo.dhs.org)
Список pgsql-performance
This is a query migrated from postgres. In postgres it runs about 10,000 times *slower* than on informix on somewhat
newerhardware. The problem is entirely due to the planner. This  PostgreSQL 8.1.4 on linux, 2 gigs of ram. 

The table:
          Table "reporting.bill_rpt_work"
    Column     |         Type          | Modifiers
---------------+-----------------------+-----------
 report_id     | integer               |
 client_id     | character varying(10) |
 contract_id   | integer               | not null
 rate          | numeric               | not null
 appid         | character varying(10) | not null
 userid        | text                  | not null
 collection_id | integer               | not null
 client_name   | character varying(60) |
 use_sius      | integer               | not null
 is_subscribed | integer               | not null
 hits          | numeric               | not null
 sius          | numeric               | not null
 total_amnt    | numeric               | not null
 royalty_total | numeric               |
Indexes:
    "billrptw_ndx" UNIQUE, btree (report_id, client_id, contract_id, rate, appid, userid, collection_id)
    "billrpt_cntrct_ndx" btree (report_id, contract_id, client_id)
    "billrpt_collid_ndx" btree (report_id, collection_id, client_id, contract_id)
Foreign-key constraints:
    "$1" FOREIGN KEY (report_id) REFERENCES billing_reports(report_id)
    "$2" FOREIGN KEY (client_id) REFERENCES "work".clients(client_id)


The query:
explain analyze select
w.appid,w.rate,w.is_subscribed,sum(w.hits) AS Hits ,sum(w.sius) AS IUs,
sum(w.total_amnt) AS Total,sum(w.hits) * w.rate AS ByHits,
sum(w.sius) * w.rate AS BYIUS
from bill_rpt_work w, billing_reports b
where w.report_id in
(select b.report_id from billing_reports where b.report_s_date = '2006-09-30')
and (w.client_id = '227400001' or w.client_id = '2274000010')
group by 1,2,3
order by 1,2,3;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------
------------------------------
 GroupAggregate  (cost=298061335.44..298259321.62 rows=26 width=58) (actual time=372213.673..372213.689 rows=2 loops=1)
   ->  Sort  (cost=298061335.44..298083333.83 rows=8799354 width=58) (actual time=372213.489..372213.503 rows=37
loops=1)
         Sort Key: w.appid, w.rate, w.is_subscribed
         ->  Nested Loop  (cost=0.00..296121313.45 rows=8799354 width=58) (actual time=286628.486..372213.053 rows=37
loops=1)
               Join Filter: (subplan)
               ->  Seq Scan on bill_rpt_work w  (cost=0.00..85703.20 rows=11238 width=62) (actual time=1.239..1736.746
rows=61020loops=1) 
                     Filter: (((client_id)::text = '227400001'::text) OR ((client_id)::text = '2274000010'::text))
               ->  Seq Scan on billing_reports b  (cost=0.00..29.66 rows=1566 width=8) (actual time=0.001..0.879
rows=1566loops=61020) 
               SubPlan
                 ->  Result  (cost=0.00..29.66 rows=1566 width=0) (actual time=0.000..0.002 rows=1 loops=95557320)
                       One-Time Filter: ($1 = '2006-09-30'::date)
                       ->  Seq Scan on billing_reports  (cost=0.00..29.66 rows=1566 width=0) (actual time=0.001..0.863
rows=1565loops=61020) 
 Total runtime: 372214.085 ms


Informix uses report id/client id as an index, thus eliminating a huge number of rows. The table has 2280545 rows
currently;slightly fewer when the above analyze was run. Informix has about 5 times as much data. 

select count(*) from bill_rpt_work where report_id in (select report_id from billing_reports where report_s_date =
'2006-09-30')and (client_id = '227400001' or client_id = '2274000010'); 
 count
-------
    37
(1 row)

So scanning everything seems particularly senseless.

I had some success adding client id and report id to the initial select list, but that causes all sorts of problems in
callingprocedures that expect different data grouping. 

Any suggestion would be welcome because this is a horrible show stopper.

Thanks,

Greg Williamson
DBA
GlobeXplorer LLC




В списке pgsql-performance по дате отправления:

Предыдущее
От: "Merlin Moncure"
Дата:
Сообщение: Re: table partioning performance
Следующее
От: "Jim C. Nasby"
Дата:
Сообщение: Re: High update activity, PostgreSQL vs BigDBMS