Обсуждение: Performance problem with joins

Поиск
Список
Период
Сортировка

Performance problem with joins

От
fardeen memon
Дата:
Hi

i have a severe performance problem with one of my
views which has 6 to 8 joins .. any help will be
appreciated..
the view is:

CREATE OR REPLACE VIEW thsn.trade_view AS
SELECT tra.tra_id, tra.per_id, tra.fir_id,
tra.tra_dcn, tra.tra_startdate::date AS tra_startdate,
tra.tra_enddate::date AS tra_enddate,
tra.tra_highprice, tra.tra_lowprice, tra.tra_shares,
tra.tra_marketvalue, tra.tra_commonsharesheld,
tra.tra_directsharesheld, tra.tra_indirectsharesheld,
tra.fun_id, tra.tra_amended, tra.tra_ownership,
tra.tra_touchdate::date AS tra_touchdate,
tra.tra_cdate, tra.tra_udate, tra.tra_relevant,
tra.tra_type, tra.tra_date::date AS tra_date,
per.per_fullname, fir.fir_name, fir.bra_id,
cac90.pc_perf AS tra_performance90, incac90.pc_perf AS
tra_indexperformance90, cac180.pc_perf AS
tra_performance180, incac180.pc_perf AS
tra_indexperformance180, cac270.pc_perf AS
tra_performance270, incac270.pc_perf AS
tra_indexperformance270, kurl.kur_marketcap AS
tra_marketkap, kurl.kur_close AS tra_close,
thsn.per_letztebewertungkauf(per.per_id, fir.fir_id)
AS per_punktekauf,
thsn.per_letztebewertungverkauf(per.per_id,
fir.fir_id) AS per_punkteverkauf, fun.fun_thid,
fun.fun_name, wp.wp_symbol
FROM thsn.trade tra
JOIN thsn.person per ON tra.per_id = per.per_id
JOIN thsn.firma fir ON tra.fir_id = fir.fir_id
LEFT JOIN thsn.kurs_latest kurl ON ('U'::text ||
fir.fir_cusip::text) =kurl.fir_cusip
LEFT JOIN thsn.perfcache90 cac90 ON tra.tra_id =
cac90.tra_id
LEFT JOIN thsn.indexperfcache90 incac90 ON tra.tra_id
= incac90.tra_id
LEFT JOIN thsn.perfcache180 cac180 ON tra.tra_id =
cac180.tra_id
LEFT JOIN thsn.indexperfcache180 incac180 ON
tra.tra_id = incac180.tra_id
LEFT JOIN thsn.perfcache270 cac270 ON tra.tra_id =
cac270.tra_id
LEFT JOIN thsn.indexperfcache270 incac270 ON
tra.tra_id = incac270.tra_id
LEFT JOIN thsn.funktion fun ON tra.fun_id = fun.fun_id
LEFT JOIN thsn.wertpapier wp ON fir.wp_id = wp.wp_id;


and now if i query this view with this explain query :

explain select * from thsn.trade_view tra where
tra_date>'2006-05-29'

the output:

"Merge Right Join (cost=304605.98..319519.02
rows=324367 width=370)"
" Merge Cond: ("outer".wp_id = "inner".wp_id)"
" -> Index Scan using pk_wertpapier on wertpapier wp
(cost=0.00..1134.06 rows=30651 width=12)"
" -> Sort (cost=304605.98..305416.90 rows=324367
width=370)"
" Sort Key: fir.wp_id"
" -> Hash Left Join (cost=102943.82..274914.62
rows=324367 width=370)"
" Hash Cond: ("outer".fun_id = "inner".fun_id)"
" -> Hash Left Join (cost=102942.07..271019.38
rows=324367 width=340)"
" Hash Cond: ("outer".tra_id = "inner".tra_id)"
" -> Hash Left Join (cost=71679.05..216585.25
rows=324367 width=308)"
" Hash Cond: ("outer".tra_id = "inner".tra_id)"
" -> Hash Left Join (cost=53148.50..189791.47
rows=324367 width=297)"
" Hash Cond: ("outer".tra_id = "inner".tra_id)"
" -> Hash Left Join (cost=25994.49..148209.39
rows=324367 width=275)"
" Hash Cond: (('U'::text || ("outer".fir_cusip)::text)
= ("inner".fir_cusip)::text)"
" -> Hash Join (cost=24702.75..133134.22 rows=324367
width=264)"
" Hash Cond: ("outer".per_id = "inner".per_id)"
" -> Hash Join (cost=1450.91..99340.45 rows=324367
width=237)"
" Hash Cond: ("outer".fir_id = "inner".fir_id)"
" -> Seq Scan on trade tra (cost=0.00..88158.53
rows=324367 width=181)"
" Filter: ((tra_date)::date > '2006-05-29'::date)"
"-> Hash (cost=1374.53..1374.53 rows=30553 width=56)"
"-> Seq Scan on firma fir (cost=0.00..1374.53
rows=30553 width=56)"
"-> Hash (cost=22629.87..22629.87 rows=248787
width=27)"
"-> Seq Scan on person per (cost=0.00..22629.87
rows=248787 width=27)"
"-> Hash (cost=1232.59..1232.59 rows=23659 width=35)"
"-> Seq Scan on kurs_latest kurl (cost=0.00..1232.59
rows=23659 width=35)"
"-> Hash (cost=17244.44..17244.44 rows=814044
width=19)"
"-> Seq Scan on perfcache90 cac90 (cost=0.00..17244.44
rows=814044 width=19)"
" -> Hash (cost=6994.97..6994.97 rows=351797
width=19)"
" -> Seq Scan on indexperfcache90 incac90
(cost=0.00..6994.97 rows=351797 width=19)"
" -> Hash (cost=16590.44..16590.44 rows=776044
width=19)"
" -> Seq Scan on perfcache180 cac180
(cost=0.00..16590.44 rows=776044 width=19)"
" -> Hash (cost=6704.00..6704.00 rows=336800
width=18)"
" -> Seq Scan on indexperfcache180 incac180
(cost=0.00..6704.00 rows=336800 width=18)"
" -> Hash (cost=14755.09..14755.09 rows=695309
width=19)"
" -> Seq Scan on perfcache270 cac270
(cost=0.00..14755.09 rows=695309 width=19)"
" -> Hash (cost=6413.93..6413.93 rows=323893
width=19)"
" -> Seq Scan on indexperfcache270 incac270
(cost=0.00..6413.93 rows=323893 width=19)"
" -> Hash (cost=1.60..1.60 rows=60 width=34)"
" -> Seq Scan on funktion fun (cost=0.00..1.60 rows=60
width=34)"


and without the joins if i run a explain on this
query:

EXPLAIN SELECT tra.tra_id, tra.per_id, tra.fir_id,
tra.tra_dcn, tra.tra_startdate::date AS tra_startdate,
tra.tra_enddate::date AS tra_enddate,
tra.tra_highprice, tra.tra_lowprice, tra.tra_shares,
tra.tra_marketvalue, tra.tra_commonsharesheld,
tra.tra_directsharesheld, tra.tra_indirectsharesheld,
tra.fun_id, tra.tra_amended, tra.tra_ownership,
tra.tra_touchdate::date AS tra_touchdate,
tra.tra_cdate, tra.tra_udate, tra.tra_relevant,
tra.tra_type, tra.tra_date::date AS tra_date,
per.per_fullname, fir.fir_name, fir.bra_id,
cac90.pc_perf AS tra_performance90, incac90.pc_perf AS
tra_indexperformance90, cac180.pc_perf AS
tra_performance180, incac180.pc_perf AS
tra_indexperformance180, cac270.pc_perf AS
tra_performance270, incac270.pc_perf AS
tra_indexperformance270, kurl.kur_marketcap AS
tra_marketkap, kurl.kur_close AS tra_close,
thsn.per_letztebewertungkauf(per.per_id, fir.fir_id)
AS per_punktekauf,
thsn.per_letztebewertungverkauf(per.per_id,
fir.fir_id) AS per_punkteverkauf, fun.fun_thid,
fun.fun_name, wp.wp_symbol

FROM thsn.trade tra , thsn.person per, thsn.firma
fir,thsn.kurs_latest kurl , thsn.perfcache90 cac90,
thsn.indexperfcache90 incac90 , thsn.perfcache180
cac180 ,thsn.indexperfcache180 incac180
,thsn.perfcache270 cac270, thsn.indexperfcache270
incac270 , thsn.funktion fun, thsn.wertpapier wp

where tra_date>'2006-06-30' and tra.per_id =
per.per_id and tra.fir_id = fir.fir_id and ('U'::text
|| fir.fir_cusip::text) = kurl.fir_cusip::text and
tra.tra_id = cac90.tra_id and tra.tra_id =
incac90.tra_id and tra.tra_id = cac180.tra_id and
tra.tra_id = incac180.tra_id and tra.tra_id =
cac270.tra_id and tra.tra_id = incac270.tra_id and
tra.fun_id = fun.fun_id and fir.wp_id = wp.wp_id

the output:

"Nested Loop (cost=64179.28..90645.20 rows=394
width=370)"
" -> Nested Loop (cost=64179.28..89072.83 rows=394
width=343)"
" -> Nested Loop (cost=64179.28..87183.66 rows=471
width=372)"
" -> Nested Loop (cost=64179.28..81962.24 rows=1304
width=353)"
" -> Nested Loop (cost=64179.28..74632.57 rows=1825
width=334)"
" -> Merge Join (cost=64179.28..65424.31 rows=2289
width=315)"
" Merge Cond: ("outer".wp_id = "inner".wp_id)"
" -> Index Scan using pk_wertpapier on wertpapier wp
(cost=0.00..1134.06 rows=30651 width=12)"
" -> Sort (cost=64179.28..64185.15 rows=2349
width=315)"
" Sort Key: fir.wp_id"
" -> Seq Scan on indexperfcache180 incac180
(cost=0.00..6704.00 rows=336800 width=18)"
" -> Hash (cost=54717.99..54717.99 rows=9690
width=267)"
" -> Merge Join (cost=42275.34..54717.99 rows=9690
width=267)"
" Merge Cond: ("outer".tra_id = "inner".tra_id)"
" -> Index Scan using pk_indexperfcache270 on
indexperfcache270 incac270 (cost=0.00..11393.83
rows=323893 width=19)"
" -> Sort (cost=42275.34..42348.12 rows=29114
width=248)"
" Sort Key: tra.tra_id"
" -> Hash Join (cost=4224.87..40116.62 rows=29114
width=248)"
" Hash Cond: ("outer".fir_id = "inner".fir_id)"
" -> Bitmap Heap Scan on trade tra
(cost=183.96..35201.91 rows=29133 width=181)"
" Recheck Cond: (tra_date > '2006-06-30
00:00:00'::timestamp without time zone)"
" -> Bitmap Index Scan on trade_date_index
(cost=0.00..183.96 rows=29133 width=0)"
" Index Cond: (tra_date > '2006-06-30
00:00:00'::timestamp without time zone)"
" -> Hash (cost=3964.57..3964.57 rows=30533 width=67)"
" -> Hash Join (cost=1291.74..3964.57 rows=30533
width=67)"
" Hash Cond: (('U'::text || ("outer".fir_cusip)::text)
= ("inner".fir_cusip)::text)"
" -> Seq Scan on firma fir (cost=0.00..1374.53
rows=30553 width=56)"
" -> Hash (cost=1232.59..1232.59 rows=23659 width=35)"
" -> Seq Scan on kurs_latest kurl (cost=0.00..1232.59
rows=23659 width=35)"
" -> Hash (cost=1.60..1.60 rows=60 width=34)"
"-> Seq Scan on funktion fun (cost=0.00..1.60 rows=60
width=34)"
"-> Index Scan using pk_perfcache180 on perfcache180
cac180 (cost=0.00..4.01 rows=1 width=19)"
" Index Cond: ("outer".tra_id = cac180.tra_id)"
"-> Index Scan using pk_perfcache270 on perfcache270
cac270 (cost=0.00..4.00 rows=1 width=19)"
" Index Cond: ("outer".tra_id = cac270.tra_id)"
"-> Index Scan using pk_indexperfcache90 on
indexperfcache90 incac90 (cost=0.00..3.99 rows=1
width=19)"
"Index Cond: ("outer".tra_id = incac90.tra_id)"
" -> Index Scan using pk_perfcache90 on perfcache90
cac90 (cost=0.00..4.00 rows=1 width=19)"
" Index Cond: ("outer".tra_id = cac90.tra_id)"
"-> Index Scan using pk_person on person per
(cost=0.00..3.96 rows=1 width=27)"
" Index Cond: ("outer".per_id = per.per_id)"


In this case the time taken is much less and also the
index in the tra_date cloumn is considered while with
the view the index is not considered and also other
indexes are not considered.

What is it that i am doing wrong?

Thanks in advance.


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

Re: Performance problem with joins

От
Tom Lane
Дата:
fardeen memon <frank69in@yahoo.com> writes:
> What is it that i am doing wrong?

I think the forced coercion to date type in the view case is preventing
the planner from making a good guess about the selectivity of the
condition on tra_date.  It has stats about tra_date's distribution,
but none about the distribution of "tra_date::date".

            regards, tom lane

Re: Performance problem with joins

От
fardeen memon
Дата:
Thanks for the reply .. you are right after i changed  tra_date to timestamp in the view it considered the index and the performance did increase a bit .. but still compared to the query without the joins its much less .. any idea why?

here is the output of the explain query after changing the tra_date column to timestamp.

"Merge Right Join  (cost=229025.77..231549.17 rows=32995 width=366)"
"Merge Cond: ("outer".wp_id = "inner".wp_id)"
"->  Index Scan using pk_wertpapier on wertpapier wp  (cost=0.00..1132.90 rows=30654 width=12)"
"->  Sort  (cost=229025.77..229108.26 rows=32995 width=366)"
"    Sort Key: fir.wp_id"
"->  Hash Left Join  (cost=190376.33..226549.51 rows=32995 width=366)"
"    Hash Cond: ("outer".fun_id = "inner".fun_id)"
"->  Hash Left Join  (cost=190374.58..226147.09 rows=32995 width=336)"
"    Hash Cond: ("outer".tra_id = "inner".tra_id)"
"->  Merge Right Join  (cost=182608.86..211107.70 rows=32995 width=326)"
"    Merge Cond: ("outer".tra_id = "inner".tra_id)"
"->  Index Scan using uk1_perfcache270 on perfcache270 cac270  (cost=0.00..26360.00 rows=695309 width=19)"
"->  Sort  (cost=182608.86..182691.35 rows=32995 width=315)"
"    Sort Key: tra.tra_id"
"->  Hash Left Join  (cost=143070.31..180132.60 rows=32995 width=315)"
"    Hash Cond: ("outer".tra_id = "inner".tra_id)"
"->  Hash Left Join  (cost=134981.89..163162.72 rows=32995 width=305)"
"    Hash Cond: ("outer".tra_id = "inner".tra_id)"
"->  Merge Right Join  (cost=116451.34..130707.85 rows=32995 width=294)"
"    Merge Cond: ("outer".tra_id = "inner".tra_id)"
"->  Index Scan using pk_indexperfcache90 on indexperfcache90 incac90  (cost=0.00..12969.65 rows=395189 width=18)"
"->  Sort  (cost=116451.34..116533.83 rows=32995 width=284)"
"    Sort Key: tra.tra_id"
"->  Merge Right Join  (cost=80758.34..113975.08 rows=32995 width=284)"
"    Merge Cond: ("outer".tra_id = "inner".tra_id)"
"->  Index Scan using uk1_perfcache90 on perfcache90 cac90  (cost=0.00..30740.84 rows=814044 width=19)"
"->  Sort  (cost=80758.34..80840.83 rows=32995 width=273)"
"    Sort Key: tra.tra_id"
"->  Hash Left Join  (cost=26205.11..78282.08 rows=32995 width=273)"
"    Hash Cond: (('U'::text || ("outer".fir_cusip)::text) = ("inner".fir_cusip)::text)"
"->  Hash Join  (cost=24911.18..75586.30 rows=32995 width=263)"
"    Hash Cond: ("outer".per_id = "inner".per_id)"
"->  Hash Join  (cost=1658.41..40649.44 rows=32995 width=236)"
"    Hash Cond: ("outer".fir_id = "inner".fir_id)"
"->  Bitmap Heap Scan on trade tra  (cost=207.48..38208.67 rows=32995 width=180)"
"    Recheck Cond: (tra_date > '2006-06-30 00:00:00'::timestamp without time zone)"
"->  Bitmap Index Scan on trade_date_index  (cost=0.00..207.48 rows=32995 width=0)"
"    Index Cond: (tra_date > '2006-06-30 00:00:00'::timestamp without time zone)"
"->  Hash  (cost=1374.54..1374.54 rows=30554 width=56)"
"->  Seq Scan on firma fir  (cost=0.00..1374.54 rows=30554 width=56)"
"->  Hash  (cost=22630.62..22630.62 rows=248862 width=27)"
"->  Seq Scan on person per  (cost=0.00..22630.62 rows=248862 width=27)"
"->  Hash  (cost=1234.74..1234.74 rows=23674 width=34)"
"->  Seq Scan on kurs_latest kurl  (cost=0.00..1234.74 rows=23674 width=34)"
"->  Hash  (cost=16590.44..16590.44 rows=776044 width=19)"
"->  Seq Scan on perfcache180 cac180  (cost=0.00..16590.44 rows=776044 width=19)"
"->  Hash  (cost=7137.93..7137.93 rows=380193 width=18)"
"->  Seq Scan on indexperfcache180 incac180  (cost=0.00..7137.93 rows=380193 width=18)"
"->  Hash  (cost=6847.57..6847.57 rows=367257 width=18)"
"->  Seq Scan on indexperfcache270 incac270  (cost=0.00..6847.57 rows=367257 width=18)"
"->  Hash  (cost=1.60..1.60 rows=60 width=34)"
"->  Seq Scan on funktion fun  (cost=0.00..1.60 rows=60 width=34)"


It is still doing a sequence scan on the person , perfcache180 and perfcache270 table and with out the joins it performs a index scan on these tables.

Is something wrong with the view?

once again thanks for your help.

Tom Lane <tgl@sss.pgh.pa.us> wrote:
fardeen memon writes:
> What is it that i am doing wrong?

I think the forced coercion to date type in the view case is preventing
the planner from making a good guess about the selectivity of the
condition on tra_date. It has stats about tra_date's distribution,
but none about the distribution of "tra_date::date".

regards, tom lane


Stay in the know. Pulse on the new Yahoo.com. Check it out.

Re: Performance problem with joins

От
Tom Lane
Дата:
fardeen memon <frank69in@yahoo.com> writes:
>     here is the output of the explain query after changing the tra_date column to timestamp.

If you want intelligent commentary, please (a) post EXPLAIN ANALYZE not
EXPLAIN output, and (b) don't mangle the indentation.  This is just
about unreadable :-(

            regards, tom lane