Обсуждение: Fwd: Help with view performance problem

Поиск
Список
Период
Сортировка

Fwd: Help with view performance problem

От
Chris Hoover
Дата:
Does anyone have any suggestions on this?  I did not get any response
from the admin list.

Thanks,

Chris

---------- Forwarded message ----------
From: Chris Hoover <revoohc@gmail.com>
Date: Jul 27, 2005 12:29 PM
Subject: Re: Help with view performance problem
To: pgsql-admin@postgresql.org


I did some more testing, and ran the explain analyze on the problem.
In my session I did a set enable_hashjoin = false and then ran the
analyze.  This caused it to use the indexes as I have been expecting
it to do.

Now, how can I get it to use the indexes w/o manipulating the
environment?  What make postgresql want to sequentially scan and use a
hash join?

thanks,

Chris

explain analyze with set_hashjoin=false;
prob_db=#explain analyze select * from clm_com;



                            QUERY PLAN




----------------------------------------------------------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan clm_com  (cost=1057975.45..1169021.26 rows=126910
width=366) (actual time=142307.99..225997.22 rows=1268649 loops=1)
   ->  Unique  (cost=1057975.45..1169021.26 rows=126910 width=366)
(actual time=142307.96..206082.30 rows=1268649 loops=1)
         ->  Sort  (cost=1057975.45..1061148.19 rows=1269095
width=366) (actual time=142307.95..156019.01 rows=1268649 loops=1)
               Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
clmcom1.pat_zip, clmcom1.payto_addr_1,
clmcom1.payto_addr_2, clmcom1.payto_city, clmcom1.payto_cntry,
clmcom1.payto_f_name, clmcom1.payto_m_name, clmcom1.payto_state,
clmcom1.payto_zip, clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
               ->  Merge Join  (cost=0.00..565541.46 rows=1269095
width=366) (actual time=464.89..130638.06 rows=1268649 loops=1)
                     Merge Cond: ("outer".inv_nbr = "inner".inv_nbr)
                     Join Filter: ("outer".inv_qfr = "inner".inv_qfr)
                     ->  Index Scan using clmcom1_inv_nbr_iview_idx on
clmcom1  (cost=0.00..380534.32 rows=1269095 width=270) (actual
time=0.27..82159.37 rows=1268649 loops=1)
                     ->  Index Scan using clmcom2_inv_nbr_iview_idx on
clmcom2  (cost=0.00..159636.25 rows=1271198 width=96) (actual
time=464.56..21774.02 rows=1494019 loops=1)
 Total runtime: 227369.39 msec
(10 rows)



On 7/27/05, Chris Hoover <revoohc@gmail.com> wrote:
> I am having a problem with a view on one of my db's.  This view is
> trying to sequentially can the 2 tables it is accessing.  However,
> when I explain the view on most of my other db's (all have the same
> schema's), it is using the indexes.  Can anyone please help me
> understand why postgres is choosing to sequenially scan both tables?
>
> Both tables in the view have a primary key defined on inv_nbr,
> inv_qfr.  Vacuum and analyze have been run on the tables in question
> to try and make sure stats are up to date.
>
> Thanks,
>
> Chris
> PG - 7.3.4
> RH 2.1
>
>
> Here is the view definition:
> SELECT DISTINCT clmcom1.inv_nbr AS inventory_number,
>                               clmcom1.inv_qfr AS inventory_qualifier,
>                               clmcom1.pat_addr_1 AS patient_address_1,
>                               clmcom1.pat_addr_2 AS patient_address_2,
>                               clmcom1.pat_city AS patient_city,
>                               clmcom1.pat_cntry AS patient_country,
>                               clmcom1.pat_dob AS patient_date_of_birth,
>                               clmcom1.pat_gender_cd AS patient_gender_code,
>                               clmcom1.pat_info_pregnancy_ind AS pregnancy_ind,
>                               clmcom1.pat_state AS patient_state,
>                               clmcom1.pat_suffix AS patient_suffix,
>                               clmcom1.pat_zip AS patient_zip_code,
>                               clmcom1.payto_addr_1 AS payto_address_1,
>                               clmcom1.payto_addr_2 AS payto_address_2,
>                               clmcom1.payto_city,
>                               clmcom1.payto_cntry AS payto_country,
>                               clmcom1.payto_f_name AS payto_first_name,
>                               clmcom1.payto_m_name AS payto_middle_name,
>                               clmcom1.payto_state,
>                               clmcom1.payto_zip AS payto_zip_code,
>                               clmcom1.clm_tot_clm_chgs AS total_claim_charge,
>                               clmcom1.bill_l_name_org AS
> billing_last_name_or_org,
>                               clmcom1.clm_delay_rsn_cd AS
> claim_delay_reason_code,
>                               clmcom1.clm_submit_rsn_cd AS
> claim_submit_reason_code,
>                               clmcom1.payto_l_name_org AS
> payto_last_name_or_org,
>                               clmcom1.payto_prim_id AS payto_primary_id,
>                               clmcom1.bill_prim_id AS billing_prov_primary_id,
>                               clmcom1.clm_tot_ncov_chgs AS total_ncov_charge,
>                               clmcom2.contract_amt AS contract_amount,
>                               clmcom2.svc_fac_or_lab_name,
>                               clmcom2.svc_fac_addr_1 AS svc_fac_address_1,
>                               clmcom2.svc_fac_addr_2 AS svc_fac_address_2,
>                               clmcom2.svc_fac_city,
>                               clmcom2.svc_fac_zip AS svc_fac_zip_code
> FROM (clmcom1 LEFT JOIN clmcom2 ON (((clmcom1.inv_nbr =
> clmcom2.inv_nbr) AND
>
> (clmcom1.inv_qfr = clmcom2.inv_qfr))))
> ORDER BY clmcom1.inv_nbr,
>                   clmcom1.inv_qfr,
>                   clmcom1.pat_addr_1,
>                   clmcom1.pat_addr_2,
>                   clmcom1.pat_city,
>                   clmcom1.pat_cntry,
>                   clmcom1.pat_dob,
>                   clmcom1.pat_gender_cd,
>                   clmcom1.pat_info_pregnancy_ind,
>                   clmcom1.pat_state,
>                   clmcom1.pat_suffix,
>                   clmcom1.pat_zip,
>                   clmcom1.payto_addr_1,
>                   clmcom1.payto_addr_2,
>                   clmcom1.payto_city,
>                   clmcom1.payto_cntry,
>                   clmcom1.payto_f_name,
>                   clmcom1.payto_m_name,
>                   clmcom1.payto_state,
>                   clmcom1.payto_zip,
>                   clmcom1.clm_tot_clm_chgs,
>                   clmcom1.bill_l_name_org,
>                   clmcom1.clm_delay_rsn_cd,
>                   clmcom1.clm_submit_rsn_cd,
>                   clmcom1.payto_l_name_org,
>                   clmcom1.payto_prim_id,
>                   clmcom1.bill_prim_id,
>                   clmcom1.clm_tot_ncov_chgs,
>                   clmcom2.contract_amt,
>                   clmcom2.svc_fac_or_lab_name,
>                   clmcom2.svc_fac_addr_1,
>                   clmcom2.svc_fac_addr_2,
>                   clmcom2.svc_fac_city,
>                   clmcom2.svc_fac_zip;
>
>
> Here is the explain analyze from the problem db:
> prob_db=# explain analyze select * from clm_com;
>
>
>
>                           QUERY PLAN
>
>
>
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Subquery Scan clm_com  (cost=1039824.35..1150697.61 rows=126712
> width=367) (actual time=311792.78..405819.03 rows=1266114 loops=1)
>    ->  Unique  (cost=1039824.35..1150697.61 rows=126712 width=367)
> (actual time=311792.74..386313.14 rows=1266114 loops=1)
>          ->  Sort  (cost=1039824.35..1042992.16 rows=1267123
> width=367) (actual time=311792.74..338189.48 rows=1266114 loops=1)
>                Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
> clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
> clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
> clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
> clmcom1.pat_zip, clmcom1.payto_addr_1,
> clmcom1.payto_addr_2, clmcom1.payto_city, clmcom1.payto_cntry,
> clmcom1.payto_f_name, clmcom1.payto_m_name, clmcom1.payto_state,
> clmcom1.payto_zip, clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
> clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
> clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
> clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
> clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
> clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
>                ->  Hash Join  (cost=132972.78..548171.70 rows=1267123
> width=367) (actual time=16999.32..179359.43 rows=1266114 loops=1)
>                      Hash Cond: ("outer".inv_nbr = "inner".inv_nbr)
>                      Join Filter: ("outer".inv_qfr = "inner".inv_qfr)
>                      ->  Seq Scan on clmcom1  (cost=0.00..267017.23
> rows=1267123 width=271) (actual time=0.11..84711.83 rows=1266114
> loops=1)
>                      ->  Hash  (cost=111200.82..111200.82 rows=1269582
> width=96) (actual time=16987.45..16987.45 rows=0 loops=1)
>                            ->  Seq Scan on clmcom2
> (cost=0.00..111200.82 rows=1269582 width=96) (actual
> time=0.07..12164.81 rows=1266108 loops=1)
>  Total runtime: 407317.47 msec
> (11 rows)
> ~
>
>
>
> Here is the explain analyze from a good db (on the same postgres cluster);
> good_db=# explain analyze select * from clm_com;
>
>
>
>
>
>                             QUERY PLAN
>
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>  Subquery Scan clm_com  (cost=78780.59..89498.29 rows=12249 width=359)
> (actual time=73045.36..79974.37 rows=122494 loops=1)
>    ->  Unique  (cost=78780.59..89498.29 rows=12249 width=359) (actual
> time=73045.28..78031.99 rows=122494 loops=1)
>          ->  Sort  (cost=78780.59..79086.81 rows=122488 width=359)
> (actual time=73045.28..73362.94 rows=122494 loops=1)
>                Sort Key: clmcom1.inv_nbr, clmcom1.inv_qfr,
> clmcom1.pat_addr_1, clmcom1.pat_addr_2, clmcom1.pat_city,
> clmcom1.pat_cntry, clmcom1.pat_dob, clmcom1.pat_gender_cd,
> clmcom1.pat_info_pregnancy_ind, clmcom1.pat_state, clmcom1.pat_suffix,
> clmcom1.pat_zip, clmcom1.payto_addr_1, clmcom1.payto_addr_2,
> clmcom1.payto_city, clmcom1.payto_cntry, clmcom1.payto_f_name,
> clmcom1.payto_m_name, clmcom1.payto_state, clmcom1.payto_zip,
> clmcom1.clm_tot_clm_chgs, clmcom1.bill_l_name_org,
> clmcom1.clm_delay_rsn_cd, clmcom1.clm_submit_rsn_cd,
> clmcom1.payto_l_name_org, clmcom1.payto_prim_id, clmcom1.bill_prim_id,
> clmcom1.clm_tot_ncov_chgs, clmcom2.contract_amt,
> clmcom2.svc_fac_or_lab_name, clmcom2.svc_fac_addr_1,
> clmcom2.svc_fac_addr_2, clmcom2.svc_fac_city, clmcom2.svc_fac_zip
>                ->  Merge Join  (cost=0.00..56945.12 rows=122488
> width=359) (actual time=54.76..71635.65 rows=122494 loops=1)
>                      Merge Cond: (("outer".inv_nbr = "inner".inv_nbr)
> AND ("outer".inv_qfr = "inner".inv_qfr))
>                      ->  Index Scan using clmcom1_pkey on clmcom1
> (cost=0.00..38645.61 rows=122488 width=267) (actual
> time=25.60..49142.16 rows=122494 loops=1)
>                      ->  Index Scan using clmcom2_pkey on clmcom2
> (cost=0.00..16004.08 rows=122488 width=92) (actual
> time=29.09..19418.94 rows=122494 loops=1)
>  Total runtime: 80162.26 msec
> (9 rows)
>

Re: Fwd: Help with view performance problem

От
Dan Harris
Дата:
On Jul 28, 2005, at 8:38 AM, Chris Hoover wrote:
>
>
> I did some more testing, and ran the explain analyze on the problem.
> In my session I did a set enable_hashjoin = false and then ran the
> analyze.  This caused it to use the indexes as I have been expecting
> it to do.
>
> Now, how can I get it to use the indexes w/o manipulating the
> environment?  What make postgresql want to sequentially scan and use a
> hash join?
>
> thanks,
>
> Chris
>
> explain analyze with set_hashjoin=false;
> prob_db=#explain analyze select * from clm_com;
>
>

I had something similar to this happen recently.  The planner was
choosing a merge join and seq scan because my 'random_page_cost' was
set too high.  I had it at 3 , and ended up settling at 1.8 to get it
to correctly use my indices.  Once that change was in place, the
planner did the 'right' thing for me.

Not sure if this will help you, but it sounds similar.

-Dan

Re: Fwd: Help with view performance problem

От
Chris Hoover
Дата:
I'm alreading running at 1.5.  It looks like if I drop the
random_page_cost t0 1.39, it starts using the indexes.  Are there any
unseen issues with dropping the random_page_cost this low?

Thanks,

Chris

On 7/28/05, Dan Harris <fbsd@drivefaster.net> wrote:
>
> On Jul 28, 2005, at 8:38 AM, Chris Hoover wrote:
> >
> >
> > I did some more testing, and ran the explain analyze on the problem.
> > In my session I did a set enable_hashjoin = false and then ran the
> > analyze.  This caused it to use the indexes as I have been expecting
> > it to do.
> >
> > Now, how can I get it to use the indexes w/o manipulating the
> > environment?  What make postgresql want to sequentially scan and use a
> > hash join?
> >
> > thanks,
> >
> > Chris
> >
> > explain analyze with set_hashjoin=false;
> > prob_db=#explain analyze select * from clm_com;
> >
> >
>
> I had something similar to this happen recently.  The planner was
> choosing a merge join and seq scan because my 'random_page_cost' was
> set too high.  I had it at 3 , and ended up settling at 1.8 to get it
> to correctly use my indices.  Once that change was in place, the
> planner did the 'right' thing for me.
>
> Not sure if this will help you, but it sounds similar.
>
> -Dan
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>