Re: Query performance issue

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема Re: Query performance issue
Дата
Msg-id 20210122023514.GD27167@telsasoft.com
обсуждение исходный текст
Ответ на Query performance issue  (Nagaraj Raj <nagaraj.sf@yahoo.com>)
Ответы Re: Query performance issue  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список pgsql-performance
On Fri, Jan 22, 2021 at 01:53:26AM +0000, Nagaraj Raj wrote:
> Tables ddl are attached in dbfiddle -- Postgres 11 | db<>fiddle
> Postgres 11 | db<>fiddle
> Server configuration is: Version: 10.11RAM - 320GBvCPU - 32 "maintenance_work_mem" 256MB"work_mem"            
1GB"shared_buffers"64GB
 

> Aggregate  (cost=31.54..31.55 rows=1 width=8) (actual time=0.010..0.012 rows=1 loops=1)
>   ->  Nested Loop  (cost=0.00..31.54 rows=1 width=8) (actual time=0.007..0.008 rows=0 loops=1)
>         Join Filter: (a.household_entity_proxy_id = c.household_entity_proxy_id)
>         ->  Nested Loop  (cost=0.00..21.36 rows=1 width=16) (actual time=0.006..0.007 rows=0 loops=1)
>               Join Filter: (a.individual_entity_proxy_id = b.individual_entity_proxy_id)
>               ->  Seq Scan on prospect a  (cost=0.00..10.82 rows=1 width=16) (actual time=0.006..0.006 rows=0
loops=1)
>                     Filter: (((last_contacted_anychannel_dttm IS NULL) OR (last_contacted_anychannel_dttm <
'2020-11-2300:00:00'::timestamp without time zone)) AND (shared_paddr_with_customer_ind = 'N'::bpchar) AND
(profane_wrd_ind= 'N'::bpchar) AND (tmo_ofnsv_name_ind = 'N'::bpchar) AND (has_individual_address = 'Y'::bpchar) AND
(has_last_name= 'Y'::bpchar) AND (has_first_name = 'Y'::bpchar))
 
>               ->  Seq Scan on individual_demographic b  (cost=0.00..10.53 rows=1 width=8) (never executed)
>                     Filter: ((tax_bnkrpt_dcsd_ind = 'N'::bpchar) AND (govt_prison_ind = 'N'::bpchar) AND
((cstmr_prspct_ind)::text= 'Prospect'::text))
 
>         ->  Seq Scan on household_demographic c  (cost=0.00..10.14 rows=3 width=8) (never executed)
>               Filter: (((hspnc_lang_prfrnc_cval)::text = ANY ('{B,E,X}'::text[])) OR (hspnc_lang_prfrnc_cval IS
NULL))
> Planning Time: 1.384 ms
> Execution Time: 0.206 ms
> 13 rows

It's doing nested loops with estimated rowcount=1, which indicates a bad
underestimate, and suggests that the conditions are redundant or correlated.

Maybe you can handle this with MV stats on the correlated columns:

CREATE STATISTICS prospect_stats (dependencies) ON
    shared_paddr_with_customer_ind, profane_wrd_ind, tmo_ofnsv_name_ind, has_individual_address, has_last_name,
has_first_name
    FROM prospect;
CREATE STATISTICS individual_demographic_stats (dependencies) ON
    tax_bnkrpt_dcsd_ind, govt_prison_ind, cstmr_prspct_ind
    FROM individual_demographic_stats 
ANALYZE prospect, individual_demographic_stats ;

Since it's expensive to compute stats on large number of columns, I'd then
check *which* are correlated and then only compute MV stats on those.  This
will show col1=>col2: X where X approaches 1, the conditions are highly
correlated:
SELECT * FROM pg_statistic_ext; -- pg_statistic_ext_data since v12

Also, as a diagnostic tool to get "explain analyze" to finish, you can
SET enable_nestloop=off;

-- 
Justin



В списке pgsql-performance по дате отправления:

Предыдущее
От: Nagaraj Raj
Дата:
Сообщение: Query performance issue
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: Query performance issue