Обсуждение: [PERFORM] query runs for more than 24 hours!

Поиск
Список
Период
Сортировка

[PERFORM] query runs for more than 24 hours!

От
Mariel Cherkassky
Дата:

Hi, I have a query that I run in my postgresql 9.6 database and it runs for more than 24 hours and doesnt finish.

My select consist from few joins :

SELECT a.inst_prod_id,
product_id,
nap_area2,
nap_phone_num,
nap_product_id,
b.nap_discount_num,
b.nap_makat_cd,
nap_act_start_dt,
b.nap_debt_line, nap_act_end_dt,
b.row_added_dttm
b.row_lastmant_dttm, FROM ps_rf_inst_prod a,
AND a.setid || ''= 'SHARE'
nap_ip_discount b WHERE nap_crm_status = 'C_04'
AND b.nap_makat_cd IN (SELECT term_code
AND b.setid || ''= 'SHARE' AND a.inst_prod_id = b.inst_prod_id
AND start_date <= b.nap_rishum_date
FROM tv_finterm WHERE pricing_method_code in ('2', '4')
AND coalesce(end_date, to_date('01/01/2095','DD/MM/YYYY')) !=
AND coalesce(end_date, to_date('01/01/2095','DD/MM/YYYY')) >= b.nap_rishum_date start_date)
AND (b.row_lastmant_dttm > to_date('01/01/2005','DD/MM/YYYY') OR
AND b.nap_act_end_dt > clock_timestamp() AND TRUNC(b.nap_act_start_dt) < TRUNC(b.nap_act_end_dt) b.nap_rishum_date > to_date('01/01/2005','DD/MM/YYYY') OR
WHERE PERCENT IS NOT NULL
b.row_added_dttm > to_date('01/01/2005','DD/MM/YYYY')) AND b.nap_discount_num IN (SELECT k.discount_line FROM tv_discounts_details k AND k.start_month = 1)
AND c.phone = a.nap_phone_num
AND (NOT EXISTS(SELECT /*+index(c TC_FINTERMS_I_SERVICE) */ 1 FROM tc_finterms c WHERE c.area = a.nap_area2 AND c.term_code = b.nap_makat_cd
WHERE service_uid = (a.inst_prod_id)::integer
AND deb_cred_line_no = b.nap_debt_line AND (payment_end_date > clock_timestamp()) AND term_type = '2') OR NOT EXISTS(SELECT 1 FROM ip_service_discounts AND service_code = b.nap_makat_cd
and b.nap_purch_instprod = ' ';
AND discount_code = b.nap_discount_num
AND (end_date IS NULL OR coalesce(discount_end_date, clock_timestamp() + interval '1 days') > clock_timestamp())))

Before trying to work on performance I checked locks and nothing returned :

=# select a1.query as blocking_query, a2. query as waiting_query,
t.schemaname ||'.'||t.relname as locked_table from pg_stat_activity
a1 join pg_locks p1 on a1. pid = p1.pid and p1.granted join pg_locks
pg_stat_activity a2 on a2. pid = p2.pid join pg_stat_all_tables t on
p2 on p1.relation = p2.relation and not p2.granted join p1.relation = t.relid;
(0 rows)
blocking_query | waiting_query | locked_table
----------------+---------------+--------------

I checked the explain plan of my query :

Nested Loop Semi Join (cost=0.43..7565655389.26 rows=1 width=93)
Join Filter: (b.nap_discount_num = (k.discount_line)::numeric)
-> Seq Scan on ps_rf_inst_prod a (cost=0.00..4337158.91 rows=40452 width=41)
-> Nested Loop (cost=0.43..7565653159.07 rows=2 width=93)
-> Index Scan using ps_nap_ip_discount on nap_ip_discount b (cost=0.43..186920.69 rows=1 width=60)
Filter: (((nap_crm_status)::text = 'C_04'::text) AND (((setid)::text || ''::text) = 'SHARE'::text))
Filter: (((nap_purch_instprod)::text = ' '::text) AND (nap_act_end_dt > clock_timestamp()) AND (((setid)::text || ''::t
Index Cond: ((inst_prod_id)::text = (a.inst_prod_id)::text)
e('01/01/2005'::text, 'DD/MM/YYYY'::text)) OR (nap_rishum_date > to_date('01/01/2005'::text, 'DD/MM/YYYY'::text)) OR (row_added_dttm >
ext) = 'SHARE'::text) AND (trunc(nap_act_start_dt, 'DDD'::text) < trunc(nap_act_end_dt, 'DDD'::text)) AND ((row_lastmant_dttm > to_dat
-> Index Scan using tc_finterms_ix1 on tc_finterms c (cost=0.56..8.60 rows=1 width=0)
to_date('01/01/2005'::text, 'DD/MM/YYYY'::text))) AND ((NOT (SubPlan 2)) OR (NOT (SubPlan 3))) AND (SubPlan 1)) SubPlan 2
b.nap_makat_cd) AND (deb_cred_line_no = (b.nap_debt_line)::double precision))
Index Cond: (((area)::text = (a.nap_area2)::text) AND ((phone)::text = (a.nap_phone_num)::text)) Filter: (((term_type)::text = '2'::text) AND (payment_end_date > clock_timestamp()) AND ((term_code)::numeric =
Filter: (((service_code)::numeric = b.nap_makat_cd) AND ((discount_code)::numeric = b.nap_discount_num) AND ((e
SubPlan 3 -> Index Scan using ip_service_discounts_pkey on ip_service_discounts (cost=0.56..10.78 rows=1 width=0) Index Cond: (service_uid = (a.inst_prod_id)::integer)
Recheck Cond: (((pricing_method_code)::text = ANY ('{2,4}'::text[])) AND (start_date <= b.nap_rishum_date))
nd_date IS NULL) OR (COALESCE((discount_end_date)::timestamp with time zone, (clock_timestamp() + '1 day'::interval)) > clock_timestam p()))) SubPlan 1 -> Bitmap Heap Scan on tv_finterm (cost=2290.83..17301.61 rows=26907 width=4)
-> Bitmap Index Scan on index_test_mariel (cost=0.00..2284.11 rows=81126 width=0)
Filter: ((COALESCE(end_date, (to_date('01/01/2095'::text, 'DD/MM/YYYY'::text))::timestamp without time zone) >=b.nap_rishum_date) AND (COALESCE(end_date, (to_date('01/01/2095'::text, 'DD/MM/YYYY'::text))::timestamp without time zone) <> start_d ate))
(25 rows)
Index Cond: (((pricing_method_code)::text = ANY ('{2,4}'::text[])) AND (start_date <= b.nap_rishum_date)) -> Materialize (cost=0.00..1407.38 rows=43933 width=4) -> Seq Scan on tv_discounts_details k (cost=0.00..1187.71 rows=43933 width=4)
Filter: ((percent IS NOT NULL) AND (start_month = 1))

I run vacuum analyze database before running the query. Some info about the tables :

ps_rf_inst_prod - 32G
nap_ip_discount-1G
tv_finterm - 100M
tc_finterms - 6G
TV_FINTERM - 1G

This query is part of an app that I migrated from oracle to postgresql. I dont want to change the query much, looking for a way to change the plan to make it faster.. I have indexes on ps_rf_inst_prod, when I delete the pipelines in :

AND a.setid || ''= 'SHARE'
AND b.setid || ''= 'SHARE'

the plan is changing and it uses indexes on ps_rf_inst_prod but it costs more and the performance are worse.

Please , HELP...

Re: [PERFORM] query runs for more than 24 hours!

От
Tomas Vondra
Дата:

On 08/22/2017 04:23 PM, Mariel Cherkassky wrote:
> Hi, I have a query that I run in my postgresql 9.6 database and it runs
> for more than 24 hours and doesnt finish.
>
> My select consist from few joins :
>

I'm sorry, but the query and plans are completely broken (wrapped in
funny ways, missing important bits. ...) I don't know what client you
use or how that happened, but I recommend attaching the information as
text files instead of pasting it into the message directly.

Regarding the query analysis - we can't really help you much without
seeing an explain analyze (that is, not just the plan and estimates, but
actual performance and row counts). That usually identifies the query
operations (scans, join, ...) causing issues.

Of course, if the query is already running for 24h and you don't know
how much longer it will take to complete, running EXPLAIN ANALYZE on it
is not very practical. The best thing you can do is break the query into
smaller parts and debugging that - start with one table, and then add
tables/conditions until the performance gets bad. Hopefully the explain
analyze on that will complete in reasonable time.

Of course, you haven't told us anything about what's happening on the
machine. It is reading a lot of data from the disks? Random or
sequential? Is it writing a lot of data into temporary files? Is it
consuming a lot of CPU? And so on.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PERFORM] query runs for more than 24 hours!

От
vinny
Дата:
On 2017-08-22 16:23, Mariel Cherkassky wrote:

>
> SELECT a.inst_prod_id,
>                                            product_id,
>                           nap_area2,
>                                            nap_phone_num,
>                              nap_product_id,
>                                            b.nap_discount_num,
>                                   b.nap_makat_cd,
>                                            nap_act_start_dt,
>                                 b.nap_debt_line,
>                                            nap_act_end_dt,
>                                            b.row_added_dttm
>                                b.row_lastmant_dttm,
>                                     FROM   ps_rf_inst_prod a,
>                                     AND    a.setid  || ''= 'SHARE'
>                                       nap_ip_discount b
>                                     WHERE  nap_crm_status  = 'C_04'
>                                     AND    b.nap_makat_cd IN (SELECT
> term_code                                    AND    b.setid  || ''=
> 'SHARE'
>                                     AND    a.inst_prod_id =


On my screen the order of the lines in the query seem to get messed up,
I'm not sure if that's my email program or a copy/paste error.

 From what I can see, you are using subselects in an IN statement,
which can be a problem if that has to be re-evaluated a lot.

It's hard for me to say more because I can't tell what the actual query
is at the moment.

Regards, Vincent.