Обсуждение: [PERFORM] query runs for more than 24 hours!
Hi, I have a query that I run in my postgresql 9.6 database and it runs for more than 24 hours and doesnt finish.
My select consist from few joins :
SELECT a.inst_prod_id,
product_id,nap_area2,nap_phone_num,nap_product_id,b.nap_discount_num,b.nap_makat_cd,nap_act_start_dt,b.nap_debt_line, nap_act_end_dt,b.row_added_dttmb.row_lastmant_dttm, FROM ps_rf_inst_prod a,AND a.setid || ''= 'SHARE'nap_ip_discount b WHERE nap_crm_status = 'C_04'AND b.nap_makat_cd IN (SELECT term_codeAND b.setid || ''= 'SHARE' AND a.inst_prod_id = b.inst_prod_idAND start_date <= b.nap_rishum_dateFROM tv_finterm WHERE pricing_method_code in ('2', '4')AND coalesce(end_date, to_date('01/01/2095','DD/MM/YYYY')) !=AND coalesce(end_date, to_date('01/01/2095','DD/MM/YYYY')) >= b.nap_rishum_date start_date)AND (b.row_lastmant_dttm > to_date('01/01/2005','DD/MM/YYYY') ORAND b.nap_act_end_dt > clock_timestamp() AND TRUNC(b.nap_act_start_dt) < TRUNC(b.nap_act_end_dt) b.nap_rishum_date > to_date('01/01/2005','DD/MM/YYYY') ORWHERE PERCENT IS NOT NULLb.row_added_dttm > to_date('01/01/2005','DD/MM/YYYY')) AND b.nap_discount_num IN (SELECT k.discount_line FROM tv_discounts_details k AND k.start_month = 1)AND c.phone = a.nap_phone_numAND (NOT EXISTS(SELECT /*+index(c TC_FINTERMS_I_SERVICE) */ 1 FROM tc_finterms c WHERE c.area = a.nap_area2 AND c.term_code = b.nap_makat_cdWHERE service_uid = (a.inst_prod_id)::integerAND deb_cred_line_no = b.nap_debt_line AND (payment_end_date > clock_timestamp()) AND term_type = '2') OR NOT EXISTS(SELECT 1 FROM ip_service_discounts AND service_code = b.nap_makat_cdand b.nap_purch_instprod = ' ';AND discount_code = b.nap_discount_numAND (end_date IS NULL OR coalesce(discount_end_date, clock_timestamp() + interval '1 days') > clock_timestamp())))
Before trying to work on performance I checked locks and nothing returned :
=# select a1.query as blocking_query, a2. query as waiting_query,
t.schemaname ||'.'||t.relname as locked_table from pg_stat_activitya1 join pg_locks p1 on a1. pid = p1.pid and p1.granted join pg_lockspg_stat_activity a2 on a2. pid = p2.pid join pg_stat_all_tables t onp2 on p1.relation = p2.relation and not p2.granted join p1.relation = t.relid;(0 rows)blocking_query | waiting_query | locked_table----------------+---------------+--------------
I checked the explain plan of my query :
Nested Loop Semi Join (cost=0.43..7565655389.26 rows=1 width=93)Join Filter: (b.nap_discount_num = (k.discount_line)::numeric)
-> Seq Scan on ps_rf_inst_prod a (cost=0.00..4337158.91 rows=40452 width=41)-> Nested Loop (cost=0.43..7565653159.07 rows=2 width=93)-> Index Scan using ps_nap_ip_discount on nap_ip_discount b (cost=0.43..186920.69 rows=1 width=60)Filter: (((nap_crm_status)::text = 'C_04'::text) AND (((setid)::text || ''::text) = 'SHARE'::text))Filter: (((nap_purch_instprod)::text = ' '::text) AND (nap_act_end_dt > clock_timestamp()) AND (((setid)::text || ''::tIndex Cond: ((inst_prod_id)::text = (a.inst_prod_id)::text)e('01/01/2005'::text, 'DD/MM/YYYY'::text)) OR (nap_rishum_date > to_date('01/01/2005'::text, 'DD/MM/YYYY'::text)) OR (row_added_dttm >ext) = 'SHARE'::text) AND (trunc(nap_act_start_dt, 'DDD'::text) < trunc(nap_act_end_dt, 'DDD'::text)) AND ((row_lastmant_dttm > to_dat-> Index Scan using tc_finterms_ix1 on tc_finterms c (cost=0.56..8.60 rows=1 width=0)to_date('01/01/2005'::text, 'DD/MM/YYYY'::text))) AND ((NOT (SubPlan 2)) OR (NOT (SubPlan 3))) AND (SubPlan 1)) SubPlan 2b.nap_makat_cd) AND (deb_cred_line_no = (b.nap_debt_line)::double precision))Index Cond: (((area)::text = (a.nap_area2)::text) AND ((phone)::text = (a.nap_phone_num)::text)) Filter: (((term_type)::text = '2'::text) AND (payment_end_date > clock_timestamp()) AND ((term_code)::numeric =Filter: (((service_code)::numeric = b.nap_makat_cd) AND ((discount_code)::numeric = b.nap_discount_num) AND ((eSubPlan 3 -> Index Scan using ip_service_discounts_pkey on ip_service_discounts (cost=0.56..10.78 rows=1 width=0) Index Cond: (service_uid = (a.inst_prod_id)::integer)Recheck Cond: (((pricing_method_code)::text = ANY ('{2,4}'::text[])) AND (start_date <= b.nap_rishum_date))nd_date IS NULL) OR (COALESCE((discount_end_date)::timestamp with time zone, (clock_timestamp() + '1 day'::interval)) > clock_timestam p()))) SubPlan 1 -> Bitmap Heap Scan on tv_finterm (cost=2290.83..17301.61 rows=26907 width=4)-> Bitmap Index Scan on index_test_mariel (cost=0.00..2284.11 rows=81126 width=0)Filter: ((COALESCE(end_date, (to_date('01/01/2095'::text, 'DD/MM/YYYY'::text))::timestamp without time zone) >=b.nap_rishum_date) AND (COALESCE(end_date, (to_date('01/01/2095'::text, 'DD/MM/YYYY'::text))::timestamp without time zone) <> start_d ate))(25 rows)Index Cond: (((pricing_method_code)::text = ANY ('{2,4}'::text[])) AND (start_date <= b.nap_rishum_date)) -> Materialize (cost=0.00..1407.38 rows=43933 width=4) -> Seq Scan on tv_discounts_details k (cost=0.00..1187.71 rows=43933 width=4)Filter: ((percent IS NOT NULL) AND (start_month = 1))
I run vacuum analyze database
before running the query. Some info about the tables :
ps_rf_inst_prod - 32Gnap_ip_discount-1G
tv_finterm - 100Mtc_finterms - 6GTV_FINTERM - 1G
This query is part of an app that I migrated from oracle to postgresql. I dont want to change the query much, looking for a way to change the plan to make it faster.. I have indexes on ps_rf_inst_prod, when I delete the pipelines in :
AND a.setid || ''= 'SHARE'
AND b.setid || ''= 'SHARE'
the plan is changing and it uses indexes on ps_rf_inst_prod but it costs more and the performance are worse.
Please , HELP...
On 08/22/2017 04:23 PM, Mariel Cherkassky wrote: > Hi, I have a query that I run in my postgresql 9.6 database and it runs > for more than 24 hours and doesnt finish. > > My select consist from few joins : > I'm sorry, but the query and plans are completely broken (wrapped in funny ways, missing important bits. ...) I don't know what client you use or how that happened, but I recommend attaching the information as text files instead of pasting it into the message directly. Regarding the query analysis - we can't really help you much without seeing an explain analyze (that is, not just the plan and estimates, but actual performance and row counts). That usually identifies the query operations (scans, join, ...) causing issues. Of course, if the query is already running for 24h and you don't know how much longer it will take to complete, running EXPLAIN ANALYZE on it is not very practical. The best thing you can do is break the query into smaller parts and debugging that - start with one table, and then add tables/conditions until the performance gets bad. Hopefully the explain analyze on that will complete in reasonable time. Of course, you haven't told us anything about what's happening on the machine. It is reading a lot of data from the disks? Random or sequential? Is it writing a lot of data into temporary files? Is it consuming a lot of CPU? And so on. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2017-08-22 16:23, Mariel Cherkassky wrote: > > SELECT a.inst_prod_id, > product_id, > nap_area2, > nap_phone_num, > nap_product_id, > b.nap_discount_num, > b.nap_makat_cd, > nap_act_start_dt, > b.nap_debt_line, > nap_act_end_dt, > b.row_added_dttm > b.row_lastmant_dttm, > FROM ps_rf_inst_prod a, > AND a.setid || ''= 'SHARE' > nap_ip_discount b > WHERE nap_crm_status = 'C_04' > AND b.nap_makat_cd IN (SELECT > term_code AND b.setid || ''= > 'SHARE' > AND a.inst_prod_id = On my screen the order of the lines in the query seem to get messed up, I'm not sure if that's my email program or a copy/paste error. From what I can see, you are using subselects in an IN statement, which can be a problem if that has to be re-evaluated a lot. It's hard for me to say more because I can't tell what the actual query is at the moment. Regards, Vincent.