Обсуждение: [PERFORM] Query is very much slow

Поиск
Список
Период
Сортировка

[PERFORM] Query is very much slow

От
Daulat Ram
Дата:

Hi team,

 

We are getting very slow response of this query.

 

SELECT date_trunc('day', insert_time),workflow.project.project_name,

workflow.tool_performance.project_id,workflow.tool_performance.user_id,workflow.tool_performance.step_id,count(*),

round(sum(execution_time)/1000) as Sum_time_sec,round(((round(sum(execution_time)/1000))/60)/count(*),2) as Efficency_Min,round

(((round(sum(execution_time)/1000)))/count(*),2)

as Efficency_sec FROM workflow.project,workflow.tool_performance,workflow.evidence_to_do WHERE workflow.project

.project_id = workflow.tool_performance.project_id AND insert_time >'2017-05-01' AND insert_time <'2017-05-02' AND

workflow.evidence_to_do.status_id in (15100,15150,15200,15300,15400,15500) Group BY date_trunc('day', insert_time),workflow.project.project_name,

workflow.tool_performance.project_id,workflow.tool_performance.user_id,workflow.tool_performance.step_id ORDER BY

workflow.tool_performance.project_id,workflow.project.project_name,

workflow.tool_performance.step_id;

 

The following indexes are created on project & evidence_to_do table.

 

"CREATE INDEX project_id_idx ON workflow.project USING btree (project_id)"

"CREATE INDEX evidence_to_do_status_id_index ON workflow.evidence_to_do USING btree (status_id)"

 

 

Explain plan of the Query is:

 

"GroupAggregate  (cost=18675703613.60..20443753075.67 rows=6689718 width=69)"

"  ->  Sort  (cost=18675703613.60..18823015982.33 rows=58924947492 width=69)"

"        Sort Key: tool_performance.project_id, project.project_name, tool_performance.step_id, (date_trunc('day'::text, tool_performance.insert_time)), tool_performance.user_id"

"        ->  Nested Loop  (cost=2.42..884042104.67 rows=58924947492 width=69)"

"              ->  Seq Scan on evidence_to_do  (cost=0.00..118722.17 rows=554922 width=0)"

"                    Filter: (status_id = ANY ('{15100,15150,15200,15300,15400,15500}'::bigint[]))"

"              ->  Materialize  (cost=2.42..49435.58 rows=106186 width=69)"

"                    ->  Hash Join  (cost=2.42..48904.65 rows=106186 width=69)"

"                          Hash Cond: (tool_performance.project_id = project.project_id)"

"                          ->  Seq Scan on tool_performance  (cost=0.00..47442.18 rows=106186 width=39)"

"                                Filter: ((insert_time > '2017-05-01 00:00:00+05:30'::timestamp with time zone) AND (insert_time < '2017-05-02 00:00:00+05:30'::timestamp with time zone))"

"                          ->  Hash  (cost=1.63..1.63 rows=63 width=38)"

"                                ->  Seq Scan on project  (cost=0.00..1.63 rows=63 width=38)"

 

 

We have 64 GB of RAM &

 

CPU(s):                40

Thread(s) per core:    2

Core(s) per socket:    10

Socket(s):             2



PostgreSQL.conf parameter:

shared_buffers =16GB

work_mem =32MB

 

Would you please help how we can tune this query at database & code level.

 

Regards Daulat

Re: [PERFORM] Query is very much slow

От
Andreas Kretschmer
Дата:



Am 25.05.2017 um 07:13 schrieb Daulat Ram:

Hi team,

 

We are getting very slow response of this query.

 

SELECT date_trunc('day', insert_time),workflow.project.project_name,

workflow.tool_performance.project_id,workflow.tool_performance.user_id,workflow.tool_performance.step_id,count(*),

round(sum(execution_time)/1000) as Sum_time_sec,round(((round(sum(execution_time)/1000))/60)/count(*),2) as Efficency_Min,round

(((round(sum(execution_time)/1000)))/count(*),2)

as Efficency_sec FROM workflow.project,workflow.tool_performance,workflow.evidence_to_do WHERE workflow.project

.project_id = workflow.tool_performance.project_id AND insert_time >'2017-05-01' AND insert_time <'2017-05-02' AND

workflow.evidence_to_do.status_id in (15100,15150,15200,15300,15400,15500) Group BY date_trunc('day', insert_time),workflow.project.project_name,

workflow.tool_performance.project_id,workflow.tool_performance.user_id,workflow.tool_performance.step_id ORDER BY

workflow.tool_performance.project_id,workflow.project.project_name,

workflow.tool_performance.step_id;

 

The following indexes are created on project & evidence_to_do table.

 

"CREATE INDEX project_id_idx ON workflow.project USING btree (project_id)"

"CREATE INDEX evidence_to_do_status_id_index ON workflow.evidence_to_do USING btree (status_id)"

 

 

Explain plan of the Query is:

 

"GroupAggregate  (cost=18675703613.60..20443753075.67 rows=6689718 width=69)"

"  ->  Sort  (cost=18675703613.60..18823015982.33 rows=58924947492 width=69)"

"        Sort Key: tool_performance.project_id, project.project_name, tool_performance.step_id, (date_trunc('day'::text, tool_performance.insert_time)), tool_performance.user_id"

"        ->  Nested Loop  (cost=2.42..884042104.67 rows=58924947492 width=69)"

"              ->  Seq Scan on evidence_to_do  (cost=0.00..118722.17 rows=554922 width=0)"

"                    Filter: (status_id = ANY ('{15100,15150,15200,15300,15400,15500}'::bigint[]))"

"              ->  Materialize  (cost=2.42..49435.58 rows=106186 width=69)"

"                    ->  Hash Join  (cost=2.42..48904.65 rows=106186 width=69)"

"                          Hash Cond: (tool_performance.project_id = project.project_id)"

"                          ->  Seq Scan on tool_performance  (cost=0.00..47442.18 rows=106186 width=39)"

"                                Filter: ((insert_time > '2017-05-01 00:00:00+05:30'::timestamp with time zone) AND (insert_time < '2017-05-02 00:00:00+05:30'::timestamp with time zone))"

"                          ->  Hash  (cost=1.63..1.63 rows=63 width=38)"

"                                ->  Seq Scan on project  (cost=0.00..1.63 rows=63 width=38)"

 


you will get a so-called cross join with 106186 rows from tool_performance multiplied with 554922
rows from evidence_to_do, resulting in 58.924.947.492 rows in total. Is that really what you want?

I think, there is a missing join-condition. It would be better to use expliciet JOIN-Syntax to prevent such errors.


Regards, Andreas
-- 
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com

Re: [PERFORM] Query is very much slow

От
Andreas Kretschmer
Дата:
Hi,

there is a similar question from Dinesh.Chandra@cyient.com, but it is
not exact the same query.
[PERFORM] Query is running very slow......, some hours ago.


Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com