Re: FW: Re: [PERFORM] Query is running very slow......

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: FW: Re: [PERFORM] Query is running very slow......
Дата
Msg-id acba2082-c299-e712-4964-29c47f839a57@2ndquadrant.com
обсуждение исходный текст
Ответ на FW: Re: [PERFORM] Query is running very slow......  (Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com>)
Ответы Re: FW: Re: [PERFORM] Query is running very slow......  (Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com>)
Список pgsql-performance

On 5/25/17 2:26 PM, Dinesh Chandra 12108 wrote:
> Hi Tomas,
>
> Please find the below input for slow query.
>
> (a) something about the hardware it's running on
>   RAM-->64 GB, CPU->40core
>
> (b) amounts of data in the tables / databases
> Database size     :32GB
> -----------------
> Tables size
> -----------------
> Workflow.project        : 8194 byte
> workflow.tool_performance    :175 MB
> workflow.evidence_to_do    :580 MB
>
> (c) EXPLAIN or even better EXPLAIN ANALYZE of the query
>
> "GroupAggregate  (cost=16583736169.63..18157894828.18 rows=5920110 width=69)"
> "  ->  Sort  (cost=16583736169.63..16714893857.43 rows=52463075120 width=69)"
> "        Sort Key: tool_performance.project_id, project.project_name, tool_performance.step_id,
(date_trunc('day'::text,tool_performance.insert_time)), tool_performance.user_id" 
> "        ->  Nested Loop  (cost=2.42..787115179.07 rows=52463075120 width=69)"
> "              ->  Seq Scan on evidence_to_do  (cost=0.00..119443.95 rows=558296 width=0)"
> "                    Filter: (status_id = ANY ('{15100,15150,15200,15300,15400,15500}'::bigint[]))"
> "              ->  Materialize  (cost=2.42..49843.24 rows=93970 width=69)"
> "                    ->  Hash Join  (cost=2.42..49373.39 rows=93970 width=69)"
> "                          Hash Cond: (tool_performance.project_id = project.project_id)"
> "                          ->  Seq Scan on tool_performance  (cost=0.00..48078.88 rows=93970 width=39)"
> "                                Filter: ((insert_time > '2017-05-01 00:00:00+05:30'::timestamp with time zone) AND
(insert_time< '2017-05-02 00:00:00+05:30'::timestamp with time zone))" 
> "                          ->  Hash  (cost=1.63..1.63 rows=63 width=38)"
> "                                ->  Seq Scan on project  (cost=0.00..1.63 rows=63 width=38)"
>

Are you sure this is the same query? The query you posted includes there
two join conditions:

     evidence_to_do.project_id = tool_performance.project_id
     evidence_to_do.project_id = project.project_id

But the plan only seems to enforce the equality between 'project' and
'tool_performance'. So when joining the evidence_to_do, it performs a
cartesian product, producing ~52B rows (estimated). That can't be fast.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


В списке pgsql-performance по дате отправления:

Предыдущее
От: Dinesh Chandra 12108
Дата:
Сообщение: FW: Re: [PERFORM] Query is running very slow......
Следующее
От: Alexandru Lazarev
Дата:
Сообщение: [PERFORM] Multicolumn B-Tree index - order by on 1st column and IN lookup for 2nd