Обсуждение: Help me in reducing the CPU cost for the high cost query below, asit is hitting production seriously!!

Поиск
Список
Период
Сортировка
Hi all,

Hope my mail finds you in good time. I had a problem with a query which is
hitting the production seriously.
The below is the sub part of the query for which I cannot reduce the CPU
cost. 

Please check and verify whether I'm doing wrong or whether that type index
type suits it or not. 

Kindly help me resolve this issue.

*Query*:

explain select sum(CASE
                             WHEN MOD(cast(effort_hours as decimal),1) =
0.45 THEN
                             cast(effort_hours as int)+0.75
                             ELSE
                                 CASE
                                 WHEN MOD(cast(effort_hours as decimal),1) =
0.15 THEN
                                 cast(effort_hours as int) + 0.25
                                
                              ELSE
                                 CASE
                                 WHEN MOD(cast(effort_hours as decimal),1) =
0.30 THEN
                                 cast(effort_hours as int) + 0.5
                                
                              ELSE
                                 CASE
                                 WHEN MOD(cast(effort_hours as decimal),1) =
0 THEN
                                 cast(effort_hours as int) 
                                 end
                                 END
                                 END
                             END) from tms_timesheet_details, tms_wsr_header
header  where wsr_header_id=header.id and work_order_no != 'CORPORATE';

                                         QUERY PLAN
---------------------------------------------------------------------------------------------
 Aggregate  (cost=9868.91..9868.92 rows=1 width=8)
   ->  Hash Join  (cost=608.27..5647.67 rows=70354 width=8)
         Hash Cond: (tms_timesheet_details.wsr_header_id = header.id)
         ->  Seq Scan on tms_timesheet_details  (cost=0.00..3431.14
rows=72378 width=12)
               Filter: ((work_order_no)::text <> 'CORPORATE'::text)
         ->  Hash  (cost=399.23..399.23 rows=16723 width=4)
               ->  Seq Scan on tms_wsr_header header  (cost=0.00..399.23
rows=16723 width=4)
(7 rows)


The count of number of rows in the tables used are:

1) tms_timesheet_details:

amp_test=# select count(*) from tms_timesheet_details;
 count
--------
 110411
(1 row)

2) tms_wsr_header:

amp_test=# select count(*) from tms_wsr_header;
 count
-------
 16723
(1 row)


The details of the tables and the columns used are as below:

1) tms_timesheet_details:

amp_test=# \d tms_timesheet_details
                                          Table
"public.tms_timesheet_details"
       Column        |            Type             |                            
Modifiers

---------------------+-----------------------------+--------------------------------------------------------------------
 id                  | integer                     | not null default
nextval('tms_timesheet_details_id_seq'::regclass)
 status              | character varying           |
 create_uid          | integer                     |
 effort_hours        | double precision            |
 work_order_no       | character varying           |
 res_employee_id     | character varying           |
 wsr_header_id       | integer                     |
 remarks             | character varying           |
 write_date          | timestamp without time zone |
 timesheet_header_id | integer                     |
 date                | date                        |
 create_date         | timestamp without time zone |
 write_uid           | integer                     |
 release_no          | character varying           |
 project_id          | character varying           |
 loc_name            | character varying           |
 user_id             | integer                     |
 ao_emp_id           | character varying           |
Indexes:
    "tms_timesheet_details_pkey" PRIMARY KEY, btree (id)
    "tms_timesheet_details_uniq_res_employee_id_efforts" UNIQUE, btree
(res_employee_id, work_order_no, release_no, date, project_id)
    "timesheet_detail_inx" btree (wsr_header_id, timesheet_header_id)
    "ts_detail_date_idx" btree (date)
    "ts_detail_hdr_id_idx" btree (timesheet_header_id)
    "ts_detail_release_no_idx" btree (release_no)
    "work_order_no_idx" btree (work_order_no)
Foreign-key constraints:
    "tms_timesheet_details_create_uid_fkey" FOREIGN KEY (create_uid)
REFERENCES res_users(id) ON DELETE SET NULL
    "tms_timesheet_details_timesheet_header_id_fkey" FOREIGN KEY
(timesheet_header_id) REFERENCES tms_timesheet_header(id) ON DELETE SET NULL
    "tms_timesheet_details_user_id_fkey" FOREIGN KEY (user_id) REFERENCES
res_users(id) ON DELETE SET NULL
    "tms_timesheet_details_write_uid_fkey" FOREIGN KEY (write_uid)
REFERENCES res_users(id) ON DELETE SET NULL
    "tms_timesheet_details_wsr_header_id_fkey" FOREIGN KEY (wsr_header_id)
REFERENCES tms_wsr_header(id) ON DELETE SET NULL


2) tms_wsr_header:

amp_test=# \d tms_wsr_header
                                          Table "public.tms_wsr_header"
       Column        |            Type             |                         
Modifiers
---------------------+-----------------------------+-------------------------------------------------------------
 id                  | integer                     | not null default
nextval('tms_wsr_header_id_seq'::regclass)
 create_uid          | integer                     |
 status_id           | integer                     |
 ao_emp_name         | character varying           |
 ao_emp_id           | character varying           |
 res_employee_id     | character varying           |
 comments            | text                        |
 write_uid           | integer                     |
 write_date          | timestamp without time zone |
 create_date         | timestamp without time zone |
 timesheet_period_id | integer                     |
 user_id             | integer                     |
Indexes:
    "tms_wsr_header_pkey" PRIMARY KEY, btree (id)
    "res_employee_idx" btree (res_employee_id)
    "tmesheet_perd_idx" btree (timesheet_period_id)
Foreign-key constraints:
    "tms_wsr_header_create_uid_fkey" FOREIGN KEY (create_uid) REFERENCES
res_users(id) ON DELETE SET NULL
    "tms_wsr_header_status_id_fkey" FOREIGN KEY (status_id) REFERENCES
tms_timesheet_status(id) ON DELETE SET NULL
    "tms_wsr_header_timesheet_period_id_fkey" FOREIGN KEY
(timesheet_period_id) REFERENCES tms_timesheet_period(id) ON DELETE SET NULL
    "tms_wsr_header_user_id_fkey" FOREIGN KEY (user_id) REFERENCES
res_users(id) ON DELETE SET NULL
    "tms_wsr_header_write_uid_fkey" FOREIGN KEY (write_uid) REFERENCES
res_users(id) ON DELETE SET NULL
Referenced by:
    TABLE "tms_release_allocation_comments" CONSTRAINT
"tms_release_allocation_comments_wsr_header_id_fkey" FOREIGN KEY
(wsr_header_id) REFERENCES tms_wsr_header(id) ON DELETE SET NULL
    TABLE "tms_timesheet_details" CONSTRAINT
"tms_timesheet_details_wsr_header_id_fkey" FOREIGN KEY (wsr_header_id)
REFERENCES tms_wsr_header(id) ON DELETE SET NULL
    TABLE "tms_workflow_history" CONSTRAINT
"tms_workflow_history_wsr_id_fkey" FOREIGN KEY (wsr_id) REFERENCES
tms_wsr_header(id) ON DELETE SET NULL


Hope the above information is sufficient. Kindly show me a way to reduce the
cost of this query ASAP.

Thanks in advance.

Regards,
Pavan




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


pavan95 wrote
> Hi all,
> 
> Hope my mail finds you in good time. I had a problem with a query which is
> hitting the production seriously.
> The below is the sub part of the query for which I cannot reduce the CPU
> cost. 
> 
> Please check and verify whether I'm doing wrong or whether that type index
> type suits it or not. 
> 
> Kindly help me resolve this issue.
> 
> *Query*:
> 
> explain select sum(CASE
>                              WHEN MOD(cast(effort_hours as decimal),1) =
> 0.45 THEN
>                              cast(effort_hours as int)+0.75
>                              ELSE
>                                  CASE
>                                  WHEN MOD(cast(effort_hours as decimal),1)
> =
> 0.15 THEN
>                                  cast(effort_hours as int) + 0.25
>                                 
>                               ELSE
>                                  CASE
>                                  WHEN MOD(cast(effort_hours as decimal),1)
> =
> 0.30 THEN
>                                  cast(effort_hours as int) + 0.5
>                                 
>                               ELSE
>                                  CASE
>                                  WHEN MOD(cast(effort_hours as decimal),1)
> =
> 0 THEN
>                                  cast(effort_hours as int) 
>                                  end
>                                  END
>                                  END
>                              END) from tms_timesheet_details,
> tms_wsr_header
> header  where wsr_header_id=header.id and work_order_no != 'CORPORATE';
> 
> 
> 
> --
> Sent from:
> http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html

To start with you can try re-writing this so that it only does the mod cast
once. e.g:
sum ( 
CASE MOD(cast(effort_hours as decimal),1)
    WHEN 0.45 THEN cast(effort_hours as int)+0.75
    WHEN 0.15 THEN cast(effort_hours as int)+0.25
    WHEN 0.30 THEN cast(effort_hours as int)+0.5
    WHEN 0 THEN cast(effort_hours as int)
END
)




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi mlunon,

A great thanks for your timely response. And yes it worked when I rewritten
the query.

The query got enhanced with approximate of 1000 planner seeks. You can find
it from the explain plan below:

amp_test=# explain select
sum (
CASE MOD(cast(effort_hours as decimal),1)
        WHEN 0.45 THEN cast(effort_hours as int)+0.75
        WHEN 0.15 THEN cast(effort_hours as int)+0.25
        WHEN 0.30 THEN cast(effort_hours as int)+0.5
        WHEN 0 THEN cast(effort_hours as int)
END
)
from tms_timesheet_details detail , tms_wsr_header header  where
wsr_header_id=header.id and work_order_no != 'CORPORATE';
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Aggregate  (cost=8813.60..8813.61 rows=1 width=8)
   ->  Hash Join  (cost=608.27..5647.67 rows=70354 width=8)
         Hash Cond: (detail.wsr_header_id = header.id)
         ->  Seq Scan on tms_timesheet_details detail  (cost=0.00..3431.14
rows=72378 width=12)
               Filter: ((work_order_no)::text <> 'CORPORATE'::text)
         ->  Hash  (cost=399.23..399.23 rows=16723 width=4)
               ->  Seq Scan on tms_wsr_header header  (cost=0.00..399.23
rows=16723 width=4)
(7 rows)


But is this the optimum, can we reduce the cost more at least to around 5000
planner seeks. As it is only a subpart of the query which is called multiple
number of times in the main query.

And to send the main query along with tables description and explain plan it
will be a vast message so send you a sub-part.

Please help me to tune it more. Thanks in Advance.

Regards,
Pavan



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi. Basically you want to convert a base 60 number to a decimal. So you don't need conditionals. See if this works for you:

SELECT floor(effort_hours) + ( (effort_hours - floor(effort_hours)) / 0.6 )
from tms_timesheet_details detail , tms_wsr_header header  where
wsr_header_id=header.id and work_order_no != 'CORPORATE';

Regards,
Abbas

On Mon, May 21, 2018 at 3:43 PM, pavan95 <pavan.postgresdba@gmail.com> wrote:
Hi mlunon, A great thanks for your timely response. And yes it worked when I rewritten the query. The query got enhanced with approximate of 1000 planner seeks. You can find it from the explain plan below: amp_test=# explain select sum ( CASE MOD(cast(effort_hours as decimal),1) WHEN 0.45 THEN cast(effort_hours as int)+0.75 WHEN 0.15 THEN cast(effort_hours as int)+0.25 WHEN 0.30 THEN cast(effort_hours as int)+0.5 WHEN 0 THEN cast(effort_hours as int) END ) from tms_timesheet_details detail , tms_wsr_header header where wsr_header_id=header.id and work_order_no != 'CORPORATE'; QUERY PLAN ------------------------------------------------------------------------------------------------ Aggregate (cost=8813.60..8813.61 rows=1 width=8) -> Hash Join (cost=608.27..5647.67 rows=70354 width=8) Hash Cond: (detail.wsr_header_id = header.id) -> Seq Scan on tms_timesheet_details detail (cost=0.00..3431.14 rows=72378 width=12) Filter: ((work_order_no)::text <> 'CORPORATE'::text) -> Hash (cost=399.23..399.23 rows=16723 width=4) -> Seq Scan on tms_wsr_header header (cost=0.00..399.23 rows=16723 width=4) (7 rows) But is this the optimum, can we reduce the cost more at least to around 5000 planner seeks. As it is only a subpart of the query which is called multiple number of times in the main query. And to send the main query along with tables description and explain plan it will be a vast message so send you a sub-part. Please help me to tune it more. Thanks in Advance. Regards, Pavan -- Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi Abbas,

Thanks for your valuable suggestions. To my surprise I got the same output
as what I have executed before. 

But unfortunately I'm unable to understand the logic of the code, in
specific what is base 60 number? The used data type for "effort_hours"
column is 'double precision'. 

Kindly help me in understanding the logic. Thanks in advance.

Regards,
Pavan



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


On Mon, May 21, 2018 at 6:39 AM, pavan95 <pavan.postgresdba@gmail.com> wrote:
Hi Abbas,

Thanks for your valuable suggestions. To my surprise I got the same output
as what I have executed before.

But unfortunately I'm unable to understand the logic of the code, in
specific what is base 60 number? The used data type for "effort_hours"
column is 'double precision'.

Kindly help me in understanding the logic. Thanks in advance.

This is not converting a "base 60 number to base 10" - this is computing a percentage, which is indeed what you want to do.

Since 0.60 is the maximum value of the fraction in this encoding scheme dividing the actual value by 0.60 tells you what percentage (between 0 and 1) your value is of the maximum.  But you have to get rid of the hours component first, and floor truncates the minutes leaving just the hours which you can subtract out from the original leaving only the minutes.
David J.​

P.S. ​You could consider adding a new column to the table, along with a trigger, and compute and store the derived value upon insert.
Sure thing. Base 60 or Sexagesimal is the numerical system used for measuring time (1 hour equals to 60 minutes and so on). But this case is even simpler, so without going into much detail about bases, you're mapping between two sets of numbers:

0 -> 0
.15 -> .25
.30 -> .50
.45 -> .75

From working with clocks, we know that 15 minutes is .25 hours, 30 minutes is .5 hours and so on. So you only need to divide the fractional part ( effort_hours - floor(effort_hours) ) by .6 to get what you want.

For example, let's say effort_hours = 1.15; then floor(1.15) is 1; so:

floor(1.15) + ( (1.15 - floor(1.15)) / 0.6 ) = 1 + ( (1.15 - 1) / 0.6 ) = 1 + ( 0.15 / 0.60 ) = 1.25

Hope it helps. Feel free to ask a question if it's still unclear. :)


On Mon, May 21, 2018 at 6:09 PM, pavan95 <pavan.postgresdba@gmail.com> wrote:
Hi Abbas, Thanks for your valuable suggestions. To my surprise I got the same output as what I have executed before. But unfortunately I'm unable to understand the logic of the code, in specific what is base 60 number? The used data type for "effort_hours" column is 'double precision'. Kindly help me in understanding the logic. Thanks in advance. Regards, Pavan -- Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi abbas,

Thank you so much.  I've got this query from my development team asking to
improve its performance. 

Now I got pretty much clear idea of it. And it will be the final extent to
which we can tune the performance right?

If there is still a way give me some tips to enhance the query performance. 

But kudos for your "floor" function. After a long struggle with the indexes,
joins and the hints I came to know that there is also a way to tune the
query performance by rewriting the query.

Thanks in advance.

Regards,
Pavan



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi David,

Thank you so much for your valuable inputs.  Is there anything that I need
to look from Indexes perspective or Join order ??

Kindly let me know if  it can be tuned further.

Thank you very much.  


Regards,
Pavan



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


On Mon, May 21, 2018 at 7:43 AM, pavan95 <pavan.postgresdba@gmail.com> wrote:
Hi David,

Thank you so much for your valuable inputs.  Is there anything that I need
to look from Indexes perspective or Join order ??

Kindly let me know if  it can be tuned further.

What I've got to give here is what you've received.

David J.
pavan95 wrote
> *Query*:
> 
> explain select ... from tms_timesheet_details, tms_wsr_header header  
> where wsr_header_id=header.id and work_order_no != 'CORPORATE';
> 
>                                          QUERY PLAN
> ---------------------------------------------------------------------------------------------
>  Aggregate  (cost=9868.91..9868.92 rows=1 width=8)
>    ->  Hash Join  (cost=608.27..5647.67 rows=70354 width=8)
>          Hash Cond: (tms_timesheet_details.wsr_header_id = header.id)
>          ->  Seq Scan on tms_timesheet_details  (cost=0.00..3431.14
> rows=72378 width=12)
>                Filter: ((work_order_no)::text <> 'CORPORATE'::text)
>          ->  Hash  (cost=399.23..399.23 rows=16723 width=4)
>                ->  Seq Scan on tms_wsr_header header  (cost=0.00..399.23
> rows=16723 width=4)
> (7 rows)
> 
> 
> --
> Sent from:
> http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html

Why is the table tms_wsr_header in the from clause as it is not used in the
select columns? A simple "wsr_header_id is not null" would do the same as
this is a foreign key into the tms_wsr_header table. An index with on
tms_timesheet_details.id "where wsr_header_id is not null" might then speed
the query up if there were significant numbers of rows with a null
wsr_header_id.
Cheers
Matthew



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi all, 
Thank you so much for your valuable responses.Tried every aspect which you
have said for my sub-query.  
I hoped a better decrease in cost for my main query. But yes it decreased
but not to a great extent.
What I felt is to provide the main query and the associated table
definitions in the query. Please help me to tune the following big query. 
select res.id id,
                          row_number() OVER () as sno,
                           res.header_id,
                           res.emp_id,
                           res.alias alias,
                           res.name as name,
                           res.billed_hrs billed_hrs,
                           res.unbilled_hrs unbilled_hrs,
                           res.paid_time_off paid_time_off,
                           res.unpaid_leave unpaid_leave,
                           res.breavement_time breavement_time,
                           res.leave leave,
                           res.state,
                           count(*) OVER() AS full_count,
                           res.header_emp_id,
                           res.header_status
                             from (
            select 
                history.id as id,
                0 as header_id,
                '0' as emp_id,
                 row_number() OVER () as sno,
                user1.alias_id as alias,
                partner.name as name,
                ( select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
work_order_no != 'CORPORATE') billed_hrs,
                    
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
release_no = 'unbillable_time') as unbilled_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
release_no = 'paid_time_off') as paid_time_off,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
release_no = 'unpaid_leave') as unpaid_leave,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
release_no = 'bereavement_time') as breavement_time,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and date
>='2018-04-16' and date <='2018-04-30' and release_no in
('sick_leave','casual_leave','privilege_leave','optional_holiday') ) as
leave,
                (case when tl_status.state = '' then 'Waiting for approval'
else tl_status.state end) as state,
                header.res_employee_id as header_emp_id,
                status.name as header_status     
                from tms_workflow_history history, 
                    res_users users,
                    res_users user1,
                    res_partner partner,
                    tms_timesheet_status status,
                    tms_timesheet_header header
                    left join tms_workflow_history tl_status on
tl_status.timesheet_id=header.id
                                                             and
tl_status.active=True
                                                             and
tl_status.group_id=13
                    
                where 
                     history.timesheet_id=header.id
                    and header.res_employee_id=user1.res_employee_id
                    and  status.id=header.status_id
                    and history.user_id=users.id
                    and user1.partner_id=partner.id
                    and header.timesheet_period_id = 127
                     and (history.state = 'Approved' )
                    and history.current_activity='N'
                    and history.is_final_approver=True 
                    and history.active = True
   union 
            select 
                0 as id,
                header.id as header_id,
                '0' as emp_id,
                 0 as sno,
                users.alias_id as alias,
                partner.name as name,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where work_order_no != 'CORPORATE' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) billed_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'unbillable_time' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as unbilled_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'paid_time_off' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as paid_time_off,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'unpaid_leave' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as unpaid_leave,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'bereavement_time' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as breavement_time,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where res_employee_id=users.res_employee_id
and  date >='2018-04-16' and date <='2018-04-30' and release_no in
('sick_leave','casual_leave','privilege_leave','optional_holiday') ) as
leave,
                'Not Submitted' state,
                header.res_employee_id as header_emp_id,
                'Not Submitted' as header_status     
                from res_users users,
                    res_partner partner,
                    tms_timesheet_status status,
                    tms_timesheet_header header         
                where 
                    header.res_employee_id=users.res_employee_id
                    and  status.id=header.status_id
                    and users.partner_id=partner.id
                    and status.name='Draft'
                    and header.timesheet_period_id=127
                    and header.res_employee_id in (some ids)         
   union    
            select
                0 as id,
                0 as header_id,
                users.res_employee_id as emp_id,
                 0 as sno,
                users.alias_id as alias,
                partner.name as name,
                0 as billed_hrs,
                0 as unbilled_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'paid_time_off' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as paid_time_off,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'unpaid_leave' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as unpaid_leave,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'bereavement_time' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as breavement_time,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where res_employee_id=users.res_employee_id
and date >='2018-04-16' and date <='2018-04-30' and release_no in
('sick_leave','casual_leave','privilege_leave','optional_holiday') ) as
leave,
                'Not Submitted' state,
                users.res_employee_id as header_emp_id,
                'Not Submitted' as header_status
                
            from res_users users,
                res_partner partner
            
            where users.res_employee_id not in (select res_employee_id
                                                from 
                                                    tms_timesheet_header
                                                where 
                                                    timesheet_period_id=127
                                                    and res_employee_id in
('A1','B1','C2323',--some 2000 id's))
                and users.partner_id=partner.id
                 and users.res_employee_id is not null
                and users.res_employee_id in ('A1','B1','C2323',--some 2000
id's)
         order by name ) res  order by name limit 10 offset 0

Note: As it is a big query posted only a meaningful part. There 5 unions of
similar type and same are the tables involved in the entire query.

Sample query plan: 
Limit  (cost=92129.35..92129.63 rows=10 width=248)
   ->  WindowAgg  (cost=92129.35..92138.46 rows=331 width=248)
         ->  Subquery Scan on res  (cost=92129.35..92133.49 rows=331
width=248)
               ->  Sort  (cost=92129.35..92130.18 rows=331 width=33)
                     Sort Key: partner.name
                     ->  HashAggregate  (cost=92112.19..92115.50 rows=331
width=33)
                           ->*  Append  (cost=340.02..92099.78 rows=331
width=33)*
                                 ->  WindowAgg  (cost=340.02..1591.76 rows=1
width=54)
                            

(396 rows)
Problem started with append in the plan.

Please help me tune this query!!!!

Thanks in Advance.

Regards,
Pavan




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi all, 
Thank you so much for your valuable responses.Tried every aspect which you
have said for my sub-query.  
I hoped a better decrease in cost for my main query. But yes it decreased
but not to a great extent.
What I felt is to provide the main query and the associated table
definitions in the query. Please help me to tune the following big query. 
select res.id id,
                          row_number() OVER () as sno,
                           res.header_id,
                           res.emp_id,
                           res.alias alias,
                           res.name as name,
                           res.billed_hrs billed_hrs,
                           res.unbilled_hrs unbilled_hrs,
                           res.paid_time_off paid_time_off,
                           res.unpaid_leave unpaid_leave,
                           res.breavement_time breavement_time,
                           res.leave leave,
                           res.state,
                           count(*) OVER() AS full_count,
                           res.header_emp_id,
                           res.header_status
                             from (
            select 
                history.id as id,
                0 as header_id,
                '0' as emp_id,
                 row_number() OVER () as sno,
                user1.alias_id as alias,
                partner.name as name,
                ( select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
work_order_no != 'CORPORATE') billed_hrs,
                    
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
release_no = 'unbillable_time') as unbilled_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
release_no = 'paid_time_off') as paid_time_off,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
release_no = 'unpaid_leave') as unpaid_leave,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and
release_no = 'bereavement_time') as breavement_time,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where timesheet_header_id=header.id and date
>='2018-04-16' and date <='2018-04-30' and release_no in
('sick_leave','casual_leave','privilege_leave','optional_holiday') ) as
leave,
                (case when tl_status.state = '' then 'Waiting for approval'
else tl_status.state end) as state,
                header.res_employee_id as header_emp_id,
                status.name as header_status     
                from tms_workflow_history history, 
                    res_users users,
                    res_users user1,
                    res_partner partner,
                    tms_timesheet_status status,
                    tms_timesheet_header header
                    left join tms_workflow_history tl_status on
tl_status.timesheet_id=header.id
                                                             and
tl_status.active=True
                                                             and
tl_status.group_id=13
                    
                where 
                     history.timesheet_id=header.id
                    and header.res_employee_id=user1.res_employee_id
                    and  status.id=header.status_id
                    and history.user_id=users.id
                    and user1.partner_id=partner.id
                    and header.timesheet_period_id = 127
                     and (history.state = 'Approved' )
                    and history.current_activity='N'
                    and history.is_final_approver=True 
                    and history.active = True
   union 
            select 
                0 as id,
                header.id as header_id,
                '0' as emp_id,
                 0 as sno,
                users.alias_id as alias,
                partner.name as name,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where work_order_no != 'CORPORATE' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) billed_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'unbillable_time' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as unbilled_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'paid_time_off' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as paid_time_off,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'unpaid_leave' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as unpaid_leave,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'bereavement_time' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as breavement_time,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where res_employee_id=users.res_employee_id
and  date >='2018-04-16' and date <='2018-04-30' and release_no in
('sick_leave','casual_leave','privilege_leave','optional_holiday') ) as
leave,
                'Not Submitted' state,
                header.res_employee_id as header_emp_id,
                'Not Submitted' as header_status     
                from res_users users,
                    res_partner partner,
                    tms_timesheet_status status,
                    tms_timesheet_header header         
                where 
                    header.res_employee_id=users.res_employee_id
                    and  status.id=header.status_id
                    and users.partner_id=partner.id
                    and status.name='Draft'
                    and header.timesheet_period_id=127
                    and header.res_employee_id in (some ids)         
   union    
            select
                0 as id,
                0 as header_id,
                users.res_employee_id as emp_id,
                 0 as sno,
                users.alias_id as alias,
                partner.name as name,
                0 as billed_hrs,
                0 as unbilled_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'paid_time_off' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as paid_time_off,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'unpaid_leave' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as unpaid_leave,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'bereavement_time' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as breavement_time,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where res_employee_id=users.res_employee_id
and date >='2018-04-16' and date <='2018-04-30' and release_no in
('sick_leave','casual_leave','privilege_leave','optional_holiday') ) as
leave,
                'Not Submitted' state,
                users.res_employee_id as header_emp_id,
                'Not Submitted' as header_status
                
            from res_users users,
                res_partner partner
            
            where users.res_employee_id not in (select res_employee_id
                                                from 
                                                    tms_timesheet_header
                                                where 
                                                    timesheet_period_id=127
                                                    and res_employee_id in
('A1','B1','C2323',--some 2000 id's))
                and users.partner_id=partner.id
                 and users.res_employee_id is not null
                and users.res_employee_id in ('A1','B1','C2323',--some 2000
id's)
         order by name ) res  order by name limit 10 offset 0

Note: As it is a big query posted only a meaningful part. There 5 unions of
similar type and same are the tables involved in the entire query.

Sample query plan: 
Limit  (cost=92129.35..92129.63 rows=10 width=248)
   ->  WindowAgg  (cost=92129.35..92138.46 rows=331 width=248)
         ->  Subquery Scan on res  (cost=92129.35..92133.49 rows=331
width=248)
               ->  Sort  (cost=92129.35..92130.18 rows=331 width=33)
                     Sort Key: partner.name
                     ->  HashAggregate  (cost=92112.19..92115.50 rows=331
width=33)
                           ->*  Append  (cost=340.02..92099.78 rows=331
width=33)*
                                 ->  WindowAgg  (cost=340.02..1591.76 rows=1
width=54)
                            

(396 rows)
Problem started with append in the plan.

Please help me tune this query!!!!

Thanks in Advance.

Regards,
Pavan




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


On Tue, May 22, 2018 at 03:32:59AM -0700, pavan95 wrote:
> Sample query plan: 
> Limit  (cost=92129.35..92129.63 rows=10 width=248)

Would you send the output of explain(analyze,buffers) for the whole query ?
And/or paste it into explain.depesz site and send a link.

Justin


Hi Justin,

Please find the output of explain(analyze,buffers) for the whole query in
the below link.

Link:  https://explain.depesz.com/s/dNkb <https://explain.depesz.com/s/dNkb>  

Thanks in Advance!


Regards,
Pavan



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


On Tue, May 22, 2018 at 03:51:44AM -0700, pavan95 wrote:
> Please find the output of explain(analyze,buffers) for the whole query in
> the below link.

> Seq Scan on res_users users (cost=750.92..1,836.69 rows=249 width=15) (actual time=3.962..17.544 rows=67 loops=1) 

Not sure but would you try creating an index on:
res_users.res_employee_id

> Seq Scan on res_users user1 (cost=0.00..58.03 rows=1,303 width=15) (actual time=0.002..0.002 rows=1 loops=1)

Also the planner's estimate for table:res_users is off by 1300x..so you should
probably vacuum analyze it then recheck.  I don't think we know what version
postgres you have, but last week's patch releases include a fix which may be
relevant (reltuples including dead tuples).

Also I don't know the definition of this table or its indices:
tms_workflow_history

..but it looks like an additional or modified index or maybe clustering the
table on existing index might help (active? is_final_approver?)
Or maybe this should be 3 separate indices rather than composite index?
Perhaps some of those could be BRIN indices, depending on postgres version

Justin


Thanks a lot!  I will have a look

On Tue, May 22, 2018, 11:53 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
On Tue, May 22, 2018 at 03:51:44AM -0700, pavan95 wrote:
> Please find the output of explain(analyze,buffers) for the whole query in
> the below link.

> Seq Scan on res_users users (cost=750.92..1,836.69 rows=249 width=15) (actual time=3.962..17.544 rows=67 loops=1)

Not sure but would you try creating an index on:
res_users.res_employee_id

> Seq Scan on res_users user1 (cost=0.00..58.03 rows=1,303 width=15) (actual time=0.002..0.002 rows=1 loops=1)

Also the planner's estimate for table:res_users is off by 1300x..so you should
probably vacuum analyze it then recheck.  I don't think we know what version
postgres you have, but last week's patch releases include a fix which may be
relevant (reltuples including dead tuples).

Also I don't know the definition of this table or its indices:
tms_workflow_history

..but it looks like an additional or modified index or maybe clustering the
table on existing index might help (active? is_final_approver?)
Or maybe this should be 3 separate indices rather than composite index?
Perhaps some of those could be BRIN indices, depending on postgres version

Justin
Hi all/Justin,

As said, created index on the res_users.res_employee_id and the below link
is the explain plan result.

Link:  https://explain.depesz.com/s/hoct <http://>  .

And the cost of Previous query is 92,129 and the cost of current modified
query after creating the above said index is 91,462. But good thing is we
can see a very small improvement..!. 

Please find the table definitions which are used in the query(which you
asked for tms_worflow_history).

1. tms_timesheet_details:

amp_test=# \d tms_timesheet_details
                                          Table
"public.tms_timesheet_details"
       Column        |            Type             |                            
Modifiers

---------------------+-----------------------------+--------------------------------------------------------------------
 id                  | integer                     | not null default
nextval('tms_timesheet_details_id_seq'::regclass)
 status              | character varying           |
 create_uid          | integer                     |
 effort_hours        | double precision            |
 work_order_no       | character varying           |
 res_employee_id     | character varying           |
 wsr_header_id       | integer                     |
 remarks             | character varying           |
 write_date          | timestamp without time zone |
 timesheet_header_id | integer                     |
 date                | date                        |
 create_date         | timestamp without time zone |
 write_uid           | integer                     |
 release_no          | character varying           |
 project_id          | character varying           |
 loc_name            | character varying           |
 user_id             | integer                     |
 ao_emp_id           | character varying           |
Indexes:
    "tms_timesheet_details_pkey" PRIMARY KEY, btree (id)
    "tms_timesheet_details_uniq_res_employee_id_efforts" UNIQUE, btree
(res_employee_id, work_order_no, release_no, date, project_id)
    "timesheet_detail_inx" btree (wsr_header_id, timesheet_header_id)
    "tms_timesheet_details_all_idx" btree (wsr_header_id, work_order_no,
release_no, date, effort_hours)
    "tms_timesheet_details_id_idx" btree (id) WHERE wsr_header_id IS NOT
NULL
    "ts_detail_date_idx" btree (date)
    "ts_detail_hdr_id_idx" btree (timesheet_header_id)
    "ts_detail_release_no_idx" btree (release_no)
    "work_order_no_idx" btree (work_order_no)


2. tms_workflow_history:

amp_test=# \d tms_workflow_history
                                         Table "public.tms_workflow_history"
      Column       |            Type             |                            
Modifiers
-------------------+-----------------------------+-------------------------------------------------------------------
 id                | integer                     | not null default
nextval('tms_workflow_history_id_seq'::regclass)
 create_uid        | integer                     |
 current_activity  | character varying           |
 user_id           | integer                     |
 sequence          | integer                     |
 is_final_approver | boolean                     |
 wsr_id            | integer                     |
 write_uid         | integer                     |
 timesheet_id      | integer                     |
 state             | character varying           |
 write_date        | timestamp without time zone |
 remarks           | character varying           |
 create_date       | timestamp without time zone |
 group_id          | integer                     |
 active            | boolean                     |
Indexes:
    "tms_workflow_history_pkey" PRIMARY KEY, btree (id)
    "curract_state_isfinal_app_idx" btree (current_activity, state,
is_final_approver)
    "timesheet_id_group_id_active_idx" btree (timesheet_id, group_id,
active)
    "tms_wkf_his_active_is_final_approveridx" btree (active,
is_final_approver)
    "tms_wkf_his_group_id_idx" btree (group_id)
    "tms_wkf_his_timesheet_id_idx" btree (timesheet_id)
    "tms_wkf_hist_current_activity_idx" btree (current_activity)
    "tms_wkf_hist_state_idx" btree (state)
    "wsr_id_idx" btree (wsr_id)

3. res_users:

                                         Table "public.res_users"
      Column       |            Type             |                      
Modifiers
-------------------+-----------------------------+--------------------------------------------------------
 id                | integer                     | not null default
nextval('res_users_id_seq'::regclass)
 active            | boolean                     | default true
 login             | character varying           | not null
 password          | character varying           |
 company_id        | integer                     | not null
 partner_id        | integer                     | not null
 create_date       | timestamp without time zone |
 share             | boolean                     |
 write_uid         | integer                     |
 create_uid        | integer                     |
 action_id         | integer                     |
 write_date        | timestamp without time zone |
 signature         | text                        |
 password_crypt    | character varying           |
 res_employee_name | character varying           |
 res_employee_id   | character varying           |
 role              | character varying           |
 skills            | character varying           |
 holiday_header_id | integer                     |
 alias_id          | character varying           |
 loc_name          | character varying           |
Indexes:
    "res_users_pkey" PRIMARY KEY, btree (id)
    "res_users_login_key" UNIQUE, btree (login)
    "res_users_res_employee_id_idx" btree (res_employee_id)

4. res_partner:

amp_test=# \d res_partner
                                            Table "public.res_partner"
         Column          |            Type             |                       
Modifiers
-------------------------+-----------------------------+----------------------------------------------------------
 id                      | integer                     | not null default
nextval('res_partner_id_seq'::regclass)
 name                    | character varying           |
 company_id              | integer                     |
 comment                 | text                        |
 website                 | character varying           |
 create_date             | timestamp without time zone |
 color                   | integer                     |
 active                  | boolean                     |
 street                  | character varying           |
 supplier                | boolean                     |
 city                    | character varying           |
 display_name            | character varying           |
 zip                     | character varying           |
 title                   | integer                     |
 country_id              | integer                     |
 commercial_company_name | character varying           |
 parent_id               | integer                     |
 company_name            | character varying           |
 employee                | boolean                     |
 ref                     | character varying           |
 email                   | character varying           |
 is_company              | boolean                     |
 function                | character varying           |
 lang                    | character varying           |
 fax                     | character varying           |
 street2                 | character varying           |
 barcode                 | character varying           |
 phone                   | character varying           |
 write_date              | timestamp without time zone |
 date                    | date                        |
 tz                      | character varying           |
 write_uid               | integer                     |
 customer                | boolean                     |
 create_uid              | integer                     |
 credit_limit            | double precision            |
 user_id                 | integer                     |
 mobile                  | character varying           |
 type                    | character varying           |
 partner_share           | boolean                     |
 vat                     | character varying           |
 state_id                | integer                     |
 commercial_partner_id   | integer                     |
Indexes:
    "res_partner_pkey" PRIMARY KEY, btree (id)
    "res_partner_commercial_partner_id_index" btree (commercial_partner_id)
    "res_partner_company_id_index" btree (company_id)
    "res_partner_date_index" btree (date)
    "res_partner_display_name_index" btree (display_name)
    "res_partner_name_index" btree (name)
    "res_partner_parent_id_index" btree (parent_id)
    "res_partner_ref_index" btree (ref)
Check constraints:
    "res_partner_check_name" CHECK (type::text = 'contact'::text AND name IS
NOT NULL OR type::text <> 'contact'::text)

5. tms_timesheet_status

amp_test=# \d tms_timesheet_status
                                      Table "public.tms_timesheet_status"
   Column    |            Type             |                            
Modifiers
-------------+-----------------------------+-------------------------------------------------------------------
 id          | integer                     | not null default
nextval('tms_timesheet_status_id_seq'::regclass)
 status      | character varying           |
 create_uid  | integer                     |
 description | text                        |
 sequence    | integer                     |
 write_uid   | integer                     |
 write_date  | timestamp without time zone |
 create_date | timestamp without time zone |
 name        | character varying           |
Indexes:
    "tms_timesheet_status_pkey" PRIMARY KEY, btree (id)

6. tms_timesheet_header:

                                          Table
"public.tms_timesheet_header"
       Column        |            Type             |                            
Modifiers

---------------------+-----------------------------+-------------------------------------------------------------------
 id                  | integer                     | not null default
nextval('tms_timesheet_header_id_seq'::regclass)
 create_uid          | integer                     |
 status_id           | integer                     |
 ao_emp_name         | character varying           |
 ao_emp_id           | character varying           |
 over                | double precision            |
 res_employee_id     | character varying           |
 regular_pay_hours   | double precision            |
 write_uid           | integer                     |
 comments            | text                        |
 write_date          | timestamp without time zone |
 under               | double precision            |
 create_date         | timestamp without time zone |
 timesheet_period_id | integer                     |
 user_id             | integer                     |
Indexes:
    "tms_timesheet_header_pkey" PRIMARY KEY, btree (id)
    "tms_timesheet_header_uniq_tms_emp_status" UNIQUE, btree
(res_employee_id, timesheet_period_id)


7. tms_timesheet_period:

                                         Table "public.tms_timesheet_period"
      Column       |            Type             |                            
Modifiers
-------------------+-----------------------------+-------------------------------------------------------------------
 id                | integer                     | not null default
nextval('tms_timesheet_period_id_seq'::regclass)
 status            | character varying           |
 create_uid        | integer                     |
 auto_approve_date | timestamp without time zone |
 name              | character varying           |
 end_date          | date                        |
 auto_submit_date  | timestamp without time zone |
 period_type       | character varying           |
 write_date        | timestamp without time zone |
 payhours          | integer                     |
 remarks           | text                        |
 create_date       | timestamp without time zone |
 write_uid         | integer                     |
 start_date        | date                        |
Indexes:
    "tms_timesheet_period_pkey" PRIMARY KEY, btree (id)

Note: Due to space constraint I'm unable to mention the foreign key
constraints and referenced by for the tables(thinking it is not required)

I have also observed that based on the composite indexes on the columns of
tms_workflow_history table the cost came to 91,462 orelse because of
individual indexes it remains unaltered from 92,129.  

I want to reduce the query cost. As observed in the plan a Subquery Scan is
taking around 45000 planner seeks at one place and 38000 planner seeks. Is
there any way to reduce this cost ? 

Or any other measures to be followed. My current postgresql version is 9.5.
Thanks in Advance!


Regards,
Pavan




--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi Pavan,
that's quite a big query. I can see that the generate_series function is
getting repeatedly called and the planner estimates for this sub query are
out by a factor of 66. You might try to re-write using a WITH query. I am
assuming that you have already analyzed all the tables and also added
appropriate indexes on join/query columns.
Regards
Matthew



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


Hi Matthew,

Yeah and you said right!. I have analyzed the entire database and also
created appropriate indexes for the columns used in WHERE/JOIN clauses.

Okay I will just provide the fourth union part of the query which you can
analyze easier(this not that big).

Please find the query part. And refer to the table definitions in my
previous posts.
Query:

select 
                0 as id,
                header.id as header_id,
                '0' as emp_id,
                 0 as sno,
                users.alias_id as alias,
                partner.name as name,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where work_order_no != 'CORPORATE' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) billed_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'unbillable_time' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as unbilled_hrs,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'paid_time_off' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as paid_time_off,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'unpaid_leave' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as unpaid_leave,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where release_no = 'bereavement_time' and
res_employee_id=users.res_employee_id and date in (select date::date from
generate_series((select start_date from tms_timesheet_period where
id=127),(select end_date from tms_timesheet_period where id=127),'1
day'::interval) date)) as breavement_time,
                (select SUM( floor(effort_hours) + ( (effort_hours -
floor(effort_hours)) / 0.6 ))
                    from tms_timesheet_details where res_employee_id=users.res_employee_id
and  date >='2018-04-16' and date <='2018-04-30' and release_no in
('sick_leave','casual_leave','privilege_leave','optional_holiday') ) as
leave,
                'Not Submitted' state,
                header.res_employee_id as header_emp_id,
                'Not Submitted' as header_status
                
                from res_users users,
                    res_partner partner,
                    tms_timesheet_status status,
                    tms_timesheet_header header
                    
                where 
                    header.res_employee_id=users.res_employee_id
                    and  status.id=header.status_id
                    and users.partner_id=partner.id
                    and status.name='Draft'
                    and header.timesheet_period_id=127
                    and header.res_employee_id in ('14145', '14147',
'ON-14148', '11331', '11332', '11333', 'ON-11334', '65432', '65416',
'54643', '23266', '4681', '56464', '64649', '89564', '98798', '13333',
'44466', '87852', '65464', '65464', '44655', '8201', '65465', 'ON-78785',
'13233', 'ON-5544', 'ON-54654', '23131', '98765', '25134', '13218', '84645',
'4687', '6546', '4988', '89796', '79878', '7198', '15726', '2132', '5310',
'13056', '4446', '16825', '16740', '3912', '19601', '13200', '12981',
'ON-3332', '13166', 'ON-3144', 'ON-1251', 'ON-2799', 'ON-2338', '7286',
'ON-2381', 'ON-3102', 'ON-2938', '64782', '5407', '54641', '46379',
'G151151', '5007', '6011', '5050', '20869', '20204', '12410', '10488',
'14582', '13574', '12982', '7884', '7788', '13417', '7922', '16744',
'16746', '16756', '8292', '16745', '19989', '8297', '5020', '14184',
'17161', '20767', '20753', '20289', '19979', '19975', '20272', '4292',
'G9341010', '14791', '5121', 'ON-1767', 'ON-581', 'ON-700', 'ON-437',
'ON-562', 'ON-1726', 'OFF-1060', 'ON-147', 'OFF-612', 'OFF-635', 'OFF-857',
'ON-900280', 'ON-1934', 'ON-1922', 'ON-2258', 'OFF-2537', 'ON-2872',
'ON-2450', 'ON-2265', 'OFF-2900', 'ON-2551', 'ON-1867', 'ON-2086',
'ON-2348', 'OFF-2706', 'ON-2244', 'ON-2134', 'ON-2654', 'ON-2346',
'ON-1984', 'ON-1243', 'OFF-1266', 'ON-1276', 'ON-2452', 'ON-2179',
'ON-2931', 'ON-2164', 'ON-2468', 'ON-1473', 'ON-1481', 'ON-1521', 'ON-2455',
'ON-2104', 'ON-2295', 'ON-1540', 'ON-900501', 'ON-1351', 'OFF-1364',
'ON-2704', 'ON-1757', 'ON-1690', 'ON-1670', 'ON-1671', 'ON-1689', 'ON-1704',
'ON-1714', 'ON-1655', 'ON-1709', 'ON-1737', 'ON-1725', 'ON-1750', 'ON-1731',
'ON-1715', 'ON-1745', 'ON-1751', 'ON-2191', 'OFF-2686', 'ON-1815',
'ON-2052', 'ON-2019', 'ON-1820', 'ON-1717', 'ON-1713', 'ON-1661',
'OFF-1664', 'ON-1703', 'ON-1734', 'ON-1735', 'ON-1656', 'ON-1705',
'ON-1733', 'ON-1708', 'ON-1666', 'ON-1667', 'ON-1658', 'ON-900487',
'ON-900214', 'ON-1676', 'ON-2378', 'ON-1654', 'ON-2417', 'ON-1488',
'ON-1500', 'ON-1506', 'ON-2875', 'ON-1531', 'ON-2099', 'ON-2195', 'ON-2038',
'ON-1490', 'ON-1489', 'ON-1501', 'ON-1627', 'ON-1929', 'ON-900431',
'ON-1462', 'ON-1466', 'OFF-1468', 'ON-1420', 'ON-1479', 'ON-900543',
'ON-1485', 'ON-1493', 'ON-2347', 'ON-1499', 'ON-2324', 'ON-2733', 'ON-1736',
'ON-1720', 'ON-1674', 'ON-1849', 'ON-1836', 'ON-1846', 'ON-2140',
'OFF-2856', 'ON-2128', 'OFF-2524', 'ON-1845', 'ON-2336', 'ON-1945',
'ON-2008', 'ON-1900', 'ON-2117', 'ON-1837', 'ON-2199', 'ON-2200', 'ON-1821',
'ON-2060', 'ON-1804', 'ON-1803', 'ON-2364', 'ON-2068', 'ON-2474', 'ON-1895',
'ON-1838', 'ON-2024', 'ON-2653', 'ON-1621', 'OFF-1145', 'OFF-994',
'OFF-999', 'ON-1003', 'ON-812', 'OFF-1033', 'ON-1048', 'OFF-1058',
'ON-1053', 'ON-1071', 'ON-1088', 'ON-256', 'ON-207', 'ON-206', 'ON-184',
'OFF-268', 'ON-285', 'OFF-286', 'ON-649', 'ON-301', 'OFF-645', 'ON-338',
'OFF-323', 'ON-347', 'ON-351', 'ON-350', 'ON-354', 'ON-719', 'ON-723',
'ON-137', 'ON-112', 'ON-141', 'ON-752', 'ON-791', 'OFF-802', 'OFF-822',
'ON-573', 'ON-616', 'OFF-587', 'ON-641', 'ON-664', 'ON-336', 'OFF-676',
'ON-687', 'ON-695', 'ON-439', 'ON-406', 'ON-659', 'OFF-890', 'ON-900',
'ON-935', 'ON-228', 'ON-942', 'ON-954', 'OFF-957', 'ON-961', 'ON-830',
'OFF-966', 'OFF-969', 'OFF-951', 'ON-1043', 'OFF-1042', 'ON-1055',
'ON-1109', 'ON-2212', 'ON-2036', 'OFF-1221', 'ON-1238', 'ON-1331',
'OFF-1353', 'ON-1343', 'ON-2014', 'ON-1995', 'ON-2133', 'OFF-2189',
'ON-1581', 'OFF-1595', 'ON-1556', 'ON-1580', 'OFF-1591', 'ON-2437',
'ON-900466', 'ON-1611', 'OFF-1612', 'ON-1624', 'ON-2765', 'ON-1927',
'ON-2361', 'ON-2054', 'ON-1633', 'ON-1503', 'OFF-2546', 'ON-1512',
'ON-1536', 'ON-2543', 'ON-2558', 'ON-2237', 'ON-1535', 'ON-2436',
'OFF-1547', 'ON-2380', 'ON-2116', 'ON-2820', 'ON-1563', 'ON-900512',
'ON-1568', 'ON-1570', 'ON-900514', 'ON-1130', 'ON-1632', 'ON-2359',
'ON-3176', 'ON-2132', 'ON-2012', 'ON-1762', 'ON-900230', 'ON-2299',
'ON-3552', 'ON-2557', 'ON-2129', 'ON-1918', 'OFF-2552', 'ON-2235',
'OFF-2773', 'ON-2123', 'ON-2658', 'ON-1866', 'ON-2506', 'OFF-2703',
'ON-2882', 'ON-2649', 'ON-2997', 'ON-1925', 'OFF-3096', 'ON-3297',
'ON-3359', 'ON-3352', 'ON-3357', 'ON-3378', 'ON-3071', 'OFF-2702',
'ON-2801', 'ON-2689', 'ON-2416', 'ON-3305', 'OFF-2695', 'ON-2069',
'ON-3318', 'OFF-3681', 'ON-1541', 'ON-2248', 'ON-2249', 'ON-2250',
'ON-2259', 'ON-2280', 'ON-3345', 'OFF-3545', 'ON-2286', 'ON-2293',
'ON-2277', 'ON-1180', 'ON-2304', 'OFF-3575', 'OFF-2384', 'OFF-2513',
'ON-2444', 'OFF-3218', 'ON-2497', 'ON-2708', 'ON-2774', 'ON-2667',
'ON-2803', 'OFF-3044', 'ON-2290', 'ON-2791', 'ON-2810', 'ON-2767',
'ON-2415', 'ON-2489', 'ON-2180', 'ON-2131', 'ON-2207', 'ON-2233', 'ON-3045',
'ON-3675', 'ON-2260', 'ON-2700', 'ON-2418', 'ON-2924', 'OFF-2828',
'ON-2536', 'ON-3127', 'ON-2472', 'ON-2482', 'ON-3098', 'ON-2473', 'ON-3073',
'ON-2855', 'OFF-2709', 'ON-2789', 'ON-2589', 'ON-2409', 'ON-3455',
'OFF-3556', 'ON-2510', 'ON-3120', 'ON-2457', 'ON-2303', 'ON-2044',
'ON-2313', 'ON-2326', 'ON-2312', 'OFF-2391', 'ON-2438', 'OFF-3548',
'ON-2581', 'ON-2525', 'ON-2538', 'ON-2433', 'ON-3300', 'ON-2487', 'ON-2754',
'OFF-3049', 'ON-2370', 'ON-3151', 'ON-3100', 'ON-3101', 'ON-1044',
'ON-2431', 'ON-2371', 'ON-2714', 'OFF-3544', 'OFF-2388', 'ON-2790',
'OFF-2918', 'ON-2681', 'ON-2512', 'ON-2511', 'ON-2521', 'OFF-2539',
'ON-3551', 'OFF-3549', 'OFF-3462', 'ON-2745', 'ON-2778', 'OFF-2821',
'ON-900498', 'ON-2812', 'OFF-2955', 'ON-2840', 'ON-2847', 'ON-3309',
'OFF-2917', 'OFF-2857', 'ON-2795', 'ON-2793', 'ON-2796', 'ON-2873',
'ON-2874', 'OFF-2870', 'ON-2889', 'ON-2719', 'ON-2824', 'ON-2861',
'ON-2865', 'ON-2866', 'OFF-2826', 'OFF-2898', 'ON-3301', 'OFF-2961',
'ON-2878', 'OFF-2886', 'ON-2914', 'ON-2909', 'OFF-2906', 'ON-2922',
'OFF-3682', 'ON-2937', 'ON-2913', 'OFF-2916', 'ON-2923', 'OFF-3006',
'OFF-3046', 'OFF-3042', 'OFF-3050', 'OFF-2642', 'ON-3093', 'ON-2685',
'OFF-3112', 'ON-3576', 'OFF-3094', 'OFF-3126', 'ON-3129', 'ON-3152',
'ON-3153', 'ON-3171', 'ON-3177', 'ON-3217', 'ON-2617', 'ON-3654', 'ON-3677',
'ON-1817', 'ON-3684', 'ON-3686', 'ON-3685', 'ON-3278', 'ON-3317', 'ON-3316',
'ON-3325', 'ON-3349', 'ON-3351', 'ON-3391', 'ON-3398', 'ON-3451', 'ON-3414',
'ON-3452', 'ON-3412', 'ON-3453', 'ON-3417', 'OFF-3473', 'ON-3457',
'ON-3523', 'ON-3546', 'ON-3554', 'ON-3553', 'ON-900552', 'G12941370',
'6479', '14192', '87546', '19755', '16751', '2095', '12244', '12363',
'17510', '19935', '7973', '13189', '19733', '19928', '21124', '16725',
'7244', '3027', '11426', '12732', '8530', '10301', '19555', '19706',
'20097', '13156', '14690', '4183', '8340', '18026', '12297', '6577',
'11301', '12980', '18138', '5603', '17587', '19118', '12210', '7292',
'17577', '16578', '7895', '200186', '20100', '34541', '19370', '11111',
'1492', '1111', '2556', '3445643643', '20379', 'ON-2338P', '20899')


And the explain plan for the above query can be found in the below link.
Link: https://explain.depesz.com/s/y3J8 <http://>  

Please help me tune this query or logic to rewrite at the painful area in
the query.

Thanks in Advance!

Regards,
Pavan









--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


On Wed, May 23, 2018 at 12:01:06AM -0700, pavan95 wrote:
> As said, created index on the res_users.res_employee_id and the below link
> is the explain plan result.
> 
> Link:  https://explain.depesz.com/s/hoct
> 
> And the cost of Previous query is 92,129 and the cost of current modified
> query after creating the above said index is 91,462. But good thing is we

Forget the cost - that's postgres *model* of the combined IO+CPU.
If the model is off, that's may cause bad plans and could be looked into
further.

In any case, that index cut your runtime from 75sec to 60sec (in spite of the
modelled cost).

It looks like you resolved the bad estimate on the users table?

> 2. tms_workflow_history:
> Indexes:
>     "tms_workflow_history_pkey" PRIMARY KEY, btree (id)
>     "curract_state_isfinal_app_idx" btree (current_activity, state, is_final_approver)
>     "timesheet_id_group_id_active_idx" btree (timesheet_id, group_id, active)
>     "tms_wkf_his_active_is_final_approveridx" btree (active, is_final_approver)
>     "tms_wkf_his_group_id_idx" btree (group_id)
>     "tms_wkf_his_timesheet_id_idx" btree (timesheet_id)
>     "tms_wkf_hist_current_activity_idx" btree (current_activity)
>     "tms_wkf_hist_state_idx" btree (state)
>     "wsr_id_idx" btree (wsr_id)

How big is the table ?  And curract_state_isfinal_app_idx ?
Have these been reindexed (or pg_repacked) recently?

It seems to me that the remaining query optimization is to improve this:
> Bitmap Heap Scan on tms_workflow_history history (cost=193.19..1,090.50 rows=6,041 width=12) (actual
time=3.692..15.714rows=11,351 loops=1)
 

I think you could consider clustering (or repacking) the table on
curract_state_isfinal_app_idx (but you'll have to judge if that's okay and
won't negatively affect other queries).

But, what's your target runtime ?  Improvements here could cut at most 15sec
off the total 60sec.  If you're hoping to save more than that, you'll need to
(also) look further than the query:

 - postgres parameters: what are shared_buffers, work_mem, effective_cache_size ?
   + https://wiki.postgresql.org/wiki/Server_Configuration
 - are there other DBs/applications running on the server/VM ?
 - kernel tuning (readahead, VM parameters, write cache, scheduler, THP, etc)
 - server hardware (what OS? storage? RAM? filesystem?)
 - how does the storage perform outside of postgres?
   + something like this: /usr/sbin/bonnie++ -f -n0 -x4 -d /var/lib/pgsql

Justin


Hi Justin,

>How big is the table ?  And curract_state_isfinal_app_idx ? 
>Have these been reindexed (or pg_repacked) recently? 

The size of the table 'tms_workflow_history' is 7600Kb(which is pretty
small). Yes those indexes were dropped and recreated. 

>It looks like you resolved the bad estimate on the users table? 
Yeah, even I think the same.

Please find the explain plan which got increased again vastly. Is this
because of the increase in rows?

Link :  https://explain.depesz.com/s/Ifr <http://>  

The above is the explain plan taken from production server. And this is the
main plan to tune.

Please let me know the where I'm going wrong. Thank you in Advance.!!

Regards,
Pavan



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


On Wed, May 23, 2018 at 07:03:18AM -0700, pavan95 wrote:
> Please find the explain plan which got increased again vastly. Is this
> because of the increase in rows?
> 
> Link :  https://explain.depesz.com/s/Ifr <http://>  

That's explain without "analyze", so not very useful.

There's handful of questions:

On Wed, May 23, 2018 at 08:43:22AM -0500, Justin Pryzby wrote:
>  - postgres parameters: what are shared_buffers, work_mem, effective_cache_size ?
>    + https://wiki.postgresql.org/wiki/Server_Configuration
>  - are there other DBs/applications running on the server/VM ?
>  - kernel tuning (readahead, VM parameters, write cache, scheduler, THP, etc)
>  - server hardware (what OS? storage? RAM? filesystem?)
>  - how does the storage perform outside of postgres?
>    + something like this: /usr/sbin/bonnie++ -f -n0 -x4 -d /var/lib/pgsql

Justin


Hi Justin,

Please find the below explain plan link.

Link:  https://explain.depesz.com/s/owE <http://>  


Any help is appreciated. Thanks in Advance.


Regards,
Pavan



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html


On Wed, May 23, 2018 at 10:20:42PM -0700, pavan95 wrote:
> Hi Justin,
> 
> Please find the below explain plan link.
> 
> Link:  https://explain.depesz.com/s/owE <http://>  

That's explain analyze but explain(analyze,buffers) is better.

Is this on a completely different server than the previous plans ?

This rowcount misestimate appears to be a significant part of the problem:
 Merge Join (cost=228.77..992.11 ROWS=20 width=22) (actual time=4.353..12.439 ROWS=343 loops=1)
    Merge Cond: (history_2.timesheet_id = header_2.id)

You could look at the available stats for that table's column in pg_stats.
Is there an "most common values" list?
Maybe you need to ALTER TABLE .. SET STATISTICS 999 (or some increased value)
and re-analyze ?

You can see these are also taking large component of the query time:

 Bitmap Index Scan on ts_detail_release_no_idx (cost=0.00..33.86 rows=1,259 width=0) (actual time=0.304..0.304
rows=1,331LOOPS=327)
 
    Index Cond: ((release_no)::text = 'paid_time_off'::text)
...
 Bitmap Index Scan on ts_detail_release_no_idx (cost=0.00..33.86 rows=1,259 width=0) (actual time=0.304..0.304
rows=1,331LOOPS=343)
 
    Index Cond: ((release_no)::text = 'paid_time_off'::text)

I wonder whether it would help to
CREATE INDEX ON tms_timesheet_details(timesheet_header_id) WHERE
((release_no)::text = 'paid_time_off'::text);

In addition to the other settings I asked about, it might be interesting to
SHOW effective_io_concurrency;

You're at the point where I can't reasonably contribute much more.

Justin