Обсуждение: [PERFORM] query performance issue

Поиск
Список
Период
Сортировка

[PERFORM] query performance issue

От
Samir Magar
Дата:
Hello,
I am having performance issues with one of the query.
The query is taking 39 min to fetch 3.5 mil records.

I want to reduce that time to 15 mins. 
could you please suggest something to its performance?

server configuration:
 CPUs = 4
memory = 16 GM
shared_buffers = 3 GB
work_mem = 100MB
effective_cache_size = 12 GB

we are doing the vacuum/analyze regularly on the database. 

attached is the query with its explain plan.

Thanks,
Samir Magar  
Вложения

Re: [PERFORM] query performance issue

От
Pavel Stehule
Дата:
Hi

please send EXPLAIN ANALYZE output.

Regards

Pavel

2017-11-15 10:33 GMT+01:00 Samir Magar <samirmagar8@gmail.com>:
Hello,
I am having performance issues with one of the query.
The query is taking 39 min to fetch 3.5 mil records.

I want to reduce that time to 15 mins. 
could you please suggest something to its performance?

server configuration:
 CPUs = 4
memory = 16 GM
shared_buffers = 3 GB
work_mem = 100MB
effective_cache_size = 12 GB

we are doing the vacuum/analyze regularly on the database. 

attached is the query with its explain plan.

Thanks,
Samir Magar  


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] query performance issue

От
Samir Magar
Дата:
please find the EXPLAIN ANALYZE output.

On Wed, Nov 15, 2017 at 3:13 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

please send EXPLAIN ANALYZE output.

Regards

Pavel

2017-11-15 10:33 GMT+01:00 Samir Magar <samirmagar8@gmail.com>:
Hello,
I am having performance issues with one of the query.
The query is taking 39 min to fetch 3.5 mil records.

I want to reduce that time to 15 mins. 
could you please suggest something to its performance?

server configuration:
 CPUs = 4
memory = 16 GM
shared_buffers = 3 GB
work_mem = 100MB
effective_cache_size = 12 GB

we are doing the vacuum/analyze regularly on the database. 

attached is the query with its explain plan.

Thanks,
Samir Magar  


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



Вложения

Re: [PERFORM] query performance issue

От
Pavel Stehule
Дата:


2017-11-15 13:54 GMT+01:00 Samir Magar <samirmagar8@gmail.com>:
please find the EXPLAIN ANALYZE output.

On Wed, Nov 15, 2017 at 3:13 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
Hi

please send EXPLAIN ANALYZE output.

Regards

Pavel

2017-11-15 10:33 GMT+01:00 Samir Magar <samirmagar8@gmail.com>:
Hello,
I am having performance issues with one of the query.
The query is taking 39 min to fetch 3.5 mil records.

I want to reduce that time to 15 mins. 
could you please suggest something to its performance?

server configuration:
 CPUs = 4
memory = 16 GM
shared_buffers = 3 GB
work_mem = 100MB
effective_cache_size = 12 GB

we are doing the vacuum/analyze regularly on the database. 

attached is the query with its explain plan.



There is wrong plan due wrong estimation

for this query you should to penalize nested loop

set enable_nestloop to off;

before evaluation of this query


Thanks,
Samir Magar  


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




Re: [PERFORM] query performance issue

От
Justin Pryzby
Дата:
On Wed, Nov 15, 2017 at 03:03:39PM +0530, Samir Magar wrote:
> I am having performance issues with one of the query.
> The query is taking 39 min to fetch 3.5 mil records.
> 
> I want to reduce that time to 15 mins.
> could you please suggest something to its performance?

> "HashAggregate  (cost=4459.68..4459.69 rows=1 width=27) (actual time=2890035.403..2892173.601 rows=3489861 loops=1)"

Looks to me like the problem is here:

> "  ->  Index Only Scan using idxdq7 on dlr_qlfy  (cost=0.43..4.45 ROWS=1 width=16) (actual time=0.009..0.066 ROWS=121
loops=103987)"
> "        Index Cond: ((qlfy_grp_id = dlr_grp.dlr_grp_id) AND (qlf_flg = 'N'::bpchar) AND (cog_grp_id =
dlr_grp_dlr_xref_1.dlr_grp_id))"
> "        Heap Fetches: 0"

Returning 100x more rows than expected and bubbling up through a cascade of
nested loops.

Are those 3 conditions independent ?  Or, perhaps, are rows for which
"qlfy_grp_id=dlr_grp.dlr_grp_id" is true always going to have
"cog_grp_id = dlr_grp_dlr_xref_1.dlr_grp_id" ?

Even if it's not "always" true, if rows which pass the one condition are more
likely to pass the other condition, this will cause an underestimate, as
obvserved.

You can do an experiment SELECTing just from those two tables joined and see if
you can reproduce the problem with poor rowcount estimate (hopefully in much
less than 15min).

If you can't drop one of the two conditions, you can make PG treat it as a
single condition for purpose of determining expected selectivity, using a ROW()
comparison like:

ROW(qlfy_grp_id, cog_grp_id) = ROW(dlr_grp.dlr_grp_id, dlr_grp_dlr_xref_1.dlr_grp_id)

If you're running PG96+ you may also be able to work around this by adding FKs.

Justin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] query performance issue

От
Gunther
Дата:

On 11/15/2017 8:12, Pavel Stehule wrote:
There is wrong plan due wrong estimation

for this query you should to penalize nested loop

set enable_nestloop to off;

before evaluation of this query

You are not the only one with this issue. May I suggest to look at this thread a little earlier this month.

http://www.postgresql-archive.org/OLAP-reporting-queries-fall-into-nested-loops-over-seq-scans-or-other-horrible-planner-choices-tp5990160.html

where this has been discussed in some length.

regards,
-Gunther


Re: [PERFORM] query performance issue

От
Pavel Stehule
Дата:


2017-11-15 20:58 GMT+01:00 Gunther <raj@gusw.net>:

On 11/15/2017 8:12, Pavel Stehule wrote:
There is wrong plan due wrong estimation

for this query you should to penalize nested loop

set enable_nestloop to off;

before evaluation of this query

You are not the only one with this issue. May I suggest to look at this thread a little earlier this month.

http://www.postgresql-archive.org/OLAP-reporting-queries-fall-into-nested-loops-over-seq-scans-or-other-horrible-planner-choices-tp5990160.html

where this has been discussed in some length.

It is typical issue. The source of these problems are correlations between columns (it can be fixed partially by multicolumn statistics in PostgreSQL 10). Another problem is missing multi table statistics - PostgreSQL planner expects so any value from dictionary has same probability, what is not usually true. Some OLAP techniques like calendar tables has usually very bad impact on estimations with this results.

Regards

Pavel


regards,
-Gunther