Thread: Performance of a Query


Performance of a Query

From:
"Kumar, Virendra"
Date:

Hello Gurus,

I am struggling to tune a query that does a join on top of an aggregate over
around 3 million rows. The plan and SQL are attached to the email.

Below are the system details:

PGSQL version – 10.1

OS – RHEL 3.10.0-693.5.2.el7.x86_64

Binary – downloaded from postgres.org, compiled and installed.

Hardware – virtual machine with 8 vCPUs and 32 GB of RAM, on an XFS filesystem.

Please let me know if you need more information.

Regards,

Virendra




This message is intended only for the use of the addressee and may contain
information that is PRIVILEGED AND CONFIDENTIAL.

If you are not the intended recipient, you are hereby notified that any
dissemination of this communication is strictly prohibited. If you have
received this communication in error, please erase all copies of the message
and its attachments and notify the sender immediately. Thank you.
Attachments

Re: Performance of a Query

From:
Scott Marlowe
Date:
On Tue, Jan 9, 2018 at 2:18 PM, Kumar, Virendra
<Virendra.Kumar@guycarp.com> wrote:
> Hello Gurus,
>
> I am struggling to tune a query that does a join on top of an aggregate over
> around 3 million rows. The plan and SQL are attached to the email.
>
> Below are the system details:
>
> PGSQL version – 10.1
>
> OS – RHEL 3.10.0-693.5.2.el7.x86_64
>
> Binary – downloaded from postgres.org, compiled and installed.
>
> Hardware – virtual machine with 8 vCPUs and 32 GB of RAM, on an XFS filesystem.

I uploaded your query plan here: https://explain.depesz.com/s/14r6

The most expensive part is the merge join at the end.

Lines like this one: "Buffers: shared hit=676 read=306596, temp
read=135840 written=135972"

tell me that your sorts etc. are spilling to disk, so the first thing
to try is upping work_mem a bit. Don't go crazy, as it can run your
machine out of memory if you do, but doubling or tripling it and
seeing the effect on query performance is a good place to start.

The good news is that most of your row estimates are about right, so
the query planner is doing what it can to make the query fast, but I'm
guessing that if you get work_mem high enough it will switch from a
merge join to a hash join or something more efficient for large
numbers of rows.
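
For example, you could raise it for just your session and re-run the
plan (the 64MB figure here is only a starting guess, not a tested
recommendation):

SET work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS) <your query here>;

If the "temp read/written" numbers shrink or disappear, work_mem is
the right knob.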


RE: Performance of a Query

From:
"Kumar, Virendra"
Date:
Thank you Scott!
I currently have work_mem set to 4MB, shared_buffers to 8GB, and huge pages on.
I gradually increased work_mem to 1GB, but it did not help a bit. Am I missing something obvious?
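
(For each run I changed the value at the session level before
re-running the query, along these lines:

SET work_mem = '1GB';
SHOW work_mem;  -- confirm the session picked up the new value
)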

Regards,
Virendra
-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: Tuesday, January 09, 2018 5:08 PM
To: Kumar, Virendra
Cc: pgsql-performance@postgresql.org
Subject: Re: Performance of a Query

On Tue, Jan 9, 2018 at 2:18 PM, Kumar, Virendra <Virendra.Kumar@guycarp.com> wrote:
> Hello Gurus,
>
> I am struggling to tune a query that does a join on top of an
> aggregate over around 3 million rows. The plan and SQL are attached to the email.
>
> Below are the system details:
>
> PGSQL version – 10.1
>
> OS – RHEL 3.10.0-693.5.2.el7.x86_64
>
> Binary – downloaded from postgres.org, compiled and installed.
>
> Hardware – virtual machine with 8 vCPUs and 32 GB of RAM, on an XFS filesystem.

I uploaded your query plan here: https://explain.depesz.com/s/14r6

The most expensive part is the merge join at the end.

Lines like this one: "Buffers: shared hit=676 read=306596, temp
read=135840 written=135972"

tell me that your sorts etc. are spilling to disk, so the first thing to try is upping work_mem a bit. Don't go crazy,
as it can run your machine out of memory if you do, but doubling or tripling it and seeing the effect on query
performance is a good place to start.

The good news is that most of your row estimates are about right, so the query planner is doing what it can to make the
query fast, but I'm guessing that if you get work_mem high enough it will switch from a merge join to a hash join or
something more efficient for large numbers of rows.



Re: Performance of a Query

From:
Scott Marlowe
Date:
On Tue, Jan 9, 2018 at 3:25 PM, Kumar, Virendra
<Virendra.Kumar@guycarp.com> wrote:
> Thank you Scott!
> I currently have work_mem set to 4MB, shared_buffers to 8GB, and huge pages on.
> I gradually increased work_mem to 1GB, but it did not help a bit. Am I missing something obvious?
> From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
> Sent: Tuesday, January 09, 2018 5:08 PM
> To: Kumar, Virendra
> Cc: pgsql-performance@postgresql.org
> Subject: Re: Performance of a Query

Try it with something reasonable like 64MB, then post your query
plans to explain.depesz.com and then here, and let's compare. Note that
some queries are just slow, and this one is handling a lot of data, so
there's only so much to do if an index won't fix it.
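
If you want to scope the change to a single query, SET LOCAL inside a
transaction also works; the setting reverts when the transaction ends:

BEGIN;
SET LOCAL work_mem = '64MB';
EXPLAIN (ANALYZE, BUFFERS) <your query here>;
COMMIT;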


RE: Performance of a Query

From:
"Kumar, Virendra"
Date:
It did not seem to help.
See attachment.


Regards,
Virendra
-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: Tuesday, January 09, 2018 6:00 PM
To: Kumar, Virendra
Cc: pgsql-performance@postgresql.org
Subject: Re: Performance of a Query

On Tue, Jan 9, 2018 at 3:25 PM, Kumar, Virendra <Virendra.Kumar@guycarp.com> wrote:
> Thank you Scott!
> I currently have work_mem set to 4MB, shared_buffers to 8GB, and huge pages on.
> I gradually increased work_mem to 1GB, but it did not help a bit. Am I missing something obvious?
> From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
> Sent: Tuesday, January 09, 2018 5:08 PM
> To: Kumar, Virendra
> Cc: pgsql-performance@postgresql.org
> Subject: Re: Performance of a Query

Try it with something reasonable like 64MB, then post your query plans to explain.depesz.com and then here, and let's
compare. Note that some queries are just slow, and this one is handling a lot of data, so there's only so much to do if
an index won't fix it.



Attachments

Re: Performance of a Query

From:
Scott Marlowe
Date:
On Tue, Jan 9, 2018 at 4:09 PM, Kumar, Virendra
<Virendra.Kumar@guycarp.com> wrote:
> It did not seem to help.
> See attachment.

Yeah, while it's still writing, it's about half as much, but most of
the time seems to be spent merging multiple data sets. I'm wondering
what other non-default values you might have set. Are you running
on SSDs? If so, lowering random_page_cost might help, but then again,
this might just be a very expensive query.
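
For example (1.1 is a common starting point for SSD-backed storage,
but treat it as a guess to validate, not a measured value):

SET random_page_cost = 1.1;
EXPLAIN (ANALYZE, BUFFERS) <your query here>;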


RE: Performance of a Query

From:
Danylo Hlynskyi
Date:
Can you try to extract the filter part as a CTE? Something like:

with filtered as (select ... where policyid = 123456)
select ...  -- the original query, but reading from "filtered" instead of the base table
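
In Postgres 10 a CTE is always materialized (it acts as an
optimization fence), so the filter would run once up front and the
rest of the query would join against a much smaller intermediate
result.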

On Jan 10, 2018 at 1:10, "Kumar, Virendra" <Virendra.Kumar@guycarp.com> wrote:
It did not seem to help.
See attachment.


Regards,
Virendra
-----Original Message-----
From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
Sent: Tuesday, January 09, 2018 6:00 PM
To: Kumar, Virendra
Cc: pgsql-performance@postgresql.org
Subject: Re: Performance of a Query

On Tue, Jan 9, 2018 at 3:25 PM, Kumar, Virendra <Virendra.Kumar@guycarp.com> wrote:
> Thank you Scott!
> I currently have work_mem set to 4MB, shared_buffers to 8GB, and huge pages on.
> I gradually increased work_mem to 1GB, but it did not help a bit. Am I missing something obvious?
> From: Scott Marlowe [mailto:scott.marlowe@gmail.com]
> Sent: Tuesday, January 09, 2018 5:08 PM
> To: Kumar, Virendra
> Cc: pgsql-performance@postgresql.org
> Subject: Re: Performance of a Query

Try it with something reasonable like 64MB, then post your query plans to explain.depesz.com and then here, and let's compare. Note that some queries are just slow, and this one is handling a lot of data, so there's only so much to do if an index won't fix it.



Re: Performance of a Query

From:
Laurenz Albe
Date:
Scott Marlowe wrote:
> On Tue, Jan 9, 2018 at 2:18 PM, Kumar, Virendra
> <Virendra.Kumar@guycarp.com> wrote:
> > Hello Gurus,
> > 
> > I am struggling to tune a query that does a join on top of an aggregate over
> > around 3 million rows. The plan and SQL are attached to the email.
> > 
> > Below are the system details:
> > 
> > PGSQL version – 10.1
> > 
> > OS – RHEL 3.10.0-693.5.2.el7.x86_64
> > 
> > Binary – downloaded from postgres.org, compiled and installed.
> > 
> > Hardware – virtual machine with 8 vCPUs and 32 GB of RAM, on an XFS filesystem.
> 
> I uploaded your query plan here: https://explain.depesz.com/s/14r6
> 
> The most expensive part is the merge join at the end.
> 
> Lines like this one: "Buffers: shared hit=676 read=306596, temp
> read=135840 written=135972"
> 
> tell me that your sorts etc. are spilling to disk, so the first thing
> to try is upping work_mem a bit. Don't go crazy, as it can run your
> machine out of memory if you do, but doubling or tripling it and
> seeing the effect on query performance is a good place to start.
> 
> The good news is that most of your row estimates are about right, so
> the query planner is doing what it can to make the query fast, but I'm
> guessing that if you get work_mem high enough it will switch from a
> merge join to a hash join or something more efficient for large
> numbers of rows.

Looking at the plan, I'd guess that the following index could be helpful:

CREATE INDEX ON ap.site_exposure(portfolio_id, peril_id, account_id);

I don't know how much it would buy you, but you could avoid the
sequential scan and the sort that way.
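
If the table is in active use, the index can be built without blocking
writes and the plan re-checked afterwards (the index name here is just
illustrative):

CREATE INDEX CONCURRENTLY site_exposure_pf_peril_acct_idx
    ON ap.site_exposure (portfolio_id, peril_id, account_id);

EXPLAIN (ANALYZE, BUFFERS) <the original query>;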

Yours,
Laurenz Albe