Discussion: Query very different speeds on seemingly similar data

Query very different speeds on seemingly similar data

From:
Rob Northcott
Date:

I’ve got a weird problem that I can’t work out…

A customer was complaining that their system was running slowly.  They’ve only been using it for a few days, so we tested it on our copy of their data and it’s running at normal speed.

Uploaded our backup to the live server and it’s still quick.

Restored a new backup of their live database on our local test server and it’s REALLY slow.  So the problem seems to be data-related.

I’ve run ANALYZE, VACUUM and REINDEX and there’s still no change.
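
For reference, the maintenance pass described here would look something like this (table names taken from the query further down):

  VACUUM (VERBOSE, ANALYZE) stock;
  VACUUM (VERBOSE, ANALYZE) sales_invoicedetails;
  VACUUM (VERBOSE, ANALYZE) sales_invoicedetails_purchlinks;
  REINDEX TABLE sales_invoicedetails;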

I’ve stripped the query down to the bare minimum that causes the speed difference.

The only difference between the “good” data and the “bad” data involved in this query is about 80 extra records in an invoice table (which had 250,000 records to start with).

 

It’s behaving (in my opinion) the way it does when ANALYZE isn’t run after restoring data, or as if the indexes are broken.  EXPLAIN ANALYZE shows the query being executed in completely different ways on the two databases.
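
One quick way to rule out the missing-ANALYZE theory, as a sketch using the standard statistics view:

  SELECT relname, last_vacuum, last_analyze, last_autoanalyze
  FROM pg_stat_user_tables
  WHERE relname IN ('stock', 'sales_invoicedetails',
                    'sales_invoicedetails_purchlinks');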

 

The query, now I’ve stripped it down to the offending part, is as follows:

SELECT stk_key,
       (SELECT SUM(stdp_quantity)
        FROM sales_invoicedetails_purchlinks
        LEFT JOIN sales_invoicedetails ON std_unique = stdp_std_unique
        WHERE stdp_loc_key = '__NBI' AND std_stk_key = stock.stk_key
       ) AS level
FROM stock

 

Table “stock” has about 5,000 records, sales_invoicedetails has about 250,000, and sales_invoicedetails_purchlinks has about 80.

 

The bit that kills it is the “std_stk_key = stock.stk_key” condition in the sub-query.

 

On the “good” data it runs in less than 100 milliseconds, on the “bad” data it takes ten minutes!

 

Explain files attached (assuming attachments will get through to the group – otherwise what’s the best way to post it?)

 

I’m completely stumped – any suggestions most welcome!

 

Best Regards

 

Rob Northcott

Software Developer (UK Office, formerly TEAM Systems)

 

Phone   +44 1752 712052

 

Compilator AB

Södergatan 22

SE-211 34 Malmö

Sweden

www.compilator.com

 


 

Attachments

Re: Query very different speeds on seemingly similar data

From:
David Rowley
Date:
On Wed, 5 Jun 2019 at 04:55, Rob Northcott <Rob.Northcott@compilator.com> wrote:
> Explain files attached (assuming attachments will get through to the group – otherwise what’s the best way to post it?)

You're best to post the EXPLAIN ANALYZE output to
https://explain.depesz.com; what you attached is pretty horrible to
read, and details are only gained by hovering the mouse cursor over
the node. I for one struggle to work out which are the inner and outer
sides of the join with the output you've given, and can only work it
out by looking at the nested loop plan to see which side the parameter
is on.
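
For reference, a text-format plan that explain.depesz.com accepts can
be produced in psql along these lines (BUFFERS is optional):

  EXPLAIN (ANALYZE, BUFFERS)
  SELECT stk_key,
         (SELECT SUM(stdp_quantity)
          FROM sales_invoicedetails_purchlinks
          LEFT JOIN sales_invoicedetails ON std_unique = stdp_std_unique
          WHERE stdp_loc_key = '__NBI' AND std_stk_key = stock.stk_key
         ) AS level
  FROM stock;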

It appears that the sub-query plan has changed from a Nested Loop plan
to a Merge Join. The Merge Join is a pretty bad plan since the index
that provides the pre-sorted input must filter out many non-matching
rows. It's not quite clear to me why the planner chooses that index,
mostly because I'm too lazy to learn the format you've shown the plans
in, but if you had an index on sales_invoicedetails (std_stk_key,
std_unique), the planner would likely produce a better plan, or at
least a better Merge Join plan than the current one.
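
A minimal sketch of creating the suggested index (the index name is
illustrative, not from the thread):

  CREATE INDEX sales_invoicedetails_stk_unique_idx
      ON sales_invoicedetails (std_stk_key, std_unique);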

You'll likely want to ensure that random_page_cost has not been set to
something insane on the Merge Join instance. I'd also check
seq_page_cost and effective_cache_size.
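
One way to compare those settings between the fast and slow servers,
as a sketch:

  SELECT name, setting, source
  FROM pg_settings
  WHERE name IN ('random_page_cost', 'seq_page_cost',
                 'effective_cache_size');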

More recommendations might be easier to give if you show the plans in
a better format.

Apart from that, instead of performing a sub-query you could also LEFT
JOIN to a query similar to your sub-query, but with the std_stk_key =
stock.stk_key condition removed and a GROUP BY std_stk_key added, as
sketched below. However, that may perform worse if there are many more
std_stk_key groups than there are matching rows in stock.stk_key.
Experimentation might be required there.
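
A sketch of that rewrite, using the column names from the original
query (the alias "totals" is illustrative):

  SELECT stock.stk_key, totals.level
  FROM stock
  LEFT JOIN (SELECT std_stk_key, SUM(stdp_quantity) AS level
             FROM sales_invoicedetails_purchlinks
             LEFT JOIN sales_invoicedetails ON std_unique = stdp_std_unique
             WHERE stdp_loc_key = '__NBI'
             GROUP BY std_stk_key) AS totals
         ON totals.std_stk_key = stock.stk_key;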


--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



RE: Query very different speeds on seemingly similar data

From:
Rob Northcott
Date:

>-----Original Message-----
>From: David Rowley <david.rowley@2ndquadrant.com> 
>Sent: 05 June 2019 01:48
>To: Rob Northcott <Rob.Northcott@compilator.com>
>Cc: pgsql-general@lists.postgresql.org
>Subject: Re: Query very different speeds on seemingly similar data

Thanks for the reply David,

Those output files came from PGAdmin, but I agree they're not the easiest to look at.  I've tried writing the query in
different ways (which is easy enough with the minimal query here, but not so easy with the original massive query), but
it's still way slower on one set of data than on the other.
 

I'll try changing those settings on the test server and see if it makes any difference... We've got almost 100
databases running the same system (on the same server) and this is the only one that's been a problem so far, but it
could of course just be that we've been lucky.
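
A sketch of trying a setting per session before changing anything
globally (the value 1.1 is just an example to experiment with):

  BEGIN;
  SET LOCAL random_page_cost = 1.1;
  -- re-run the stripped-down query here with EXPLAIN (ANALYZE, BUFFERS)
  ROLLBACK;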
 

If I can't get anywhere with that I'll try to post the EXPLAIN ANALYZE output in a better format.

Thanks again for your reply

Rob