Обсуждение: Query optimization

Поиск
Список
Период
Сортировка

Query optimization

От
Siva Kumar
Дата:
We have a query as below:
=========================================================
select relationship_id as "Id",company_name as "Company",product_category_desc
as "Product",uom_desc as "UOM",shipment_term_desc as "Shipment
Term",payment_term_desc as "Payment Term",city_name as "Port",currency_name
as "Currency",activity_desc as "Activity",credit_days as "Credit Days" from

 _100001relationships rs,master_member mm,member_product_details
mpd,master_product_category mpc,master_uom mu,master_shipment_term
mst,master_payment_term mpt,member_financial_details mfd,master_currency
mc,member_commercial_details mcd,master_activity ma,master_city mcy

 where mpd.product_details_id = rs.product_details_id and
mpd.product_category_id=mpc.product_category_id and mpd.uom_id = mu.uom_id
and rs.shipment_term_id = mst.shipment_term_id and rs.payment_term_id =
mpt.payment_term_id and rs.commercial_details_id = mcd.commercial_details_id
and mcd.port = mcy.city_id and rs.financial_details_id =
mfd.financial_details_id and mfd.currency_id = mc.currency_id and
mpd.activity_id = ma.activity_id and mm.member_id = mpd.member_id

ORDER BY rs.relationship_id DESC;
=====================================================

The decision to keep the fields in different tables was taken in view of the
overall need of the system (there might be scope for improvement here too).
This query normally select about 10-20 rows. The problem is, the page load
takes about 4-5 seconds in the local network. The query run in psql terminal
takes about 2 second to execute (outputing 3 rows).

When hosted on the internet with most of our users using dialup connections,
and the query returning 10+ rows, this will not be acceptable.

Please point us towards the right direction to handle this type of problems.

Best regards,

Siva Kumar

Re: Query optimization

От
Richard Huxton
Дата:
On Friday 04 Oct 2002 9:17 am, Siva Kumar wrote:
> We have a query as below:
[snip query with many joins]
>
> The decision to keep the fields in different tables was taken in view of
> the overall need of the system (there might be scope for improvement here
> too).

If that's the way the design makes sense, stick with it. It's better to get
Postgresql to handle a clean design rather than mangle a design.

> This query normally select about 10-20 rows. The problem is, the page
> load takes about 4-5 seconds in the local network. The query run in psql
> terminal takes about 2 second to execute (outputing 3 rows).
>
> When hosted on the internet with most of our users using dialup
> connections, and the query returning 10+ rows, this will not be acceptable.

Start by running EXPLAIN SELECT ... and looking at how the parser is handling
the query. One thing you might find useful is to use explicit JOINs to tell
Postgresql what order to connect the tables. You might prefere EXPLAIN
ANALYSE SELECT ... which will calculate actual times for each stage. See the
online manuals for details.

If you are missing indexes, you can add them. If Postgresql is not using
indexes you already have then we can look at why. Finally, if the plan looks
OK, we can look at tuning sort memory or similar.

First stage though, run an EXPLAIN and if you need help understanding it post
the output back to the list.

- Richard Huxton

Re: Query optimization

От
Neil Conway
Дата:
Richard Huxton <dev@archonet.com> writes:
> First stage though, run an EXPLAIN and if you need help understanding it post
> the output back to the list.

Yep, that's a good idea. Another thing to try is to disable GEQO (SET
geqo = off;) and see what difference that makes to the performance of
the optimizer and the quality of the chosen plan (12 relations is
close to the cross-over point for the useability of the genetic
optimizer, IME).

Also, your comment regarding slow access over the Internet is not a
problem with PostgreSQL -- if it really *is* 2x or so slower when done
remotely, talk to your network admin.

Cheers,

Neil

--
Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC

Re: Query optimization

От
Siva Kumar
Дата:
On Friday 04 Oct 2002 3:44 pm, you wrote:
> > The decision to keep the fields in different tables was taken in view of
> > the overall need of the system (there might be scope for improvement here
> > too).
>
> If that's the way the design makes sense, stick with it. It's better to get
> Postgresql to handle a clean design rather than mangle a design.

We had a relook at the design and managed to add one reduntant field in a
table which made sense otherwise also. This shaved some time from the earlier
query. now the query looks like this.
======================================
 select relationship_id as "Id",company_name as
"Company",product_category_desc as "Product",uom_desc as
"UOM",shipment_term_desc as "Shipment Term",payment_term_desc as "Payment
Term",city_name as "Port",currency_name as "Currency",activity_desc as
"Activity",credit_days as "Credit Days"

from _100001relationships rs,master_member mm,member_product_details
mpd,master_product_category mpc,master_uom mu,master_shipment_term
mst,master_payment_term mpt,member_financial_details mfd,master_currency
mc,member_commercial_details mcd,master_activity ma,master_city mcy

 where mpd.product_details_id = rs.product_details_id and
mpd.product_category_id=mpc.product_category_id and mpd.uom_id = mu.uom_id
and rs.shipment_term_id = mst.shipment_term_id and rs.payment_term_id =
mpt.payment_term_id and rs.commercial_details_id = mcd.commercial_details_id
and mcd.port = mcy.city_id and rs.financial_details_id =
mfd.financial_details_id and mfd.currency_id = mc.currency_id and
mpd.activity_id = ma.activity_id and mm.member_id = rs.partner_id ORDER BY
rs.relationship_id DESC;
==================================================

> First stage though, run an EXPLAIN and if you need help understanding it
> post the output back to the list.

Giving below the output of EXPLAIN ANALYSE. I could not make much sense out of
it, please help!

NOTICE:  QUERY PLAN:

Sort  (cost=1382.45..1382.45 rows=1000 width=442) (actual time=3.47..3.47
rows=3 loops=1)
  ->  Merge Join  (cost=1263.12..1332.62 rows=1000 width=442) (actual
time=3.21..3.27 rows=3 loops=1)
        ->  Index Scan using master_activity_pkey on master_activity ma
(cost=0.00..52.00 rows=1000 width=50) (actual time=0.06..0.09 rows=4 loops=1)
        ->  Sort  (cost=1263.12..1263.12 rows=1000 width=392) (actual
time=3.04..3.05 rows=3 loops=1)
              ->  Merge Join  (cost=1143.79..1213.29 rows=1000 width=392)
(actual time=2.87..2.95 rows=3 loops=1)
                    ->  Index Scan using master_member_pkey on master_member
mm  (cost=0.00..52.00 rows=1000 width=47) (actual time=0.03..0.23 rows=45
loops=1)
                    ->  Sort  (cost=1143.79..1143.79 rows=1000 width=345)
(actual time=2.48..2.48 rows=3 loops=1)
                          ->  Merge Join  (cost=1024.46..1093.96 rows=1000
width=345) (actual time=2.30..2.41 rows=3 loops=1)
                                ->  Index Scan using
master_product_category_pkey on master_product_category mpc
(cost=0.00..52.00 rows=1000 width=37) (actual time=0.02..0.08 rows=7 loops=1)
                                ->  Sort  (cost=1024.46..1024.46 rows=1000
width=308) (actual time=2.17..2.17 rows=3 loops=1)
                                      ->  Merge Join  (cost=905.13..974.63
rows=1000 width=308) (actual time=1.96..2.10 rows=3 loops=1)
                                            ->  Index Scan using
master_uom_pkey on master_uom mu  (cost=0.00..52.00 rows=1000 width=41)
(actual time=0.03..0.06 rows=6 loops=1)
                                            ->  Sort  (cost=905.13..905.13
rows=1000 width=267) (actual time=1.87..1.87 rows=3 loops=1)
                                                  ->  Merge Join
(cost=785.80..855.30 rows=1000 width=267) (actual time=1.70..1.77 rows=3
loops=1)
                                                        ->  Index Scan using
member_product_details_pkey on member_product_details mpd  (cost=0.00..52.00
rows=1000 width=23) (actual time=0.03..0.11 rows=14 loops=1)
                                                        ->  Sort
(cost=785.80..785.80 rows=1000 width=244) (actual time=1.53..1.53 rows=3
loops=1)
                                                              ->  Merge Join
(cost=666.47..735.97 rows=1000 width=244) (actual time=1.35..1.47 rows=3
loops=1)
                                                                    ->  Index
Scan using master_shipment_term_pkey on master_shipment_term mst
(cost=0.00..52.00 rows=1000 width=50) (actual time=0.02..0.06 rows=5 loops=1)
                                                                    ->  Sort
(cost=666.47..666.47 rows=1000 width=194) (actual time=1.27..1.27 rows=3
loops=1)
                                                                          ->
Merge Join  (cost=547.14..616.64 rows=1000 width=194) (actual time=1.12..1.19
rows=3 loops=1)

->  Index Scan using master_city_pkey on master_city mcy  (cost=0.00..52.00
rows=1000 width=47) (actual time=0.02..0.04 rows=4 loops=1)

->  Sort  (cost=547.14..547.14 rows=1000 width=147) (actual time=1.05..1.06
rows=3 loops=1)

->  Merge Join  (cost=427.82..497.32 rows=1000 width=147) (actual
time=0.92..1.00 rows=3 loops=1)

->  Index Scan using master_currency_pkey on master_currency mc
(cost=0.00..52.00 rows=1000 width=40) (actual time=0.03..0.06 rows=4 loops=1)

->  Sort  (cost=427.82..427.82 rows=1000 width=107) (actual time=0.81..0.81
rows=3 loops=1)

->  Merge Join  (cost=308.49..377.99 rows=1000 width=107) (actual
time=0.69..0.74 rows=3 loops=1)

->  Index Scan using member_financial_details_pkey on
member_financial_details mfd  (cost=0.00..52.00 rows=1000 width=11) (actual
time=0.03..0.08 rows=11 loops=1)

->  Sort  (cost=308.49..308.49 rows=1000 width=96) (actual time=0.56..0.57
rows=3 loops=1)

->  Merge Join  (cost=189.16..258.66 rows=1000 width=96) (actual
time=0.46..0.51 rows=3 loops=1)

->  Index Scan using master_payment_term_pkey on master_payment_term mpt
(cost=0.00..52.00 rows=1000 width=50) (actual time=0.02..0.04 rows=4 loops=1)

->  Sort  (cost=189.16..189.16 rows=1000 width=46) (actual time=0.35..0.35
rows=3 loops=1)

  
->  Merge Join  (cost=69.83..139.33 rows=1000 width=46) (actual
time=0.21..0.25 rows=3 loops=1)

        
->  Index Scan using member_commercial_details_pkey on
member_commercial_details mcd  (cost=0.00..52.00 rows=1000 width=8) (actual
time=0.02..0.06 rows=6 loops=1)

        
->  Sort  (cost=69.83..69.83 rows=1000 width=38) (actual time=0.11..0.11
rows=3 loops=1)

              
->  Seq Scan on _100001relationships rs  (cost=0.00..20.00 rows=1000
width=38) (actual time=0.03..0.05 rows=3 loops=1)
Total runtime: 5.45 msec

=====================================================

Best regards,


Siva Kumar





Re: Query optimization

От
Jochem van Dieten
Дата:
Quoting Siva Kumar <tech@leatherlink.net>:
>
>
> Giving below the output of EXPLAIN ANALYSE. I could not make much
> sense out of
> it, please help!
>
> NOTICE:  QUERY PLAN:
>
> Sort  (cost=1382.45..1382.45 rows=1000 width=442) (actual
> time=3.47..3.47
> rows=3 loops=1)
>   ->  Merge Join  (cost=1263.12..1332.62 rows=1000 width=442) (actual
>
> time=3.21..3.27 rows=3 loops=1)
>         ->  Index Scan using master_activity_pkey on master_activity
> ma
> (cost=0.00..52.00 rows=1000 width=50) (actual time=0.06..0.09 rows=4
> loops=1)

EXPLAIN returns the way that PostgreSQL will try to get data from the
different tables and indexes and combine that together to produce the
query result. To choose the best way, PostgreSQL is dependent on
statistical information on the content of the tables. Those statistics
are not collected automatically, you need to tell PostgreSQL to collect
these statistics.

Your EXPLAIN plan always returns an estimate of 1000 rows for each
operation. This is usually an indication that you didn't gather
statistics yet. You need to run ANALYZE first and then try again.
Also, take a look at the manual entry for VACUUM, which is another
maintenance operation that can be critical for performance.

If your database is not in production, I would recommend you do a
VACUUM FULL ANALYZE and then try the EXPLAIN again. If your database is
in production, do VACUUM ANALYZE and read up on the FULL part so you
know when to use that.
After that, see if performance got better and post the new EXPLAIN
output.

Jochem

Re: Query optimization

От
Antonis Antoniou
Дата:


Siva Kumar wrote:

>On Friday 04 Oct 2002 3:44 pm, you wrote:
>
>
>>>The decision to keep the fields in different tables was taken in view of
>>>the overall need of the system (there might be scope for improvement here
>>>too).
>>>
>>>
>>If that's the way the design makes sense, stick with it. It's better to get
>>Postgresql to handle a clean design rather than mangle a design.
>>
>>
>
>We had a relook at the design and managed to add one reduntant field in a
>table which made sense otherwise also. This shaved some time from the earlier
>query. now the query looks like this.
>======================================
> select relationship_id as "Id",company_name as
>"Company",product_category_desc as "Product",uom_desc as
>"UOM",shipment_term_desc as "Shipment Term",payment_term_desc as "Payment
>Term",city_name as "Port",currency_name as "Currency",activity_desc as
>"Activity",credit_days as "Credit Days"
>
>from _100001relationships rs,master_member mm,member_product_details
>mpd,master_product_category mpc,master_uom mu,master_shipment_term
>mst,master_payment_term mpt,member_financial_details mfd,master_currency
>mc,member_commercial_details mcd,master_activity ma,master_city mcy
>
> where mpd.product_details_id = rs.product_details_id and
>mpd.product_category_id=mpc.product_category_id and mpd.uom_id = mu.uom_id
>and rs.shipment_term_id = mst.shipment_term_id and rs.payment_term_id =
>mpt.payment_term_id and rs.commercial_details_id = mcd.commercial_details_id
>and mcd.port = mcy.city_id and rs.financial_details_id =
>mfd.financial_details_id and mfd.currency_id = mc.currency_id and
>mpd.activity_id = ma.activity_id and mm.member_id = rs.partner_id ORDER BY
>rs.relationship_id DESC;
>==================================================
>
>
>
>>First stage though, run an EXPLAIN and if you need help understanding it
>>post the output back to the list.
>>
>>
>
>Giving below the output of EXPLAIN ANALYSE. I could not make much sense out of
>it, please help!
>
>NOTICE:  QUERY PLAN:
>
>Sort  (cost=1382.45..1382.45 rows=1000 width=442) (actual time=3.47..3.47
>rows=3 loops=1)
>  ->  Merge Join  (cost=1263.12..1332.62 rows=1000 width=442) (actual
>time=3.21..3.27 rows=3 loops=1)
>        ->  Index Scan using master_activity_pkey on master_activity ma
>(cost=0.00..52.00 rows=1000 width=50) (actual time=0.06..0.09 rows=4 loops=1)
>        ->  Sort  (cost=1263.12..1263.12 rows=1000 width=392) (actual
>time=3.04..3.05 rows=3 loops=1)
>              ->  Merge Join  (cost=1143.79..1213.29 rows=1000 width=392)
>(actual time=2.87..2.95 rows=3 loops=1)
>                    ->  Index Scan using master_member_pkey on master_member
>mm  (cost=0.00..52.00 rows=1000 width=47) (actual time=0.03..0.23 rows=45
>loops=1)
>                    ->  Sort  (cost=1143.79..1143.79 rows=1000 width=345)
>(actual time=2.48..2.48 rows=3 loops=1)
>                          ->  Merge Join  (cost=1024.46..1093.96 rows=1000
>width=345) (actual time=2.30..2.41 rows=3 loops=1)
>                                ->  Index Scan using
>master_product_category_pkey on master_product_category mpc
>(cost=0.00..52.00 rows=1000 width=37) (actual time=0.02..0.08 rows=7 loops=1)
>                                ->  Sort  (cost=1024.46..1024.46 rows=1000
>width=308) (actual time=2.17..2.17 rows=3 loops=1)
>                                      ->  Merge Join  (cost=905.13..974.63
>rows=1000 width=308) (actual time=1.96..2.10 rows=3 loops=1)
>                                            ->  Index Scan using
>master_uom_pkey on master_uom mu  (cost=0.00..52.00 rows=1000 width=41)
>(actual time=0.03..0.06 rows=6 loops=1)
>                                            ->  Sort  (cost=905.13..905.13
>rows=1000 width=267) (actual time=1.87..1.87 rows=3 loops=1)
>                                                  ->  Merge Join
>(cost=785.80..855.30 rows=1000 width=267) (actual time=1.70..1.77 rows=3
>loops=1)
>                                                        ->  Index Scan using
>member_product_details_pkey on member_product_details mpd  (cost=0.00..52.00
>rows=1000 width=23) (actual time=0.03..0.11 rows=14 loops=1)
>                                                        ->  Sort
>(cost=785.80..785.80 rows=1000 width=244) (actual time=1.53..1.53 rows=3
>loops=1)
>                                                              ->  Merge Join
>(cost=666.47..735.97 rows=1000 width=244) (actual time=1.35..1.47 rows=3
>loops=1)
>                                                                    ->  Index
>Scan using master_shipment_term_pkey on master_shipment_term mst
>(cost=0.00..52.00 rows=1000 width=50) (actual time=0.02..0.06 rows=5 loops=1)
>                                                                    ->  Sort
>(cost=666.47..666.47 rows=1000 width=194) (actual time=1.27..1.27 rows=3
>loops=1)
>                                                                          ->
>Merge Join  (cost=547.14..616.64 rows=1000 width=194) (actual time=1.12..1.19
>rows=3 loops=1)
>
>->  Index Scan using master_city_pkey on master_city mcy  (cost=0.00..52.00
>rows=1000 width=47) (actual time=0.02..0.04 rows=4 loops=1)
>
>->  Sort  (cost=547.14..547.14 rows=1000 width=147) (actual time=1.05..1.06
>rows=3 loops=1)
>
>->  Merge Join  (cost=427.82..497.32 rows=1000 width=147) (actual
>time=0.92..1.00 rows=3 loops=1)
>
>->  Index Scan using master_currency_pkey on master_currency mc
>(cost=0.00..52.00 rows=1000 width=40) (actual time=0.03..0.06 rows=4 loops=1)
>
>->  Sort  (cost=427.82..427.82 rows=1000 width=107) (actual time=0.81..0.81
>rows=3 loops=1)
>
>->  Merge Join  (cost=308.49..377.99 rows=1000 width=107) (actual
>time=0.69..0.74 rows=3 loops=1)
>
>->  Index Scan using member_financial_details_pkey on
>member_financial_details mfd  (cost=0.00..52.00 rows=1000 width=11) (actual
>time=0.03..0.08 rows=11 loops=1)
>
>->  Sort  (cost=308.49..308.49 rows=1000 width=96) (actual time=0.56..0.57
>rows=3 loops=1)
>
>->  Merge Join  (cost=189.16..258.66 rows=1000 width=96) (actual
>time=0.46..0.51 rows=3 loops=1)
>
>->  Index Scan using master_payment_term_pkey on master_payment_term mpt
>(cost=0.00..52.00 rows=1000 width=50) (actual time=0.02..0.04 rows=4 loops=1)
>
>->  Sort  (cost=189.16..189.16 rows=1000 width=46) (actual time=0.35..0.35
>rows=3 loops=1)
>
   
>->  Merge Join  (cost=69.83..139.33 rows=1000 width=46) (actual
>time=0.21..0.25 rows=3 loops=1)
>
         
>->  Index Scan using member_commercial_details_pkey on
>member_commercial_details mcd  (cost=0.00..52.00 rows=1000 width=8) (actual
>time=0.02..0.06 rows=6 loops=1)
>
         
>->  Sort  (cost=69.83..69.83 rows=1000 width=38) (actual time=0.11..0.11
>rows=3 loops=1)
>
               
>->  Seq Scan on _100001relationships rs  (cost=0.00..20.00 rows=1000
>width=38) (actual time=0.03..0.05 rows=3 loops=1)
>Total runtime: 5.45 msec
>
>=====================================================
>
>Best regards,
>
>
>Siva Kumar
>
>
>
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>
>
>
http://www.postgresql.org/idocs/index.php?performance-tips.html
First, read this documentation.

Thanks
Antonis


Re: Query optimization

От
Richard Huxton
Дата:
On Friday 04 Oct 2002 1:26 pm, Siva Kumar wrote:
> Giving below the output of EXPLAIN ANALYSE. I could not make much sense out
> of it, please help!

Scary aren't they ;-)

The important thing is you've got plenty of "Index Scan"s rather than Seq
Scans. The other point is the final time:

> Total runtime: 5.45 msec

Now, since that's not what you're getting, I'd think Neil (see other reply) is
right and you need to look at issuing "SET geqo = off;" before the query.

Postgresql has a genetic algorithm that kicks in on what it thinks is a very
complex query, this can take a long time to analyse the options available but
pays dividends on a big query. In your case you're only getting a few rows
and so it takes longer to analyse than to get the results.

The other thing that might work is rewriting the query with explicit JOINs - I
think that should make it clear to Postgresql what order to do things in.

- Richard Huxton

Re: Query optimization

От
Siva Kumar
Дата:
Thanks to Richard, Neil, Jochem and Antonis.

I went through the documentation on explain/analyse. Finally, decided to split
up the query into two and to reduce the number of tables to get faster query.
In short, I have not solved the problem as presented (too lazy of me :-( ,
but opted to walk an easier path.

Thanks once again.

Best regards,

Siva Kumar


On Friday 04 Oct 2002 6:59 pm, Richard Huxton wrote:
> On Friday 04 Oct 2002 1:26 pm, Siva Kumar wrote:
> > Giving below the output of EXPLAIN ANALYSE. I could not make much sense
> > out of it, please help!
>
> Scary aren't they ;-)
>
> The important thing is you've got plenty of "Index Scan"s rather than Seq
>
> Scans. The other point is the final time:
> > Total runtime: 5.45 msec
>
> Now, since that's not what you're getting, I'd think Neil (see other reply)
> is right and you need to look at issuing "SET geqo = off;" before the
> query.
>
> Postgresql has a genetic algorithm that kicks in on what it thinks is a
> very complex query, this can take a long time to analyse the options
> available but pays dividends on a big query. In your case you're only
> getting a few rows and so it takes longer to analyse than to get the
> results.
>
> The other thing that might work is rewriting the query with explicit JOINs
> - I think that should make it clear to Postgresql what order to do things
> in.
>
> - Richard Huxton