Обсуждение: Identical query slower on 8.4 vs 8.3

От:
Patrick Donlin
Дата:

I have two servers with equal specs, one of them running 8.3.7 and the new server running 8.4.4. The only tweak I have made from the default install (from Ubuntu repositories) is increasing shared_buffers to 768MB. Both servers are running 64-bit, but are different releases of Ubuntu.

This is the query I am running:

SELECT DISTINCT test.tid, testresult.trscore, testresult.trpossiblescore, testresult.trstart,
testresult.trfinish, testresult.trscorebreakdown, testresult.fk_sid, testresult.fk_tid, test.tname,
qr.qrscore, qr.qrtotalscore, testresult.trid, qr.qrid
FROM testresult, test, questionresult qr
WHERE test.tid = testresult.fk_tid AND qr.fk_trid = testresult.trid
ORDER BY test.tid;

Results when running on the v8.3.7 server....
Total query runtime: 32185 ms.
700536 rows retrieved.

Results when running on the v8.4.4 server....
Total query runtime: 164227 ms.
700536 rows retrieved.

Results when running on a different v8.4.4 server with slightly faster hardware and shared_buffers at 1024MB....
(this one has a few more rows of data due to this being the server that is currently live, so it has more recent data)
Total query runtime: 157931 ms.
700556 rows retrieved.


Anyone have any ideas on where I should start looking to figure this out? I didn't perform any special steps when moving to v8.4, I just did a pg_dump from the 8.3 server and restored it on the new 8.4 servers. Maybe that is where I made a mistake.

Thanks!
Patrick
От:
Thom Brown
Дата:

On 15 July 2010 15:41, Patrick Donlin <> wrote:
> I have two servers with equal specs, one of them running 8.3.7 and the new
> server running 8.4.4. The only tweak I have made from the default install
> (from Ubuntu repositories) is increasing shared_buffers to 768MB. Both
> servers are running 64-bit, but are different releases of Ubuntu.
>
> This is the query I am running:
>
> SELECT DISTINCT test.tid, testresult.trscore, testresult.trpossiblescore,
> testresult.trstart,
> testresult.trfinish, testresult.trscorebreakdown, testresult.fk_sid,
> testresult.fk_tid, test.tname,
> qr.qrscore, qr.qrtotalscore, testresult.trid, qr.qrid
> FROM testresult, test, questionresult qr
> WHERE test.tid = testresult.fk_tid AND qr.fk_trid = testresult.trid
> ORDER BY test.tid;
>
> Results when running on the v8.3.7 server....
> Total query runtime: 32185 ms.
> 700536 rows retrieved.
>
> Results when running on the v8.4.4 server....
> Total query runtime: 164227 ms.
> 700536 rows retrieved.
>
> Results when running on a different v8.4.4 server with slightly faster
> hardware and shared_buffers at 1024MB....
> (this one has a few more rows of data due to this being the server that is
> currently live, so it has more recent data)
> Total query runtime: 157931 ms.
> 700556 rows retrieved.
>
>
> Anyone have any ideas on where I should start looking to figure this out? I
> didn't perform any special steps when moving to v8.4, I just did a pg_dump
> from the 8.3 server and restored it on the new 8.4 servers. Maybe that is
> where I made a mistake.
>
> Thanks!
> Patrick
>

First thing to check is did you do a VACUUM ANALYZE on the database?

Thom

От:
"Kevin Grittner"
Дата:

Patrick Donlin <> wrote:

> Anyone have any ideas on where I should start looking to figure
> this out?

You're going to want to run EXPLAIN ANALYZE for the slow query on
both servers.  If you want the rest of us to be able to contribute
ideas, we'll need a little more information -- please read this
page:

http://wiki.postgresql.org/wiki/SlowQueryQuestions

> I didn't perform any special steps when moving to v8.4, I just did
> a pg_dump from the 8.3 server and restored it on the new 8.4
> servers.

A database VACUUM ANALYZE by a superuser is a good idea; otherwise
that's fine technique.

-Kevin

От:
Patrick Donlin
Дата:

I'll read over that wiki entry, but for now here is the EXPLAIN ANALYZE output assuming I did it correctly. I have run vacuumdb --full --analyze,  it actually runs as a nightly cron job.

8.4.4 Sever:
"Unique  (cost=202950.82..227521.59 rows=702022 width=86) (actual time=21273.371..22429.511 rows=700536 loops=1)"
"  ->  Sort  (cost=202950.82..204705.87 rows=702022 width=86) (actual time=21273.368..22015.948 rows=700536 loops=1)"
"        Sort Key: test.tid, testresult.trscore, testresult.trpossiblescore, testresult.trstart, testresult.trfinish, testresult.trscorebreakdown, testresult.fk_sid, test.tname, qr.qrscore, qr.qrtotalscore, testresult.trid, qr.qrid"
"        Sort Method:  external merge  Disk: 71768kB"
"        ->  Hash Join  (cost=2300.82..34001.42 rows=702022 width=86) (actual time=64.388..1177.468 rows=700536 loops=1)"
"              Hash Cond: (qr.fk_trid = testresult.trid)"
"              ->  Seq Scan on questionresult qr  (cost=0.00..12182.22 rows=702022 width=16) (actual time=0.090..275.518 rows=702022 loops=1)"
"              ->  Hash  (cost=1552.97..1552.97 rows=29668 width=74) (actual time=63.042..63.042 rows=29515 loops=1)"
"                    ->  Hash Join  (cost=3.35..1552.97 rows=29668 width=74) (actual time=0.227..39.111 rows=29515 loops=1)"
"                          Hash Cond: (testresult.fk_tid = test.tid)"
"                          ->  Seq Scan on testresult  (cost=0.00..1141.68 rows=29668 width=53) (actual time=0.019..15.622 rows=29668 loops=1)"
"                          ->  Hash  (cost=2.60..2.60 rows=60 width=21) (actual time=0.088..0.088 rows=60 loops=1)"
"                                ->  Seq Scan on test  (cost=0.00..2.60 rows=60 width=21) (actual time=0.015..0.044 rows=60 loops=1)"
"Total runtime: 22528.820 ms"

8.3.7 Server:
"Unique  (cost=202950.82..227521.59 rows=702022 width=86) (actual time=22157.714..23343.461 rows=700536 loops=1)"
"  ->  Sort  (cost=202950.82..204705.87 rows=702022 width=86) (actual time=22157.706..22942.018 rows=700536 loops=1)"
"        Sort Key: test.tid, testresult.trscore, testresult.trpossiblescore, testresult.trstart, testresult.trfinish, testresult.trscorebreakdown, testresult.fk_sid, test.tname, qr.qrscore, qr.qrtotalscore, testresult.trid, qr.qrid"
"        Sort Method:  external merge  Disk: 75864kB"
"        ->  Hash Join  (cost=2300.82..34001.42 rows=702022 width=86) (actual time=72.842..1276.634 rows=700536 loops=1)"
"              Hash Cond: (qr.fk_trid = testresult.trid)"
"              ->  Seq Scan on questionresult qr  (cost=0.00..12182.22 rows=702022 width=16) (actual time=0.112..229.987 rows=702022 loops=1)"
"              ->  Hash  (cost=1552.97..1552.97 rows=29668 width=74) (actual time=71.421..71.421 rows=29515 loops=1)"
"                    ->  Hash Join  (cost=3.35..1552.97 rows=29668 width=74) (actual time=0.398..44.524 rows=29515 loops=1)"
"                          Hash Cond: (testresult.fk_tid = test.tid)"
"                          ->  Seq Scan on testresult  (cost=0.00..1141.68 rows=29668 width=53) (actual time=0.117..20.890 rows=29668 loops=1)"
"                          ->  Hash  (cost=2.60..2.60 rows=60 width=21) (actual time=0.112..0.112 rows=60 loops=1)"
"                                ->  Seq Scan on test  (cost=0.00..2.60 rows=60 width=21) (actual time=0.035..0.069 rows=60 loops=1)"
"Total runtime: 23462.639 ms"


Thanks for the quick responses and being patient with me not providing enough information.
-Patrick

----- Original Message -----
From: "Kevin Grittner" <>
To: "Patrick Donlin" <>,
Sent: Thursday, July 15, 2010 10:55:19 AM GMT -05:00 US/Canada Eastern
Subject: Re: [PERFORM] Identical query slower on 8.4 vs 8.3

Patrick Donlin <> wrote:
 
> Anyone have any ideas on where I should start looking to figure
> this out?
 
You're going to want to run EXPLAIN ANALYZE for the slow query on
both servers.  If you want the rest of us to be able to contribute
ideas, we'll need a little more information -- please read this
page:
 
http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
> I didn't perform any special steps when moving to v8.4, I just did
> a pg_dump from the 8.3 server and restored it on the new 8.4
> servers.
 
A database VACUUM ANALYZE by a superuser is a good idea; otherwise
that's fine technique.
 
-Kevin
От:
Alvaro Herrera
Дата:

Excerpts from Patrick Donlin's message of jue jul 15 11:12:53 -0400 2010:
> I'll read over that wiki entry, but for now here is the EXPLAIN ANALYZE output assuming I did it correctly. I have
runvacuumdb --full --analyze, it actually runs as a nightly cron job.  

These plans seem identical (though the fact that the leading whitespace
was trimmed means it's untrustworthy -- please in the future send them
as text attachments instead so that your mailer doesn't interfere with
formatting).  The 8.4 plan is even a full second faster, according to
the "total runtime" line.

The slowness could've been caused by caching effects ...

От:
Andy Colson
Дата:

FULL is usually bad.  Stick to "vacuum analyze" and drop the full.

Do you have indexes on:

test.tid, testresult.fk_tid, questionresult.fk_trid and testresult.trid


-Andy



On 7/15/2010 10:12 AM, Patrick Donlin wrote:
> I'll read over that wiki entry, but for now here is the EXPLAIN ANALYZE
> output assuming I did it correctly. I have run vacuumdb --full
> --analyze, it actually runs as a nightly cron job.
>
> 8.4.4 Sever:
> "Unique (cost=202950.82..227521.59 rows=702022 width=86) (actual
> time=21273.371..22429.511 rows=700536 loops=1)"
> " -> Sort (cost=202950.82..204705.87 rows=702022 width=86) (actual
> time=21273.368..22015.948 rows=700536 loops=1)"
> " Sort Key: test.tid, testresult.trscore, testresult.trpossiblescore,
> testresult.trstart, testresult.trfinish, testresult.trscorebreakdown,
> testresult.fk_sid, test.tname, qr.qrscore, qr.qrtotalscore,
> testresult.trid, qr.qrid"
> " Sort Method: external merge Disk: 71768kB"
> " -> Hash Join (cost=2300.82..34001.42 rows=702022 width=86) (actual
> time=64.388..1177.468 rows=700536 loops=1)"
> " Hash Cond: (qr.fk_trid = testresult.trid)"
> " -> Seq Scan on questionresult qr (cost=0.00..12182.22 rows=702022
> width=16) (actual time=0.090..275.518 rows=702022 loops=1)"
> " -> Hash (cost=1552.97..1552.97 rows=29668 width=74) (actual
> time=63.042..63.042 rows=29515 loops=1)"
> " -> Hash Join (cost=3.35..1552.97 rows=29668 width=74) (actual
> time=0.227..39.111 rows=29515 loops=1)"
> " Hash Cond: (testresult.fk_tid = test.tid)"
> " -> Seq Scan on testresult (cost=0.00..1141.68 rows=29668 width=53)
> (actual time=0.019..15.622 rows=29668 loops=1)"
> " -> Hash (cost=2.60..2.60 rows=60 width=21) (actual time=0.088..0.088
> rows=60 loops=1)"
> " -> Seq Scan on test (cost=0.00..2.60 rows=60 width=21) (actual
> time=0.015..0.044 rows=60 loops=1)"
> "Total runtime: 22528.820 ms"
>
> 8.3.7 Server:
> "Unique (cost=202950.82..227521.59 rows=702022 width=86) (actual
> time=22157.714..23343.461 rows=700536 loops=1)"
> " -> Sort (cost=202950.82..204705.87 rows=702022 width=86) (actual
> time=22157.706..22942.018 rows=700536 loops=1)"
> " Sort Key: test.tid, testresult.trscore, testresult.trpossiblescore,
> testresult.trstart, testresult.trfinish, testresult.trscorebreakdown,
> testresult.fk_sid, test.tname, qr.qrscore, qr.qrtotalscore,
> testresult.trid, qr.qrid"
> " Sort Method: external merge Disk: 75864kB"
> " -> Hash Join (cost=2300.82..34001.42 rows=702022 width=86) (actual
> time=72.842..1276.634 rows=700536 loops=1)"
> " Hash Cond: (qr.fk_trid = testresult.trid)"
> " -> Seq Scan on questionresult qr (cost=0.00..12182.22 rows=702022
> width=16) (actual time=0.112..229.987 rows=702022 loops=1)"
> " -> Hash (cost=1552.97..1552.97 rows=29668 width=74) (actual
> time=71.421..71.421 rows=29515 loops=1)"
> " -> Hash Join (cost=3.35..1552.97 rows=29668 width=74) (actual
> time=0.398..44.524 rows=29515 loops=1)"
> " Hash Cond: (testresult.fk_tid = test.tid)"
> " -> Seq Scan on testresult (cost=0.00..1141.68 rows=29668 width=53)
> (actual time=0.117..20.890 rows=29668 loops=1)"
> " -> Hash (cost=2.60..2.60 rows=60 width=21) (actual time=0.112..0.112
> rows=60 loops=1)"
> " -> Seq Scan on test (cost=0.00..2.60 rows=60 width=21) (actual
> time=0.035..0.069 rows=60 loops=1)"
> "Total runtime: 23462.639 ms"
>
>
> Thanks for the quick responses and being patient with me not providing
> enough information.
> -Patrick
>
> ----- Original Message -----
> From: "Kevin Grittner" <>
> To: "Patrick Donlin" <>, 
> Sent: Thursday, July 15, 2010 10:55:19 AM GMT -05:00 US/Canada Eastern
> Subject: Re: [PERFORM] Identical query slower on 8.4 vs 8.3
>
> Patrick Donlin <> wrote:
>
>  > Anyone have any ideas on where I should start looking to figure
>  > this out?
>
> You're going to want to run EXPLAIN ANALYZE for the slow query on
> both servers. If you want the rest of us to be able to contribute
> ideas, we'll need a little more information -- please read this
> page:
>
> http://wiki.postgresql.org/wiki/SlowQueryQuestions
>
>  > I didn't perform any special steps when moving to v8.4, I just did
>  > a pg_dump from the 8.3 server and restored it on the new 8.4
>  > servers.
>
> A database VACUUM ANALYZE by a superuser is a good idea; otherwise
> that's fine technique.
>
> -Kevin


От:
Jon Nelson
Дата:

On Thu, Jul 15, 2010 at 9:41 AM, Patrick Donlin <> wrote:
> I have two servers with equal specs, one of them running 8.3.7 and the new
> server running 8.4.4. The only tweak I have made from the default install
> (from Ubuntu repositories) is increasing shared_buffers to 768MB. Both
> servers are running 64-bit, but are different releases of Ubuntu.

^^^ Right there.  *different releases*. I've seen fairly significant
differences in identical hardware with even minor O/S point releases.

After you run a full vacuum and then reindex and then vacuum analyze
(probably not entirely necessary) if there is still a difference I'd
point at the O/S.




--
Jon

От:
"Kevin Grittner"
Дата:

Patrick Donlin <> wrote:

> I have run vacuumdb --full --analyze, it
> actually runs as a nightly cron job.

That's usually not wise -- VACUUM FULL can cause index bloat, and is
not normally necessary.  If you have autovacuum turned on and run a
database vacuum each night, you can probably avoid ever running
VACUUM FULL.  A long-running transaction or mass deletes might still
make aggressive cleanup necessary on occasion, but you should
consider using CLUSTER instead of VACUUM FULL.  So, you should
probably change your crontab job to vacuum --all --analyze.

Also, after a bulk load of a database like this you might consider a
one-time VACUUM FREEZE of the database.  Without that there will
come a time when autovacuum will need to rewrite all rows from the
bulk load which haven't subsequently been modified, in order to
prevent transaction ID wraparound problems.

-Kevin

От:
Merlin Moncure
Дата:

On Thu, Jul 15, 2010 at 11:12 AM, Patrick Donlin <> wrote:
> I'll read over that wiki entry, but for now here is the EXPLAIN ANALYZE
> output assuming I did it correctly. I have run vacuumdb --full --analyze,
> it actually runs as a nightly cron job.
>
> 8.4.4 Sever:
> "Unique  (cost=202950.82..227521.59 rows=702022 width=86) (actual
> time=21273.371..22429.511 rows=700536 loops=1)"
> "  ->  Sort  (cost=202950.82..204705.87 rows=702022 width=86) (actual
> time=21273.368..22015.948 rows=700536 loops=1)"
> "        Sort Key: test.tid, testresult.trscore, testresult.trpossiblescore,
> testresult.trstart, testresult.trfinish, testresult.trscorebreakdown,
> testresult.fk_sid, test.tname, qr.qrscore, qr.qrtotalscore, testresult.trid,
> qr.qrid"
> "        Sort Method:  external merge  Disk: 71768kB"
> "        ->  Hash Join  (cost=2300.82..34001.42 rows=702022 width=86)
> (actual time=64.388..1177.468 rows=700536 loops=1)"
> "              Hash Cond: (qr.fk_trid = testresult.trid)"
> "              ->  Seq Scan on questionresult qr  (cost=0.00..12182.22
> rows=702022 width=16) (actual time=0.090..275.518 rows=702022 loops=1)"
> "              ->  Hash  (cost=1552.97..1552.97 rows=29668 width=74) (actual
> time=63.042..63.042 rows=29515 loops=1)"
> "                    ->  Hash Join  (cost=3.35..1552.97 rows=29668 width=74)
> (actual time=0.227..39.111 rows=29515 loops=1)"
> "                          Hash Cond: (testresult.fk_tid = test.tid)"
> "                          ->  Seq Scan on testresult  (cost=0.00..1141.68
> rows=29668 width=53) (actual time=0.019..15.622 rows=29668 loops=1)"
> "                          ->  Hash  (cost=2.60..2.60 rows=60 width=21)
> (actual time=0.088..0.088 rows=60 loops=1)"
> "                                ->  Seq Scan on test  (cost=0.00..2.60
> rows=60 width=21) (actual time=0.015..0.044 rows=60 loops=1)"
> "Total runtime: 22528.820 ms"
>
> 8.3.7 Server:
> "Unique  (cost=202950.82..227521.59 rows=702022 width=86) (actual
> time=22157.714..23343.461 rows=700536 loops=1)"
> "  ->  Sort  (cost=202950.82..204705.87 rows=702022 width=86) (actual
> time=22157.706..22942.018 rows=700536 loops=1)"
> "        Sort Key: test.tid, testresult.trscore, testresult.trpossiblescore,
> testresult.trstart, testresult.trfinish, testresult.trscorebreakdown,
> testresult.fk_sid, test.tname, qr.qrscore, qr.qrtotalscore, testresult.trid,
> qr.qrid"
> "        Sort Method:  external merge  Disk: 75864kB"
> "        ->  Hash Join  (cost=2300.82..34001.42 rows=702022 width=86)
> (actual time=72.842..1276.634 rows=700536 loops=1)"
> "              Hash Cond: (qr.fk_trid = testresult.trid)"
> "              ->  Seq Scan on questionresult qr  (cost=0.00..12182.22
> rows=702022 width=16) (actual time=0.112..229.987 rows=702022 loops=1)"
> "              ->  Hash  (cost=1552.97..1552.97 rows=29668 width=74) (actual
> time=71.421..71.421 rows=29515 loops=1)"
> "                    ->  Hash Join  (cost=3.35..1552.97 rows=29668 width=74)
> (actual time=0.398..44.524 rows=29515 loops=1)"
> "                          Hash Cond: (testresult.fk_tid = test.tid)"
> "                          ->  Seq Scan on testresult  (cost=0.00..1141.68
> rows=29668 width=53) (actual time=0.117..20.890 rows=29668 loops=1)"
> "                          ->  Hash  (cost=2.60..2.60 rows=60 width=21)
> (actual time=0.112..0.112 rows=60 loops=1)"
> "                                ->  Seq Scan on test  (cost=0.00..2.60
> rows=60 width=21) (actual time=0.035..0.069 rows=60 loops=1)"
> "Total runtime: 23462.639 ms"

your plans are identical as is the runtime basically.  this means you
might want to consider the following possibilities:
*) operator error :-)
*) cache effects
*) environmental factors on the server at the time
*) network/client issues

I say network issues because if your explain analyze (which actually
does run the entire query) is significantly faster than the full
query, then we have to consider that the formatting and transfer of
the data back to the client (even if it's on the same box) becomes
suspicious.  If you've eliminated other possibilities, try running
other big, trivially planned, mucho result returning queries (like
select * from table) on both servers and comparing times.

merlin

От:
"Joshua D. Drake"
Дата:

On Thu, 2010-07-15 at 10:41 -0400, Patrick Donlin wrote:

> Results when running on the v8.3.7 server....
> Total query runtime: 32185 ms.
> 700536 rows retrieved.
>
> Results when running on the v8.4.4 server....
> Total query runtime: 164227 ms.
> 700536 rows retrieved.
>

>
> Anyone have any ideas on where I should start looking to figure this
> out? I didn't perform any special steps when moving to v8.4, I just
> did a pg_dump from the 8.3 server and restored it on the new 8.4
> servers. Maybe that is where I made a mistake.

Three immediate things come to mind:

1. One had relations in file or shared buffer cache, the other didn't
2. One is running ext4 versus ext3 and when you end up spilling to disk
when you over run work_mem, the ext4 machine is faster, but without
knowing which machine is which it is a bit tough to diagnose.
3. You didn't run ANALYZE on one of the machines

Sincerely,

Joshua D. Drake

>
> Thanks!
> Patrick
>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
От:
Patrick Donlin
Дата:

Thanks everyone for the input so far, Merlin's comment about the network gave me one of those duh moments since I have been running these queries remotely using pgadmin. I will experiment with this more tomorrow/Monday along with the other suggestions that have been posted to hopefully narrow it down. Running the query from my webserver yielded much better times, but from a quick look it seems my 8.4 server is still a bit slower. I will share more details as I dig into it more tomorrow or Monday.

-Patrick

----- Original Message -----
From: "Merlin Moncure" <>
To: "Patrick Donlin" <>
Cc: "Kevin Grittner" <>,
Sent: Thursday, July 15, 2010 12:04:13 PM GMT -05:00 US/Canada Eastern
Subject: Re: [PERFORM] Identical query slower on 8.4 vs 8.3

your plans are identical as is the runtime basically.  this means you
might want to consider the following possibilities:
*) operator error :-)
*) cache effects
*) environmental factors on the server at the time
*) network/client issues

I say network issues because if your explain analyze (which actually
does run the entire query) is significantly faster than the full
query, then we have to consider that the formatting and transfer of
the data back to the client (even if it's on the same box) becomes
suspicious.  If you've eliminated other possibilities, try running
other big, trivially planned, mucho result returning queries (like
select * from table) on both servers and comparing times.

merlin
От:
"Joshua D. Drake"
Дата:

On Thu, 2010-07-15 at 10:41 -0400, Patrick Donlin wrote:

> Results when running on the v8.3.7 server....
> Total query runtime: 32185 ms.
> 700536 rows retrieved.
>
> Results when running on the v8.4.4 server....
> Total query runtime: 164227 ms.
> 700536 rows retrieved.
>

>
> Anyone have any ideas on where I should start looking to figure this
> out? I didn't perform any special steps when moving to v8.4, I just
> did a pg_dump from the 8.3 server and restored it on the new 8.4
> servers. Maybe that is where I made a mistake.

Three immediate things come to mind:

1. One had relations in file or shared buffer cache, the other didn't
2. One is running ext4 versus ext3 and when you end up spilling to disk
when you over run work_mem, the ext4 machine is faster, but without
knowing which machine is which it is a bit tough to diagnose.
3. You didn't run ANALYZE on one of the machines

Sincerely,

Joshua D. Drake

>
> Thanks!
> Patrick
>

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering


От:
"Igor Neyman"
Дата:


> -----Original Message-----
> From: Patrick Donlin [mailto:]
> Sent: Thursday, July 15, 2010 11:13 AM
> To: Kevin Grittner; 
> Subject: Re: Identical query slower on 8.4 vs 8.3
>
> I'll read over that wiki entry, but for now here is the
> EXPLAIN ANALYZE output assuming I did it correctly. I have
> run vacuumdb --full --analyze,  it actually runs as a nightly
> cron job.
>
> 8.4.4 Sever:
> "Unique  (cost=202950.82..227521.59 rows=702022 width=86)
> (actual time=21273.371..22429.511 rows=700536 loops=1)"
> "  ->  Sort  (cost=202950.82..204705.87 rows=702022 width=86)
> (actual time=21273.368..22015.948 rows=700536 loops=1)"
> "        Sort Key: test.tid, testresult.trscore,
> testresult.trpossiblescore, testresult.trstart,
> testresult.trfinish, testresult.trscorebreakdown,
> testresult.fk_sid, test.tname, qr.qrscore, qr.qrtotalscore,
> testresult.trid, qr.qrid"
> "        Sort Method:  external merge  Disk: 71768kB"
> "        ->  Hash Join  (cost=2300.82..34001.42 rows=702022
> width=86) (actual time=64.388..1177.468 rows=700536 loops=1)"
> "              Hash Cond: (qr.fk_trid = testresult.trid)"
> "              ->  Seq Scan on questionresult qr
> (cost=0.00..12182.22 rows=702022 width=16) (actual
> time=0.090..275.518 rows=702022 loops=1)"
> "              ->  Hash  (cost=1552.97..1552.97 rows=29668
> width=74) (actual time=63.042..63.042 rows=29515 loops=1)"
> "                    ->  Hash Join  (cost=3.35..1552.97
> rows=29668 width=74) (actual time=0.227..39.111 rows=29515 loops=1)"
> "                          Hash Cond: (testresult.fk_tid = test.tid)"
> "                          ->  Seq Scan on testresult
> (cost=0.00..1141.68 rows=29668 width=53) (actual
> time=0.019..15.622 rows=29668 loops=1)"
> "                          ->  Hash  (cost=2.60..2.60 rows=60
> width=21) (actual time=0.088..0.088 rows=60 loops=1)"
> "                                ->  Seq Scan on test
> (cost=0.00..2.60 rows=60 width=21) (actual time=0.015..0.044
> rows=60 loops=1)"
> "Total runtime: 22528.820 ms"
>
> 8.3.7 Server:
> "Unique  (cost=202950.82..227521.59 rows=702022 width=86)
> (actual time=22157.714..23343.461 rows=700536 loops=1)"
> "  ->  Sort  (cost=202950.82..204705.87 rows=702022 width=86)
> (actual time=22157.706..22942.018 rows=700536 loops=1)"
> "        Sort Key: test.tid, testresult.trscore,
> testresult.trpossiblescore, testresult.trstart,
> testresult.trfinish, testresult.trscorebreakdown,
> testresult.fk_sid, test.tname, qr.qrscore, qr.qrtotalscore,
> testresult.trid, qr.qrid"
> "        Sort Method:  external merge  Disk: 75864kB"
> "        ->  Hash Join  (cost=2300.82..34001.42 rows=702022
> width=86) (actual time=72.842..1276.634 rows=700536 loops=1)"
> "              Hash Cond: (qr.fk_trid = testresult.trid)"
> "              ->  Seq Scan on questionresult qr
> (cost=0.00..12182.22 rows=702022 width=16) (actual
> time=0.112..229.987 rows=702022 loops=1)"
> "              ->  Hash  (cost=1552.97..1552.97 rows=29668
> width=74) (actual time=71.421..71.421 rows=29515 loops=1)"
> "                    ->  Hash Join  (cost=3.35..1552.97
> rows=29668 width=74) (actual time=0.398..44.524 rows=29515 loops=1)"
> "                          Hash Cond: (testresult.fk_tid = test.tid)"
> "                          ->  Seq Scan on testresult
> (cost=0.00..1141.68 rows=29668 width=53) (actual
> time=0.117..20.890 rows=29668 loops=1)"
> "                          ->  Hash  (cost=2.60..2.60 rows=60
> width=21) (actual time=0.112..0.112 rows=60 loops=1)"
> "                                ->  Seq Scan on test
> (cost=0.00..2.60 rows=60 width=21) (actual time=0.035..0.069
> rows=60 loops=1)"
> "Total runtime: 23462.639 ms"
>
>
> Thanks for the quick responses and being patient with me not
> providing enough information.
> -Patrick
>

Well, now that you've got similar runtime on both 8.4.4 and 8.3.7, here
is a suggestion to improve performance of this query based on EXPLAIN
ANALYZE you proveded (should have done it in your first e-mail).

EXPLAIN ANALYZE shows that most of the time (22015 ms on 8.4.4) spent on
sorting you result set.
And according to this: "Sort Method:  external merge  Disk: 71768kB" -
sorting is done using disk, meaning your work_mem setting is not
sufficient to do this sort in memory (I didn't go back through this
thread far enough, to see if you provided info on how it is set).

I'd suggest to increase the value up to ~80MB, if not for the system,
may be just for the session running this query.
Then see if performance improved.

And, with query performance issues always start with EXPLAIN ANALYZE.

Regards,
Igor Neyman

От:
tv@fuzzy.cz
Дата:

> I'd suggest to increase the value up to ~80MB, if not for the system,
> may be just for the session running this query.
> Then see if performance improved.

Don't forget you can do this for the given query without affecting the
other queries - just do something like

SET work_mem = 128M

and then run the query - it should work fine. This is great for testing
and to set environment for special users (batch processes etc.).

regards
Tomas