Discussion: BUG #5294: Sorts on more than just the order-by clause
The following bug has been logged online:
Bug reference: 5294
Logged by: Allen Johnson
Email address: akjohnson78@gmail.com
PostgreSQL version: 8.4.2
Operating system: Red Hat ES 5.4
Description: Sorts on more than just the order-by clause
Details:
I've been porting our app from Oracle to Postgres and keeping an eye on
performance. Largely, Postgres is performing just as well as Oracle, or better!
I did run into an issue where we are performing a group-by on about 10
columns followed by an order-by of about 5 columns. This query was taking
twice as long as Oracle. When looking at the explain plan, Postgres seems to
be using all the columns in the group-by for sorting instead of _only_ using
what is in the order-by.
While the results are correct, this seems to be causing a performance
problem, since Postgres is sorting on more columns than it is being asked
to. I reworked the query to get rid of the extra sorting columns, but this
seems like a hack. Below is an example query followed by the ugly hack.
Note: The execution times in this example don't mean anything because they
are running on a blank test db. On the production database there was a huge
difference in execution time in favor of the hack query. I just wanted to
illustrate that the sort keys seem incorrect.
Example Query:
select
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code,
  count(a.id)
from
  contacts c
  inner join contact_types ct on (ct.code = c.contact_type_code)
  left join attachments a on (a.contact_id = c.id)
where
  c.company_id = 1
group by
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code
order by
  ct.name, c.lname, c.fname, c.mname;
Example Explain:
                                            QUERY PLAN
------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=27.98..28.11 rows=3 width=1864) (actual time=0.037..0.037 rows=0 loops=1)
   ->  Sort  (cost=27.98..27.99 rows=3 width=1864) (actual time=0.035..0.035 rows=0 loops=1)
         Sort Key: ct.name, c.lname, c.fname, c.mname, c.email, c.address1, c.address2, c.city, c.state, c.zip_code
         Sort Method:  quicksort  Memory: 17kB
         ->  Nested Loop Left Join  (cost=4.27..27.96 rows=3 width=1864) (actual time=0.017..0.017 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=1864) (actual time=0.014..0.014 rows=0 loops=1)
                     ->  Index Scan using contacts_company on contacts c  (cost=0.00..8.27 rows=1 width=1760) (actual time=0.012..0.012 rows=0 loops=1)
                           Index Cond: (company_id = 1)
                     ->  Index Scan using contact_types_pkey on contact_types ct  (cost=0.00..8.27 rows=1 width=152) (never executed)
                           Index Cond: ((ct.code)::text = (c.contact_type_code)::text)
               ->  Bitmap Heap Scan on attachments a  (cost=4.27..11.37 rows=3 width=12) (never executed)
                     Recheck Cond: (a.contact_id = c.id)
                     ->  Bitmap Index Scan on attachments_contact  (cost=0.00..4.27 rows=3 width=0) (never executed)
                           Index Cond: (a.contact_id = c.id)
 Total runtime: 0.192 ms
(15 rows)
* Notice how the sort key is using many more columns than the order-by has
specified.
Hack Query:
select * from (
  select
    ct.name as ct_name, c.lname, c.fname, c.mname,
    c.email, c.address1, c.address2,
    c.city, c.state, c.zip_code,
    count(a.id)
  from
    contacts c
    inner join contact_types ct on (ct.code = c.contact_type_code)
    left join attachments a on (a.contact_id = c.id)
  where
    c.company_id = 1
  group by
    ct.name, c.lname, c.fname, c.mname,
    c.email, c.address1, c.address2,
    c.city, c.state, c.zip_code
) as results
order by
  ct_name, lname, fname, mname;
Hack Explain:
                                            QUERY PLAN
------------------------------------------------------------------------------------------------------
 Sort  (cost=28.13..28.14 rows=3 width=1868) (actual time=0.054..0.054 rows=0 loops=1)
   Sort Key: ct.name, c.lname, c.fname, c.mname
   Sort Method:  quicksort  Memory: 17kB
   ->  HashAggregate  (cost=28.04..28.08 rows=3 width=1864) (actual time=0.021..0.021 rows=0 loops=1)
         ->  Nested Loop Left Join  (cost=4.27..27.96 rows=3 width=1864) (actual time=0.018..0.018 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=1864) (actual time=0.016..0.016 rows=0 loops=1)
                     ->  Index Scan using contacts_company on contacts c  (cost=0.00..8.27 rows=1 width=1760) (actual time=0.013..0.013 rows=0 loops=1)
                           Index Cond: (company_id = 1)
                     ->  Index Scan using contact_types_pkey on contact_types ct  (cost=0.00..8.27 rows=1 width=152) (never executed)
                           Index Cond: ((ct.code)::text = (c.contact_type_code)::text)
               ->  Bitmap Heap Scan on attachments a  (cost=4.27..11.37 rows=3 width=12) (never executed)
                     Recheck Cond: (a.contact_id = c.id)
                     ->  Bitmap Index Scan on attachments_contact  (cost=0.00..4.27 rows=3 width=0) (never executed)
                           Index Cond: (a.contact_id = c.id)
 Total runtime: 0.259 ms
(15 rows)
"Allen Johnson" <akjohnson78@gmail.com> writes:
> I did run into an issue where we are performing a group-by on about 10
> columns followed by an order-by of about 5 columns. This query was taking
> twice as long as Oracle. When looking at the explain plan, Postgres seems to
> be using all the columns in the group-by for sorting instead of _only_ using
> what is in the order-by.
If it's using a sort-uniq type of plan (which it is), it has to sort
by all the grouping columns, else the results won't be correct.
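To see why, take a hypothetical three-row input grouped on two columns
(none of this is from the original report):

-- group by (a, b) over hypothetical data
select a, b, count(*)
from (values (1,'x'), (1,'y'), (1,'x')) as v(a, b)
group by a, b;

-- If the input were sorted only on "a", a one-pass GroupAggregate
-- could encounter (1,x), (1,y), (1,x) and emit the (1,x) group
-- twice. Sorting on every grouping column makes equal (a,b) rows
-- adjacent, which is what the long sort key in the plan guarantees.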
In practice, I really doubt this would make a measurable performance
difference, since most row comparisons would arrive at a result before
they got to the lowest-order columns.
I think your gripe may actually have to do with a misestimate of the
relative costs of hash- and sort-based grouping, but analyze results
on a toy example don't illuminate that sort of problem at all :-(
regards, tom lane
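One way to test that hypothesis against real data (a standard
troubleshooting technique, not something prescribed in this exchange) is
to discourage sort-based grouping for a single session and compare the
resulting plans; enable_sort and enable_hashagg are ordinary planner
settings:

-- Session-local experiment: make sorts look expensive so the
-- planner prefers HashAggregate for the grouping step.
begin;
set local enable_sort = off;
explain analyze
select
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code,
  count(a.id)
from
  contacts c
  inner join contact_types ct on (ct.code = c.contact_type_code)
  left join attachments a on (a.contact_id = c.id)
where
  c.company_id = 1
group by
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code
order by
  ct.name, c.lname, c.fname, c.mname;
rollback;
-- The final order-by still requires a Sort node (there is no
-- alternative), so only the grouping strategy should change.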
> In practice, I really doubt this would make a measurable performance
> difference, since most row comparisons would arrive at a result before
> they got to the lowest-order columns.
>
> I think your gripe may actually have to do with a misestimate of the
> relative costs of hash- and sort-based grouping, but analyze results
> on a toy example don't illuminate that sort of problem at all :-(

Yes, this toy example doesn't show how much time was spent on the
actual sorting (of the production data, obviously). What I can do is
assemble a test database with a similar amount of data and repost the
`explain analyze` from that if there is any interest.

What I noticed in the production query was that ~1000ms was spent on
sorting alone. The hack query reduced that to ~400ms. I should also
note that there was plenty of work_mem and that the sort was not
hitting disk.

I should be able to get that going sometime early tomorrow. All I'm
going to do is generate a lot of contacts by randomly choosing from a
set of lastnames, firstnames, etc. as well as randomly insert some
number of attachments for each. I'm open to any suggestions on testing
methodologies.

AJ
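As one possible shortcut (a sketch only: the column list is taken from
the example query above, every unlisted column is assumed nullable or
defaulted, and the 'WORK' type code and row count are made up), similar
volumes of data can be generated directly in SQL with generate_series
instead of external scripts:

-- Sketch: synthetic contacts cycling through fixed name suffixes
insert into contacts (id, company_id, contact_type_code,
                      lname, fname, mname)
select g, 1, 'WORK',
       'lname' || (g % 500),
       'fname' || (g % 200),
       'mname' || (g % 50)
from generate_series(1, 250000) as g;  -- pick any row count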
On Thu, Jan 21, 2010 at 9:27 PM, Allen Johnson <akjohnson78@gmail.com> wrote:
> What I noticed in the production query was that ~1000ms was spent on
> sorting alone. The hack query reduced that to ~400ms. I should also
> note that there was plenty of work_mem and that the sort was not
> hitting disk.

The "hack" form actually seems far worse to me, but I guess it depends
on the actual data layout. Have you tried something like this, which I
would expect to be far faster:

select
  ct.name, c.lname, c.fname, c.mname,
  c.email, c.address1, c.address2,
  c.city, c.state, c.zip_code,
  (select count(*) from attachments
   where attachments.contact_id = c.id)
from
  contacts c
  inner join contact_types ct on (ct.code = c.contact_type_code)
where
  c.company_id = 1
order by
  ct.name, c.lname, c.fname, c.mname;

The question arises why Postgres can't automatically detect that this
query is equivalent. That might come when we start implementing the
"functional dependency" stuff from the standard and can determine that
the group-by list uniquely identifies a row from the first join.
Currently we don't do that kind of analysis.

--
greg
On Fri, Jan 22, 2010 at 2:02 PM, Greg Stark <gsstark@mit.edu> wrote:
> The question arises why Postgres can't automatically detect that this
> query is equivalent.

Hm, actually, rereading your query, it's not technically equivalent.
Since you weren't grouping on contacts.id or contact_types.code, if you
happened to have multiple contacts that had the same name and address,
and multiple codes which had the same name, then your query would
consolidate them into one result record.

You might know that will never happen, but the database can't prove
that's true, so it would never be able to do this transform.

--
greg
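The difference is easy to demonstrate on throwaway data (a hypothetical
two-table setup, not the schema from this thread):

-- Two distinct contacts that happen to share every grouped column
create temp table t (id int primary key, lname text, fname text);
insert into t values (1, 'Smith', 'Jo'), (2, 'Smith', 'Jo');
create temp table att (contact_id int);
insert into att values (1), (1), (2);

-- group-by form: one row, count = 3 (the two contacts are merged)
select lname, fname, count(a.contact_id)
from t left join att a on (a.contact_id = t.id)
group by lname, fname;

-- scalar-subquery form: two rows, with counts 2 and 1
select lname, fname,
       (select count(*) from att where att.contact_id = t.id)
from t;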
Ok, I've generated a test database with:
* 20,000 users
* 250,000 contacts
* 1,124,700 attachments
The summary of the results is that the normal query takes about 32sec
on my machine. The hack query takes about 13sec.
Below are the queries and their `explain analyze` outputs. I've
attached the test layout as well as a zip file containing the ruby
scripts that generate the data in the default format that the 'copy'
command expects.
If anyone else wants to give it a try, this is the procedure. I wrote
these scripts in a hurry, so I'm sure there could be many improvements
:)
1. Apply the tables.sql file to a test database (you might want to run
   the create index commands after the data load)
2. Create the data files using the scripts in create-scripts.zip like this:
   ruby create-user.rb         # creates /tmp/users
   ruby create-contacts.rb     # creates /tmp/contacts
   ruby create-attachments.rb  # creates /tmp/attachments
3. Load the data into the test database:
   copy users from '/tmp/users';
   copy contacts from '/tmp/contacts';
   copy attachments from '/tmp/attachments';
4. Run the `create index` statements in the tables.sql file
I also have a pg_dump version if anyone wants it (~6MB gzipped).
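The tables.sql attachment isn't reproduced here; judging from the
queries and plans below, it presumably looks roughly like this (column
types and the exact column list are guesses):

-- Inferred sketch of tables.sql, not the actual attachment
create table users (
  id integer primary key,
  company_id integer,
  full_name text,
  username text
);
create table contact_types (
  code varchar(10) primary key,
  name text
);
create table contacts (
  id integer primary key,
  user_id integer references users (id),
  contact_type_code varchar(10) references contact_types (code),
  lname text, fname text, mname text,
  email text, address1 text,
  city text, state text, zip_code text,
  created_date date
);
create table attachments (
  id integer primary key,
  contact_id integer references contacts (id)
);
-- the plans below rely on this index:
create index attachments_contact on attachments (contact_id);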
/* NORMAL QUERY */
select
  users.id, users.full_name, users.username,
  ct.name as type_name,
  c.lname, c.fname, c.mname, c.email,
  c.address1, c.city, c.state, c.zip_code,
  c.created_date,
  count(a.id) as attachment_count
from
  contacts c
  inner join users on (users.id = c.user_id)
  inner join contact_types ct on (ct.code = c.contact_type_code)
  left join attachments a on (a.contact_id = c.id)
where
  users.company_id = 1
  and c.contact_type_code in ('BOSS', 'EMP', 'WORK')
group by
  users.id, users.full_name, users.username,
  ct.name, c.lname, c.fname, c.mname, c.email,
  c.address1, c.city, c.state, c.zip_code,
  c.created_date
order by
  users.full_name,
  ct.name, c.lname, c.fname, c.mname,
  c.created_date;
EXPLAIN ANALYZE OUTPUT:
                                                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=102724.80..122894.80 rows=403400 width=124) (actual time=30683.912..32431.153 rows=75228 loops=1)
   ->  Sort  (cost=102724.80..103733.30 rows=403400 width=124) (actual time=30683.869..31236.760 rows=345543 loops=1)
         Sort Key: users.full_name, ct.name, c.lname, c.fname, c.mname, c.created_date, users.id, users.username, c.email, c.address1, c.city, c.state, c.zip_code
         Sort Method:  quicksort  Memory: 96578kB
         ->  Merge Right Join  (cost=16571.91..65164.53 rows=403400 width=124) (actual time=1946.609..7523.831 rows=345543 loops=1)
               Merge Cond: (a.contact_id = c.id)
               ->  Index Scan using attachments_contact on attachments a  (cost=0.00..39729.87 rows=1124700 width=12) (actual time=0.082..2150.242 rows=1124662 loops=1)
               ->  Sort  (cost=16571.91..16732.30 rows=64157 width=124) (actual time=1946.463..2384.232 rows=345537 loops=1)
                     Sort Key: c.id
                     Sort Method:  quicksort  Memory: 21439kB
                     ->  Hash Join  (cost=553.92..11449.19 rows=64157 width=124) (actual time=81.120..1584.615 rows=75228 loops=1)
                           Hash Cond: ((c.contact_type_code)::text = (ct.code)::text)
                           ->  Hash Join  (cost=552.81..10565.92 rows=64157 width=121) (actual time=81.063..1285.727 rows=75228 loops=1)
                                 Hash Cond: (c.user_id = users.id)
                                 ->  Seq Scan on contacts c  (cost=0.00..7534.50 rows=122469 width=96) (actual time=0.023..526.259 rows=150077 loops=1)
                                       Filter: ((contact_type_code)::text = ANY ('{BOSS,EMP,WORK}'::text[]))
                                 ->  Hash  (cost=427.00..427.00 rows=10065 width=33) (actual time=80.974..80.974 rows=10065 loops=1)
                                       ->  Seq Scan on users  (cost=0.00..427.00 rows=10065 width=33) (actual time=0.022..37.797 rows=10065 loops=1)
                                             Filter: (company_id = 1)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=12) (actual time=0.037..0.037 rows=5 loops=1)
                                 ->  Seq Scan on contact_types ct  (cost=0.00..1.05 rows=5 width=12) (actual time=0.018..0.024 rows=5 loops=1)
 Total runtime: 32551.132 ms
(22 rows)
/* HACK QUERY */
select * from (
  select
    users.id, users.full_name, users.username,
    ct.name as type_name,
    c.lname, c.fname, c.mname, c.email,
    c.address1, c.city, c.state, c.zip_code,
    c.created_date,
    count(a.id) as attachment_count
  from
    contacts c
    inner join users on (users.id = c.user_id)
    inner join contact_types ct on (ct.code = c.contact_type_code)
    left join attachments a on (a.contact_id = c.id)
  where
    users.company_id = 1
    and c.contact_type_code in ('BOSS', 'EMP', 'WORK')
  group by
    users.id, users.full_name, users.username,
    ct.name, c.lname, c.fname, c.mname, c.email,
    c.address1, c.city, c.state, c.zip_code,
    c.created_date
) as results
order by
  full_name,
  type_name, lname, fname, mname,
  created_date;
/* EXPLAIN ANALYZE OUTPUT */
                                                                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=420987.80..421996.30 rows=403400 width=1688) (actual time=12579.266..12663.383 rows=75228 loops=1)
   Sort Key: users.full_name, ct.name, c.lname, c.fname, c.mname, c.created_date
   Sort Method:  quicksort  Memory: 21435kB
   ->  HashAggregate  (cost=79283.53..84326.03 rows=403400 width=124) (actual time=9546.773..9721.322 rows=75228 loops=1)
         ->  Merge Right Join  (cost=16571.91..65164.53 rows=403400 width=124) (actual time=1857.597..7428.718 rows=345543 loops=1)
               Merge Cond: (a.contact_id = c.id)
               ->  Index Scan using attachments_contact on attachments a  (cost=0.00..39729.87 rows=1124700 width=12) (actual time=0.862..2298.099 rows=1124662 loops=1)
               ->  Sort  (cost=16571.91..16732.30 rows=64157 width=124) (actual time=1856.666..2279.646 rows=345537 loops=1)
                     Sort Key: c.id
                     Sort Method:  quicksort  Memory: 21439kB
                     ->  Hash Join  (cost=553.92..11449.19 rows=64157 width=124) (actual time=54.465..1500.539 rows=75228 loops=1)
                           Hash Cond: ((c.contact_type_code)::text = (ct.code)::text)
                           ->  Hash Join  (cost=552.81..10565.92 rows=64157 width=121) (actual time=54.375..1210.697 rows=75228 loops=1)
                                 Hash Cond: (c.user_id = users.id)
                                 ->  Seq Scan on contacts c  (cost=0.00..7534.50 rows=122469 width=96) (actual time=0.035..492.531 rows=150077 loops=1)
                                       Filter: ((contact_type_code)::text = ANY ('{BOSS,EMP,WORK}'::text[]))
                                 ->  Hash  (cost=427.00..427.00 rows=10065 width=33) (actual time=54.277..54.277 rows=10065 loops=1)
                                       ->  Seq Scan on users  (cost=0.00..427.00 rows=10065 width=33) (actual time=0.032..27.676 rows=10065 loops=1)
                                             Filter: (company_id = 1)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=12) (actual time=0.055..0.055 rows=5 loops=1)
                                 ->  Seq Scan on contact_types ct  (cost=0.00..1.05 rows=5 width=12) (actual time=0.026..0.035 rows=5 loops=1)
 Total runtime: 12752.540 ms
(22 rows)
Attachments: tables.sql, create-scripts.zip
Allen Johnson <akjohnson78@gmail.com> writes:
> Ok, I've generated a test database with:
> * 20,000 users
> * 250,000 contacts
> * 1,124,700 attachments
> The summary of the results is that the normal query takes about 32sec
> on my machine. The hack query takes about 13sec.
I poked at this for a bit. At least with the test data (dunno about
your real data), the first few grouping columns are pretty nearly unique
so the "extra" sort columns really aren't affecting the runtime anyway.
I believe that the reason the hacked query is cheaper is simply that the
sort is sorting fewer rows because it's applied after aggregation
instead of beforehand.
The planner is well aware of that effect, but the reason it fails to
choose hashed aggregation is that it doesn't think the aggregation will
reduce the number of rows --- so it estimates the sort for that case as
being much more expensive than it really is. Notice that the
post-aggregation and pre-aggregation rowcount estimates are just the
same in both these queries. If I force choose_hashed_grouping() to
make the other decision, I get the same plan out of the "normal"
query as the hacked query produces.
I have an idea for improving the accuracy of the post-aggregation
rowcount estimate, which I'll post on pgsql-hackers in a bit. But
it's not something I have enough confidence in to risk back-patching.
So for the moment your hack with forcing the sort to be done separately
is probably your best answer.
regards, tom lane
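The size of that effect is easy to ballpark from the plans above: the
normal plan sorts 345,543 pre-aggregation rows, while the hacked plan
sorts the 75,228 post-aggregation rows. Under a rough n*log(n)
comparison-count model (ignoring per-comparison cost and memory
effects), the first sort does about five times the work:

-- Back-of-the-envelope comparison-work ratio
select (345543 * ln(345543)) / (75228 * ln(75228)) as approx_ratio;
-- approx_ratio ≈ 5.2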