Discussion: BUG #5294: Sorts on more than just the order-by clause
The following bug has been logged online:

Bug reference:      5294
Logged by:          Allen Johnson
Email address:      akjohnson78@gmail.com
PostgreSQL version: 8.4.2
Operating system:   Red Hat ES 5.4
Description:        Sorts on more than just the order-by clause
Details:

I've been porting our app from Oracle to Postgres and keeping an eye on
performance. Largely, Postgres is performing just as well or better! I did
run into an issue where we are performing a group-by on about 10 columns
followed by an order-by of about 5 columns. This query was taking twice as
long as Oracle. When looking at the explain plan, Postgres seems to be using
all the columns in the group-by for sorting instead of _only_ using what is
in the order-by. While the results are correct, this seems to be causing a
performance problem, since Postgres is sorting on more columns than it is
being asked to. I reworked the query to get rid of the extra sorting columns,
but this seems like a hack. Below is an example query followed by the ugly
hack.

Note: The execution times in this example don't mean anything because they
are running on a blank test db. On the production database there was a huge
difference in execution time in favor of the hack query. I just wanted to
illustrate that the sort keys seem incorrect.
Example Query:

select ct.name, c.lname, c.fname, c.mname, c.email, c.address1, c.address2,
       c.city, c.state, c.zip_code, count(a.id)
from contacts c
inner join contact_types ct on (ct.code = c.contact_type_code)
left join attachments a on (a.contact_id = c.id)
where c.company_id = 1
group by ct.name, c.lname, c.fname, c.mname, c.email, c.address1, c.address2,
         c.city, c.state, c.zip_code
order by ct.name, c.lname, c.fname, c.mname;

Example Explain:

                                  QUERY PLAN
--------------------------------------------------------------------------------
 GroupAggregate  (cost=27.98..28.11 rows=3 width=1864) (actual time=0.037..0.037 rows=0 loops=1)
   ->  Sort  (cost=27.98..27.99 rows=3 width=1864) (actual time=0.035..0.035 rows=0 loops=1)
         Sort Key: ct.name, c.lname, c.fname, c.mname, c.email, c.address1, c.address2, c.city, c.state, c.zip_code
         Sort Method:  quicksort  Memory: 17kB
         ->  Nested Loop Left Join  (cost=4.27..27.96 rows=3 width=1864) (actual time=0.017..0.017 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=1864) (actual time=0.014..0.014 rows=0 loops=1)
                     ->  Index Scan using contacts_company on contacts c  (cost=0.00..8.27 rows=1 width=1760) (actual time=0.012..0.012 rows=0 loops=1)
                           Index Cond: (company_id = 1)
                     ->  Index Scan using contact_types_pkey on contact_types ct  (cost=0.00..8.27 rows=1 width=152) (never executed)
                           Index Cond: ((ct.code)::text = (c.contact_type_code)::text)
               ->  Bitmap Heap Scan on attachments a  (cost=4.27..11.37 rows=3 width=12) (never executed)
                     Recheck Cond: (a.contact_id = c.id)
                     ->  Bitmap Index Scan on attachments_contact  (cost=0.00..4.27 rows=3 width=0) (never executed)
                           Index Cond: (a.contact_id = c.id)
 Total runtime: 0.192 ms
(15 rows)

* Notice how the sort key is using many more columns than the order-by has
specified.
Hack Query:

select * from (
    select ct.name as ct_name, c.lname, c.fname, c.mname, c.email,
           c.address1, c.address2, c.city, c.state, c.zip_code, count(a.id)
    from contacts c
    inner join contact_types ct on (ct.code = c.contact_type_code)
    left join attachments a on (a.contact_id = c.id)
    where c.company_id = 1
    group by ct.name, c.lname, c.fname, c.mname, c.email, c.address1,
             c.address2, c.city, c.state, c.zip_code
) as results
order by ct_name, lname, fname, mname;

Hack Explain:

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Sort  (cost=28.13..28.14 rows=3 width=1868) (actual time=0.054..0.054 rows=0 loops=1)
   Sort Key: ct.name, c.lname, c.fname, c.mname
   Sort Method:  quicksort  Memory: 17kB
   ->  HashAggregate  (cost=28.04..28.08 rows=3 width=1864) (actual time=0.021..0.021 rows=0 loops=1)
         ->  Nested Loop Left Join  (cost=4.27..27.96 rows=3 width=1864) (actual time=0.018..0.018 rows=0 loops=1)
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=1864) (actual time=0.016..0.016 rows=0 loops=1)
                     ->  Index Scan using contacts_company on contacts c  (cost=0.00..8.27 rows=1 width=1760) (actual time=0.013..0.013 rows=0 loops=1)
                           Index Cond: (company_id = 1)
                     ->  Index Scan using contact_types_pkey on contact_types ct  (cost=0.00..8.27 rows=1 width=152) (never executed)
                           Index Cond: ((ct.code)::text = (c.contact_type_code)::text)
               ->  Bitmap Heap Scan on attachments a  (cost=4.27..11.37 rows=3 width=12) (never executed)
                     Recheck Cond: (a.contact_id = c.id)
                     ->  Bitmap Index Scan on attachments_contact  (cost=0.00..4.27 rows=3 width=0) (never executed)
                           Index Cond: (a.contact_id = c.id)
 Total runtime: 0.259 ms
(15 rows)
"Allen Johnson" <akjohnson78@gmail.com> writes: > I did run into an issue where we are performing a group-by on about 10 > columns followed by an order-by of about 5 columns. This query was taking > twice as long as Oracle. When looking at the explain plan, Postgres seems to > be using all the columns in the group-by for sorting instead of _only_ using > what is in the order-by. If it's using a sort-uniq type of plan (which it is), it has to sort by all the grouping columns, else the results won't be correct. In practice, I really doubt this would make a measurable performance difference, since most row comparisons would arrive at a result before they got to the lowest-order columns. I think your gripe may actually have to do with a misestimate of the relative costs of hash- and sort-based grouping, but analyze results on a toy example don't illuminate that sort of problem at all :-( regards, tom lane
> In practice, I really doubt this would make a measurable performance
> difference, since most row comparisons would arrive at a result before
> they got to the lowest-order columns.
>
> I think your gripe may actually have to do with a misestimate of the
> relative costs of hash- and sort-based grouping, but analyze results
> on a toy example don't illuminate that sort of problem at all :-(

Yes, this toy example doesn't show how much time was spent on the actual
sorting (of the production data, obviously). What I can do is assemble a test
database with a similar amount of data and repost the `explain analyze` from
that if there is any interest.

What I noticed in the production query was that ~1000ms was spent on sorting
alone. The hack query reduced that to ~400ms. I should also note that there
was plenty of work_mem and that the sort was not hitting disk.

I should be able to get that going sometime early tomorrow. All I'm going to
do is generate a lot of contacts by randomly choosing from a set of
lastnames, firstnames, etc., as well as randomly insert some number of
attachments for each. I'm open to any suggestions on testing methodologies.

AJ
On Thu, Jan 21, 2010 at 9:27 PM, Allen Johnson <akjohnson78@gmail.com> wrote:
> What I noticed in the production query was that ~1000ms was spent on
> sorting alone. The hack query reduced that to ~400ms. I should also
> note that there was plenty of work_mem and that the sort was not
> hitting disk.

The "hack" form actually seems far worse to me, but I guess it depends on the
actual data layout. Have you tried something like this, which I would expect
to be far faster:

select ct.name, c.lname, c.fname, c.mname, c.email, c.address1, c.address2,
       c.city, c.state, c.zip_code,
       (select count(*) from attachments where attachments.contact_id = c.id)
from contacts c
inner join contact_types ct on (ct.code = c.contact_type_code)
where c.company_id = 1
order by ct.name, c.lname, c.fname, c.mname;

The question arises why Postgres can't automatically detect that this query
is equivalent. That might come when we start implementing the "functional
dependency" stuff from the standard and can determine that the group-by list
uniquely identifies a row from the first join. Currently we don't do that
kind of analysis.

--
greg
On Fri, Jan 22, 2010 at 2:02 PM, Greg Stark <gsstark@mit.edu> wrote:
> The question arises why Postgres can't automatically detect that this
> query is equivalent.

Hm, actually, rereading your query, it's not technically equivalent. Since
you weren't grouping on contact.id or contact_type.code, if you happened to
have multiple contacts with the same name and address, or multiple codes with
the same name, then your query would consolidate them into one result record.

You might know that will never happen, but the database can't prove that's
true, so it would never be able to do this transform.

--
greg
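The non-equivalence described above is easy to demonstrate with a
self-contained SQLite session (a toy two-column schema standing in for the
reporter's tables, not their actual layout):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE contacts (id INTEGER PRIMARY KEY, lname TEXT, fname TEXT);
    CREATE TABLE attachments (id INTEGER PRIMARY KEY, contact_id INTEGER);
    -- two *different* contacts that happen to share a name
    INSERT INTO contacts VALUES (1, 'Smith', 'Al'), (2, 'Smith', 'Al');
    INSERT INTO attachments VALUES (10, 1), (11, 2);
""")

# Grouping only by the name columns merges the two contacts into one
# result record with a combined attachment count...
merged = con.execute("""
    SELECT c.lname, c.fname, count(a.id)
    FROM contacts c LEFT JOIN attachments a ON a.contact_id = c.id
    GROUP BY c.lname, c.fname
""").fetchall()
assert merged == [("Smith", "Al", 2)]

# ...while the per-contact correlated subquery keeps them separate.
separate = con.execute("""
    SELECT c.lname, c.fname,
           (SELECT count(*) FROM attachments a WHERE a.contact_id = c.id)
    FROM contacts c
""").fetchall()
assert separate == [("Smith", "Al", 1), ("Smith", "Al", 1)]
```

So the subquery rewrite is only safe when the grouping columns are known to
identify each contact uniquely, which the database cannot prove here.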
Ok, I've generated a test database with:

* 20,000 users
* 250,000 contacts
* 1,124,700 attachments

The summary of the results is that the normal query takes about 32sec on my
machine. The hack query takes about 13sec. Below are the queries and their
`explain analyze` outputs. I've attached the test layout as well as a zip
file containing the ruby scripts that generate the data in the default
format that the 'copy' command expects.

If anyone else wants to give it a try, this is the procedure. I wrote these
scripts in a hurry, so I'm sure there could be many improvements :)

1. Apply the tables.sql file to a test database (might want to run the
   create index commands after the data load)

2. Create the data files using the scripts in create-scripts.zip like this:

   ruby create-user.rb         ; creates /tmp/users
   ruby create-contacts.rb     ; creates /tmp/contacts
   ruby create-attachments.rb  ; creates /tmp/attachments

3. Load data into the test database:

   copy users from '/tmp/users';
   copy contacts from '/tmp/contacts';
   copy attachments from '/tmp/attachments';

4. Run the `create index` statements in the tables.sql file

I also have a pg_dump version if anyone wants it (~6MB gzipped).
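For readers without the attachments, a generator of this kind boils down to
something like the following Python stand-in for the Ruby scripts (the column
layout and name pools here are illustrative assumptions, not the original
scripts' contents): emit tab-separated rows in the default text format that
COPY expects.

```python
import random

# Hypothetical value pools; the real scripts choose from their own sets.
LNAMES = ["Smith", "Jones", "Johnson", "Brown", "Davis"]
FNAMES = ["Al", "Bo", "Cy", "Di", "Ed"]
TYPES = ["BOSS", "EMP", "WORK", "HOME", "OTHER"]

def write_contacts(path, n_contacts, n_users):
    """Write n_contacts rows in COPY's default tab-separated text format."""
    with open(path, "w") as f:
        for cid in range(1, n_contacts + 1):
            row = (
                cid,                          # id
                random.randint(1, n_users),   # user_id
                random.choice(TYPES),         # contact_type_code
                random.choice(LNAMES),        # lname
                random.choice(FNAMES),        # fname
            )
            f.write("\t".join(map(str, row)) + "\n")

# Scale n_contacts up to 250_000 to match the posted test set.
write_contacts("/tmp/contacts", 1_000, 20_000)
```

The resulting file loads with `copy contacts from '/tmp/contacts';` as in
step 3 above, assuming the table's column order matches.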
/* NORMAL QUERY */

select users.id, users.full_name, users.username, ct.name as type_name,
       c.lname, c.fname, c.mname, c.email, c.address1, c.city, c.state,
       c.zip_code, c.created_date, count(a.id) as attachment_count
from contacts c
inner join users on (users.id = c.user_id)
inner join contact_types ct on (ct.code = c.contact_type_code)
left join attachments a on (a.contact_id = c.id)
where users.company_id = 1
  and c.contact_type_code in ('BOSS', 'EMP', 'WORK')
group by users.id, users.full_name, users.username, ct.name, c.lname,
         c.fname, c.mname, c.email, c.address1, c.city, c.state,
         c.zip_code, c.created_date
order by users.full_name, ct.name, c.lname, c.fname, c.mname, c.created_date

EXPLAIN ANALYZE OUTPUT:

                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=102724.80..122894.80 rows=403400 width=124) (actual time=30683.912..32431.153 rows=75228 loops=1)
   ->  Sort  (cost=102724.80..103733.30 rows=403400 width=124) (actual time=30683.869..31236.760 rows=345543 loops=1)
         Sort Key: users.full_name, ct.name, c.lname, c.fname, c.mname, c.created_date, users.id, users.username, c.email, c.address1, c.city, c.state, c.zip_code
         Sort Method:  quicksort  Memory: 96578kB
         ->  Merge Right Join  (cost=16571.91..65164.53 rows=403400 width=124) (actual time=1946.609..7523.831 rows=345543 loops=1)
               Merge Cond: (a.contact_id = c.id)
               ->  Index Scan using attachments_contact on attachments a  (cost=0.00..39729.87 rows=1124700 width=12) (actual time=0.082..2150.242 rows=1124662 loops=1)
               ->  Sort  (cost=16571.91..16732.30 rows=64157 width=124) (actual time=1946.463..2384.232 rows=345537 loops=1)
                     Sort Key: c.id
                     Sort Method:  quicksort  Memory: 21439kB
                     ->  Hash Join  (cost=553.92..11449.19 rows=64157 width=124) (actual time=81.120..1584.615 rows=75228 loops=1)
                           Hash Cond: ((c.contact_type_code)::text = (ct.code)::text)
                           ->  Hash Join  (cost=552.81..10565.92 rows=64157 width=121) (actual time=81.063..1285.727 rows=75228 loops=1)
                                 Hash Cond: (c.user_id = users.id)
                                 ->  Seq Scan on contacts c  (cost=0.00..7534.50 rows=122469 width=96) (actual time=0.023..526.259 rows=150077 loops=1)
                                       Filter: ((contact_type_code)::text = ANY ('{BOSS,EMP,WORK}'::text[]))
                                 ->  Hash  (cost=427.00..427.00 rows=10065 width=33) (actual time=80.974..80.974 rows=10065 loops=1)
                                       ->  Seq Scan on users  (cost=0.00..427.00 rows=10065 width=33) (actual time=0.022..37.797 rows=10065 loops=1)
                                             Filter: (company_id = 1)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=12) (actual time=0.037..0.037 rows=5 loops=1)
                                 ->  Seq Scan on contact_types ct  (cost=0.00..1.05 rows=5 width=12) (actual time=0.018..0.024 rows=5 loops=1)
 Total runtime: 32551.132 ms
(22 rows)

/* HACK QUERY */

select * from (
    select users.id, users.full_name, users.username, ct.name as type_name,
           c.lname, c.fname, c.mname, c.email, c.address1, c.city, c.state,
           c.zip_code, c.created_date, count(a.id) as attachment_count
    from contacts c
    inner join users on (users.id = c.user_id)
    inner join contact_types ct on (ct.code = c.contact_type_code)
    left join attachments a on (a.contact_id = c.id)
    where users.company_id = 1
      and c.contact_type_code in ('BOSS', 'EMP', 'WORK')
    group by users.id, users.full_name, users.username, ct.name, c.lname,
             c.fname, c.mname, c.email, c.address1, c.city, c.state,
             c.zip_code, c.created_date
) as results
order by full_name, type_name, lname, fname, mname, created_date

/* EXPLAIN ANALYZE OUTPUT */

                                  QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=420987.80..421996.30 rows=403400 width=1688) (actual time=12579.266..12663.383 rows=75228 loops=1)
   Sort Key: users.full_name, ct.name, c.lname, c.fname, c.mname, c.created_date
   Sort Method:  quicksort  Memory: 21435kB
   ->  HashAggregate  (cost=79283.53..84326.03 rows=403400 width=124) (actual time=9546.773..9721.322 rows=75228 loops=1)
         ->  Merge Right Join  (cost=16571.91..65164.53 rows=403400 width=124) (actual time=1857.597..7428.718 rows=345543 loops=1)
               Merge Cond: (a.contact_id = c.id)
               ->  Index Scan using attachments_contact on attachments a  (cost=0.00..39729.87 rows=1124700 width=12) (actual time=0.862..2298.099 rows=1124662 loops=1)
               ->  Sort  (cost=16571.91..16732.30 rows=64157 width=124) (actual time=1856.666..2279.646 rows=345537 loops=1)
                     Sort Key: c.id
                     Sort Method:  quicksort  Memory: 21439kB
                     ->  Hash Join  (cost=553.92..11449.19 rows=64157 width=124) (actual time=54.465..1500.539 rows=75228 loops=1)
                           Hash Cond: ((c.contact_type_code)::text = (ct.code)::text)
                           ->  Hash Join  (cost=552.81..10565.92 rows=64157 width=121) (actual time=54.375..1210.697 rows=75228 loops=1)
                                 Hash Cond: (c.user_id = users.id)
                                 ->  Seq Scan on contacts c  (cost=0.00..7534.50 rows=122469 width=96) (actual time=0.035..492.531 rows=150077 loops=1)
                                       Filter: ((contact_type_code)::text = ANY ('{BOSS,EMP,WORK}'::text[]))
                                 ->  Hash  (cost=427.00..427.00 rows=10065 width=33) (actual time=54.277..54.277 rows=10065 loops=1)
                                       ->  Seq Scan on users  (cost=0.00..427.00 rows=10065 width=33) (actual time=0.032..27.676 rows=10065 loops=1)
                                             Filter: (company_id = 1)
                           ->  Hash  (cost=1.05..1.05 rows=5 width=12) (actual time=0.055..0.055 rows=5 loops=1)
                                 ->  Seq Scan on contact_types ct  (cost=0.00..1.05 rows=5 width=12) (actual time=0.026..0.035 rows=5 loops=1)
 Total runtime: 12752.540 ms
(22 rows)
Attachments
Allen Johnson <akjohnson78@gmail.com> writes:
> Ok, I've generated a test database with:
> * 20,000 users
> * 250,000 contacts
> * 1,124,700 attachments
> The summary of the results is that the normal query takes about 32sec
> on my machine. The hack query takes about 13sec.

I poked at this for a bit. At least with the test data (dunno about your real
data), the first few grouping columns are pretty nearly unique, so the
"extra" sort columns really aren't affecting the runtime anyway.

I believe that the reason the hacked query is cheaper is simply that the sort
is sorting fewer rows, because it's applied after aggregation instead of
beforehand. The planner is well aware of that effect, but the reason it fails
to choose hashed aggregation is that it doesn't think the aggregation will
reduce the number of rows --- so it estimates the sort for that case as being
much more expensive than it really is. Notice that the post-aggregation and
pre-aggregation rowcount estimates are just the same in both these queries.
If I force choose_hashed_grouping() to make the other decision, I get the
same plan out of the "normal" query as the hacked query produces.

I have an idea for improving the accuracy of the post-aggregation rowcount
estimate, which I'll post on pgsql-hackers in a bit. But it's not something I
have enough confidence in to risk back-patching. So for the moment your hack
of forcing the sort to be done separately is probably your best answer.

			regards, tom lane
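The misestimate described above can be caricatured with a toy cost model in
Python (the constants and formulas here are illustrative assumptions, not the
planner's actual cost equations): when the estimated group count is nearly
equal to the input rowcount, hashing first appears to buy nothing, so the
sort-first plan wins on paper even though hashing would win in reality.

```python
import math

HASH_OVERHEAD = 1.5  # assumed per-row cost of building/probing the hash table

def ncomparisons(n):
    """Rough cost of comparison-sorting n rows: n * log2(n)."""
    return n * math.log2(n)

def pick_grouping(input_rows, est_groups):
    # sort-based grouping: sort every input row, then aggregate in one pass
    sort_plan = ncomparisons(input_rows) + input_rows
    # hashed grouping: one hash probe per input row, then sort only the groups
    hash_plan = HASH_OVERHEAD * input_rows + ncomparisons(est_groups)
    return "hashed" if hash_plan < sort_plan else "sorted"

# Planner's view: aggregation barely reduces the rowcount (403400 groups
# estimated from 403400 rows), so the post-aggregation sort looks as
# expensive as sorting the input up front -> sort-based grouping.
assert pick_grouping(403_400, 403_400) == "sorted"

# Reality: ~345k joined rows collapse to ~75k groups, so hashing first
# and sorting only the groups is far cheaper -> hashed grouping.
assert pick_grouping(345_543, 75_228) == "hashed"
```

This is why fixing the post-aggregation rowcount estimate, rather than the
sort itself, is the avenue pursued here.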