Обсуждение: enable_XXX options
I have a query in my application that takes an unreasonable amount of time to complete (>4.5 hours execution time). After reviewing the EXPLAIN and EXPLAIN ANALYZE output for that and similar queries, my colleagues and I determined that turning off the enable_nestloop option might help – we noticed dramatic speed improvements for that specific query after doing so (<2 minutes execution time). I was warned not to mess with the enable_XXX options in a production environment, but does anyone see any problem with turning off the enable_nestloop option right before executing my query and turning it back on afterwards?
Bjorn Peterson
Software Engineer
Pearson School Technologies
Bloomington, MN
(952) 681-3384
****************************************************************************
This email may contain confidential
material. If you were not an intended recipient,
Please notify the sender and delete all copies.
We may monitor email to and from our network.
***************************************************************************
"Peterson, Bjorn" <Bjorn.Peterson@pearson.com> writes: > I have a query in my application that takes an unreasonable amount of time > to complete (>4.5 hours execution time). After reviewing the EXPLAIN and > EXPLAIN ANALYZE output for that and similar queries, my colleagues and I > determined that turning off the enable_nestloop option might help - we > noticed dramatic speed improvements for that specific query after doing so > (<2 minutes execution time). I was warned not to mess with the enable_XXX > options in a production environment, but does anyone see any problem with > turning off the enable_nestloop option right before executing my query and > turning it back on afterwards? That's what it's there for ... but it would be useful to look into why the planner gets it so wrong without that hint. Could we see EXPLAIN ANALYZE both ways? regards, tom lane
-----Original Message----- >From: Tom Lane [mailto:tgl@sss.pgh.pa.us] >Sent: Friday, April 01, 2005 10:37 AM >To: Peterson, Bjorn >Cc: pgsql-performance@postgresql.org >Subject: Re: [PERFORM] enable_XXX options > >"Peterson, Bjorn" <Bjorn.Peterson@pearson.com> writes: >> I have a query in my application that takes an unreasonable amount of time >> to complete (>4.5 hours execution time). After reviewing the EXPLAIN and >> EXPLAIN ANALYZE output for that and similar queries, my colleagues and I >> determined that turning off the enable_nestloop option might help - we >> noticed dramatic speed improvements for that specific query after doing so >> (<2 minutes execution time). I was warned not to mess with the enable_XXX >> options in a production environment, but does anyone see any problem with >> turning off the enable_nestloop option right before executing my query and >> turning it back on afterwards? > >That's what it's there for ... but it would be useful to look into why >the planner gets it so wrong without that hint. Could we see EXPLAIN >ANALYZE both ways? > > regards, tom lane > Below is my query and the output of EXPLAIN - I was not able to run EXPLAIN ANALYZE, as the query never completes unless we turn enable_nestloop off: SELECT t.term_id AS term_id, a.user_id AS user_id, a.time_slot AS course_id, a.attendance_status AS status, SUM(CASE WHEN a.attendance_date>=t.start_date THEN 1 ELSE 0 END) AS cur_total, COUNT(a.attendance_date) AS ytd_total FROM "Attendance" a, "Terms" t, "Terms" ytd, "CoursesUsers" cu, "Courses" c, "CoursesOffered" co, "Schools" s WHERE a.attendance_type=1 AND a.attendance_status IN(3,4,1,2) AND a.attendance_date>=ytd.start_date AND a.attendance_date<=t.end_date AND a.attendance_date<=now() AND a.user_id=cu.user_id AND a.time_slot=cu.course_id AND cu.course_id=c.course_id AND co.course_offered_id=c.course_offered_id AND co.school_id=s.school_id AND s.district_id=2 AND ytd.term_id=t.top_term_id AND c.course_id IN (221,395,244,394,366,370,400,11,373,369,392,406,398,381,391,393,403,376,220, 846,440,935,910,431,428,904,905,222,201,453,913,1794,408,901,856,424,443,175 0,452,461,462,471,463,911,489,821,916,501,223) GROUP BY a.user_id, a.time_slot, t.term_id, a.attendance_status ORDER BY a.user_id, a.time_slot, t.term_id, a.attendance_status The Attendance table is the largest (about 2 million records), Terms has about 50 records, CoursesUsers has about 30,000 records, Courses has about 2000 records, CoursesOffered has about 1000 records, and Schools has 3 records. The purpose of this query is to retrieve the number of absences for each student/course/term combination - we need separate totals for year-to-date (from the start of the school year), and for absences only within the current term. Every field referenced in the WHERE clause has an appropriate single or multi-column index. We are using the standard PostgreSQL JDBC driver and the only parameter being set in this query is the district_id (s.district_id=2). We are running Postgres 8.0.1 on a Windows 2000 server. With enable_nestloop on (default): QUERY PLAN GroupAggregate (cost=4674.63..4677.13 rows=100 width=22) -> Sort (cost=4674.63..4674.88 rows=100 width=22) Sort Key: a.user_id -> Nested Loop (cost=276.69..4671.30 rows=100 width=22) Join Filter: (("outer".attendance_date <= "inner".end_date) AND ("outer".attendance_date >= "inner".start_date)) -> Hash Join (cost=273.30..4649.92 rows=20 width=14) Hash Cond: ("outer".school_id = "inner".school_id) -> Nested Loop (cost=272.26..4648.50 rows=25 width=18) -> Hash Join (cost=272.26..986.69 rows=836 width=16) Hash Cond: ("outer".course_offered_id = "inner".course_offered_id) -> Hash Join (cost=246.81..948.70 rows=836 width=16) Hash Cond: ("outer".course_id = "inner".course_id) -> Seq Scan on "CoursesUsers" cu (cost=0.00..545.02 rows=29702 width=8) -> Hash (cost=246.68..246.68 rows=49 width=8) -> Seq Scan on "Courses" c (cost=0.00..246.68 rows=49 width=8) Filter: ((course_id = 221) OR (course_id = 395) OR (course_id = 244) OR (course_id = 394) OR (course_id = 366) OR (course_id = 370) OR (course_id = 400) OR (course_id = 11) OR (course_id = 373) OR (course_i (..) -> Hash (cost=23.36..23.36 rows=836 width=8) -> Seq Scan on "CoursesOffered" co (cost=0.00..23.36 rows=836 width=8) -> Index Scan using "Attendance_pkey" on "Attendance" a (cost=0.00..4.37 rows=1 width=14) Index Cond: ((a.attendance_date <= now()) AND (a.attendance_type = 1) AND ("outer".course_id = a.time_slot) AND (a.user_id = "outer".user_id)) Filter: ((attendance_status = 3) OR (attendance_status = 4) OR (attendance_status = 1) OR (attendance_status = 2)) -> Hash (cost=1.04..1.04 rows=3 width=4) -> Seq Scan on "Schools" s (cost=0.00..1.04 rows=3 width=4) Filter: (district_id = 2) -> Materialize (cost=3.39..3.75 rows=36 width=16) -> Hash Join (cost=1.45..3.35 rows=36 width=16) Hash Cond: ("outer".top_term_id = "inner".term_id) -> Seq Scan on "Terms" t (cost=0.00..1.36 rows=36 width=16) -> Hash (cost=1.36..1.36 rows=36 width=8) -> Seq Scan on "Terms" ytd (cost=0.00..1.36 rows=36 width=8) ------------------------------------------------------------------------- After turning enable_nestloop off: QUERY PLAN GroupAggregate (cost=100078595.13..100078597.63 rows=100 width=22) -> Sort (cost=100078595.13..100078595.38 rows=100 width=22) Sort Key: a.user_id -> Nested Loop (cost=100078571.91..100078591.81 rows=100 width=22) Join Filter: (("inner".attendance_date <= "outer".end_date) AND ("inner".attendance_date >= "outer".start_date)) -> Hash Join (cost=1.45..3.35 rows=36 width=16) Hash Cond: ("outer".top_term_id = "inner".term_id) -> Seq Scan on "Terms" t (cost=0.00..1.36 rows=36 width=16) -> Hash (cost=1.36..1.36 rows=36 width=8) -> Seq Scan on "Terms" ytd (cost=0.00..1.36 rows=36 width=8) -> Materialize (cost=78570.46..78570.66 rows=20 width=14) -> Hash Join (cost=991.91..78570.44 rows=20 width=14) Hash Cond: ("outer".school_id = "inner".school_id) -> Hash Join (cost=990.87..78569.02 rows=25 width=18) Hash Cond: (("outer".time_slot = "inner".course_id) AND ("outer".user_id = "inner".user_id)) -> Seq Scan on "Attendance" a (cost=0.00..75599.26 rows=79148 width=14) Filter: ((attendance_type = 1) AND ((attendance_status = 3) OR (attendance_status = 4) OR (attendance_status = 1) OR (attendance_status = 2)) AND (attendance_date <= now())) -> Hash (cost=986.69..986.69 rows=836 width=16) -> Hash Join (cost=272.26..986.69 rows=836 width=16) Hash Cond: ("outer".course_offered_id = "inner".course_offered_id) -> Hash Join (cost=246.81..948.70 rows=836 width=16) Hash Cond: ("outer".course_id = "inner".course_id) -> Seq Scan on "CoursesUsers" cu (cost=0.00..545.02 rows=29702 width=8) -> Hash (cost=246.68..246.68 rows=49 width=8) -> Seq Scan on "Courses" c (cost=0.00..246.68 rows=49 width=8) Filter: ((course_id = 221) OR (course_id = 395) OR (course_id = 244) OR (course_id = 394) OR (course_id = 366) OR (course_id = 370) OR (course_id = 400) OR (course_id = 11) OR (course_id = 373) (..) -> Hash (cost=23.36..23.36 rows=836 width=8) -> Seq Scan on "CoursesOffered" co (cost=0.00..23.36 rows=836 width=8) -> Hash (cost=1.04..1.04 rows=3 width=4) -> Seq Scan on "Schools" s (cost=0.00..1.04 rows=3 width=4) Filter: (district_id = 2) **************************************************************************** This email may contain confidential material. If you were not an intended recipient, Please notify the sender and delete all copies. We may monitor email to and from our network. ****************************************************************************
"Peterson, Bjorn" <Bjorn.Peterson@pearson.com> writes: >> That's what it's there for ... but it would be useful to look into why >> the planner gets it so wrong without that hint. Could we see EXPLAIN >> ANALYZE both ways? > Below is my query and the output of EXPLAIN - I was not able to run EXPLAIN > ANALYZE, as the query never completes unless we turn enable_nestloop off: Well, when the point is to find out why the planner's estimates don't match reality, it's difficult to learn anything by looking only at the estimates and not at reality. Given what you say about the table sizes, the planner's preferred plan looks somewhat reasonable. I think the weak spot is the assumption that this index check will be fast: > -> Index Scan using "Attendance_pkey" on > "Attendance" a (cost=0.00..4.37 rows=1 width=14) > Index Cond: ((a.attendance_date <= now()) > AND (a.attendance_type = 1) AND ("outer".course_id = a.time_slot) AND > (a.user_id = "outer".user_id)) and the reason this seems like a weak spot is that the plan implies that you made attendance_date be the first column in the index. At least for this query, it'd be far better for attendance_date to be the last column, so that the info for any one user_id is bunched together in the index. For that matter I'd bet that attendance_type shouldn't be the highest part of the key either --- either course_id or user_id should probably be the leading key, depending on what sorts of queries you do. It wouldn't matter for this query, but you should look to see if you have other queries that select on only one of the two. If you have both equalities and inequalities in an index condition, you always want the equalities to be on the higher-order keys. Otherwise the scan will involve wasted scanning over index entries that match only some of the conditions. (Think about the ordering of a multicolumn index to see why this is so.) In this particular case I think the thing will be scanning almost the whole index every time :-( regards, tom lane