Обсуждение: The standard 'why does it take so long' question
I have this query which used to yield this EXPLAIN ANALYZE [reformated] output: Sort (cost=12830.25..12830.25 rows=391 width=26) (actual time=39966.50..40305.70 rows=26982 loops=1) -> Nested Loop (cost=0.00..12813.40 rows=391 width=26) (actual time=69.16..34152.78 rows=26982 loops=1) -> Index Scan using chat_user_lcasename_idx , chat_user_lcasename_idx on chat_user u (cost=0.00..7.21 rows=2 width=12) (actual time=1.06..65.62 rows=2 loops=1) -> Index Scan using chat_post_usertime_idx on chat_post p (cost=0.00..6383.72 rows=1638 width=14) (actual time=72.31..10764.40 rows=13491 loops=2) Total runtime: 40739.10 msec where chat_post_usertime_idx is from: CREATE INDEX blah blah ... btree(user_id,time) After adding another index chat_post_time_user_idx, which reverses the order of the two columns in the above index, EXPLAIN ANALYZE yields: Sort (cost=9677.13..9677.13 rows=391 width=26) (actual time=62267.30..62554.49 rows=26982 loops=1) -> Nested Loop (cost=0.00..9660.28 rows=391 width=26) (actual time=174.77..55699.39 rows=26982 loops=1) -> Index Scan using chat_user_lcasename_idx , chat_user_lcasename_idx on chat_user u (cost=0.00..7.21 rows=2 width=12) (actual time=37.82..97.44 rows=2 loops=1) -> Index Scan using chat_post_timeuser_idx on chat_post p (cost=0.00..4806.89 rows=1638 width=14) (actual time=328.81..19932.71 rows=13491 loops=2) Total runtime: 62983.64 msec A 50% performance loss. I'm just trying to understand where the extra time comes from. Rearranging these explains into pairs of lines such that the first line of a pair is from the first explain and the second line from the second (works best in a _wide_ window) we can more easily compare things. Sort (cost=12830.25..12830.25 rows=391 width=26) (actual time=39966.50..40305.70 rows=26982 loops=1) Sort (cost=9677.13..9677.13 rows=391 width=26) (actual time=62267.30..62554.49 rows=26982 loops=1) -> Nested Loop (cost=0.00..12813.40 rows=391 width=26) (actual time=69.16..34152.78 rows=26982 loops=1) -> Nested Loop (cost=0.00..9660.28 rows=391 width=26) (actual time=174.77..55699.39 rows=26982 loops=1) -> Index Scan using chat_user_lcasename_idx, chat_user_lcasename_idx on chat_user u (cost=0.00..7.21 rows=2 width=12)(actual time=1.06..65.62 rows=2 loops=1) -> Index Scan using chat_user_lcasename_idx, chat_user_lcasename_idx on chat_user u (cost=0.00..7.21 rows=2 width=12)(actual time=37.82..97.44 rows=2 loops=1) -> Index Scan using chat_post_usertime_idx on chat_post p (cost=0.00..6383.72 rows=1638 width=14) (actual time=72.31..10764.40rows=13491 loops=2) -> Index Scan using chat_post_timeuser_idx on chat_post p (cost=0.00..4806.89 rows=1638 width=14) (actual time=328.81..19932.71rows=13491 loops=2) Total runtime: 40739.10 msec Total runtime: 62983.64 msec It's pretty obvious from this that the index scan on chat_user takes longer when the new index is used on the other table. The index scan of the new index obviously takes longer as well. The first question is why would the index scan on chat_user take significantly longer than before? All I could think of is that the use of the new index has forced more disk reads. Does that sound reasonable and what are the alternatives? The next question is why would the index scan of the new index take significantly longer than before? From pg_class the new index uses 3814 pages and the preexisting one 4191. Therefore, one would have thought that the scan of the new index would take less time since less disk reads...uh oh, I think I may have just thought this out clearer...the new index takes longer to scan because it presumbly has to read more pages of the index to cover the required time interval than the older index because the older index has already narrowed the selection down by the user. Does that sound reasonable? That would then support my supposition about the first question, right? My ignorance will show once more now; why would the index scan on chat_user be impacted by a loss of index pages from shared memory/disk cache? Isn't this scan finished before the one on chat_post starts? Or would that scenario have meant a hash join would be in the explain output? This would go some way to explaining the extra time taken on chat_post if I'm missunderstanding this. I think I'll look up the explain output in the manual again as well. Oh, and before it's asked, the DB has been analyzed after the last data load and there was no change in data between these runs. Also: version ---------------------------------------------------------------------- PostgreSQL 7.2.1 on i386-unknown-freebsd3.3, compiled by GCC 2.7.2.3 I hope I haven't put y'all to sleep. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > I have this query which used to yield this EXPLAIN ANALYZE > [reformated] output: Uh, *which* query, exactly? Cheers, Neil -- Neil Conway <neilconway@rogers.com> PGP Key ID: DB3C29FC
On 8 Aug 2002, Neil Conway wrote: > "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > > I have this query which used to yield this EXPLAIN ANALYZE > > [reformated] output: > > Uh, *which* query, exactly? Hmmm...I suppose I should have written 'I have a query...' Never mind I'll repost the whole message along with the query, just so that it's all together for easier reference. The query refered to in the post below: explain analyze select u.name , p.session_id , p.post_number , to_char(p.time,'Dy, Mon DD YYYY HH:MIam ET') from chat_post p , chat_user u where p.poster_id = u.id AND p.time >= 'epoch'::timestamptz + '959904000 seconds'::interval and p.time <= 'epoch'::timestamptz + '1023667200 seconds'::interval and ( u.lower_name = 'thrifty' OR u.lower_name = 'hope1' ) order by p.time ---
Nigel J. Andrews wrote: > The query refered to in the post below: > > explain analyze > select u.name > , p.session_id > , p.post_number > , to_char(p.time,'Dy, Mon DD YYYY HH:MIam ET') > from chat_post p > , chat_user u > where > p.poster_id = u.id > AND > p.time >= 'epoch'::timestamptz + '959904000 seconds'::interval > and > p.time <= 'epoch'::timestamptz + '1023667200 seconds'::interval > and > ( u.lower_name = 'thrifty' OR u.lower_name = 'hope1' ) > order by p.time > I'm not sure I can answer your detailed questions, but if you're really searching for a way to make this query run faster, it strikes me that I didn't see any evidence of an index on p.poster_id. It looks to me like that would speed this query significantly. For example, I just experimented here with a header/detail query similar to yours, and found that without an index on the foreign key field, the time for the query increased by 1000X (1.74 msec to 1771.95 msec). HTH, Joe
"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > The first question is why would the index scan on chat_user take > significantly longer than before? I think the indexscan is having to skip over more irrelevant data in the second case. Think about it: you can scan an index range consisting of a single user's posts between times T1 and T2, or you can scan an index range consisting of all posts between times T1 and T2. The second column of the index will save you from actually going to the heap for posts from other users, but you'll still have to pass over those index entries, because the contiguous range of index entries that covers the data you want will include a a lot of posts from other users. What interests me is why the planner chose the second index when it had a choice; I'd have thought its cost models were good enough to handle this subtlety. Can you post the pg_stats rows for the columns in question? regards, tom lane
First I'll just say thanks to Joe Conway for the suggestion of an index on the poster_id column. Although I didn't show it I do have an index on that field as well, however, as I understand it the index on (poster_id,time) should fulfil this same function. On Fri, 9 Aug 2002, Tom Lane wrote: > "Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > > The first question is why would the index scan on chat_user take > > significantly longer than before? > > I think the indexscan is having to skip over more irrelevant data in the > second case. Think about it: you can scan an index range consisting of > a single user's posts between times T1 and T2, or you can scan an index > range consisting of all posts between times T1 and T2. The second > column of the index will save you from actually going to the heap for > posts from other users, but you'll still have to pass over those index > entries, because the contiguous range of index entries that covers the > data you want will include a a lot of posts from other users. That was the idea that popped into my head while I was writing the questions so that certainly seems reasonable to me. > > What interests me is why the planner chose the second index when it > had a choice; I'd have thought its cost models were good enough to > handle this subtlety. Can you post the pg_stats rows for the columns > in question? I really don't know sufficient to be able to see why the stats would favour one index over the other. Although looking at the pg_stats entries below now I notice that the correlation for the time column is 1, compared to 0.058 for the poster_id. As there is a natural order to the data, i.e. it's loaded in time order, I am not entirely surprised that there's a high correlation, just slightly surprised the stats gatherer picked up on it. Note that the poster_ids for the example query are also shown below and are not listed in the most_common_vals array. I've also shown the human understandable values for the time limits in the query and due to the nature/distribution of this column it's unsurprising to find that I managed to pick a time period that didn't include one of the more common times. So, what's making the planner choose the slower index is the correlation figure right? archive=> select * from pg_stats where attname = 'time' or attname = 'poster_id' and tablename = 'chat_post'; tablename | attname | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation -----------+-----------+-----------+-----------+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------- chat_post | poster_id | 0 | 2 | 338 | {2149,731,1130,2595,1879,1656,688,1842,2473,1747} | {0.0603333,0.0306667,0.0306667,0.03,0.0286667,0.0276667,0.0273333,0.025,0.02,0.0196667}| {4,252,521,819,1112,1423,1639,1820,2096,2427,2879} | 0.0583958 chat_post | time | 0 | 8 | -0.350654 | {"1998-03-26 11:09:00-05","1999-05-13 00:49:00-04","2000-12-1413:05:00-05","2001-01-25 12:18:00-05","2002-04-04 11:37:00-05","2002-05-28 15:21:00-04","2002-07-1009:53:00-04"} | {0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667,0.000666667} | {"1998-03-06 01:26:00-05","1998-08-2111:00:00-04","1999-02-22 15:15:00-05","1999-07-22 05:37:00-04","2000-01-24 09:53:00-05","2000-10-2317:45:00-04","2001-02-26 16:22:00-05","2001-06-28 20:19:00-04","2001-12-06 14:06:00-05","2002-04-2616:05:00-04","2002-08-09 00:10:00-04"} | 1 (2 rows) archive=> select * from chat_user where name in ('thrifty', 'hope1'); id | name | lower_name ------+---------+------------ 2494 | thrifty | thrifty 1112 | hope1 | hope1 (2 rows) archive=> select 'epoch'::timestamptz + '959904000 seconds'::interval as from, 'epoch'::timestamptz + '1023667200 seconds'::intervalas to; from | to ------------------------+------------------------ 2000-06-02 01:00:00+01 | 2002-06-10 01:00:00+01 (1 row) Thanks for the input Tom. -- Nigel J. Andrews Director --- Logictree Systems Limited Computer Consultants
"Nigel J. Andrews" <nandrews@investsystems.co.uk> writes: > I really don't know sufficient to be able to see why the stats would > favour one index over the other. Although looking at the pg_stats > entries below now I notice that the correlation for the time column is > 1, compared to 0.058 for the poster_id. Ah, of course, that would do it. Thinking about it, I see that the system is not really very bright about ordering considerations for multicolumn indexes. On a macro scale, the posterid/time index is poorly correlated with the physical table order --- but when you consider only the set of entries for a single posterid over a small time range, the index is pretty well correlated. The planner doesn't consider that effect, so it mistakenly credits the time/posterid index with much higher correlation and hence lower scan cost than the other. I already had a todo item to reconsider the costing estimates for multicolumn indexes --- will see if we can be smarter about cases like this. regards, tom lane