Обсуждение: Index not being used in sorting of simple table
This is in Postgres 8.1.5 I have a table like CREATE TABLE x (a VARCHAR, b VARCHAR, c VARCHAR); CREATE INDEX y on x(a); CREATE INDEX z on x(b); There are over a million rows in 'x'. Neither a nor b are unique. There are probably about 20 or so distinct values of a and 30 or so distinct values of b I've done a 'vacuum analyze' first. If I do EXPLAIN SELECT * FROM x ORDER BY a; it says Index Scan using y on x (cost=0.00..2903824.15 rows=1508057 width=152) That's what I'd expect However, if I do EXPLAIN SELECT * FROM x ORDER BY b; it says Sort (cost=711557.34..715327.48 rows=1508057 width=152) Sort Key: b -> Seq Scan on x (cost=0.00..53203.57 rows=1508057 width=152) Why doesn't it use the other index? If use 'set enable_seqscan=0' then it does. I tried using EXPLAIN ANALYZE to see how long it actually took: - seq scan - 75 secs - index scan - 13 secs - seq scan - 77 secs (I tried the seq scan version after the index scan as well to see if disk caching was a factor, but it doesn't look like it) If I do something like SELECT * FROM x WHERE b='...'; then it does use the index , it's just for ordering it doesn't seem to. (Yes, it's a BTREE index, not a hash index) Oh, and if I use EXPLAIN SELECT * FROM x ORDER BY b LIMIT 100000; then it uses the index scan, not the seq scan. If I use EXPLAIN SELECT * FROM x ORDER BY b LIMIT 1000000; it uses the seq scan again, so I can't just set an arbitrarily big limit to use the index. Any ideas? To me it looks like a bug in the planner. I can't think of any logical reason not to use an existing index to retrieve a sorted listing of the data. Paul VPOP3 - Internet Email Server/Gateway support@pscs.co.uk http://www.pscs.co.uk/
Paul Smith wrote: > Why doesn't it use the other index? If use 'set enable_seqscan=0' then > it does. Just a guess, but is the table clustered on column a? Maybe not explicitly, but was it loaded from data that was sorted by a? Analyzer calculates the correlation between physical order and each column. The planner will favor index scans instead of sorting when the correlation is strong, and it thinks the data doesn't fit in memory. Otherwise an explicitly sort will result in less I/O and be therefore more favorable. You can check the correlation stats with: SELECT tablename, attname, correlation FROM pg_stats where tablename='x'; > I tried using EXPLAIN ANALYZE to see how long it actually took: > - seq scan - 75 secs > - index scan - 13 secs > - seq scan - 77 secs > (I tried the seq scan version after the index scan as well to see if > disk caching was a factor, but it doesn't look like it) That won't flush the heap pages from cache... How much memory do you have and how large is the table? I suspect that the planner thinks it doesn't fit in memory, and therefore favors the seqscan+sort plan which would require less random I/O, but in reality it's in cache and the index scan is faster because it doesn't need to sort. Have you set your effective_cache_size properly? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Paul Smith <paullocal@pscs.co.uk> writes: > If I do > EXPLAIN SELECT * FROM x ORDER BY a; > it says > Index Scan using y on x (cost=0.00..2903824.15 rows=1508057 width=152) > That's what I'd expect > However, if I do > EXPLAIN SELECT * FROM x ORDER BY b; > it says > Sort (cost=711557.34..715327.48 rows=1508057 > width=152) > Sort Key: > b > -> Seq Scan on x (cost=0.00..53203.57 rows=1508057 width=152) > Why doesn't it use the other index? You have the question backwards: given those cost estimates, I'd wonder why it doesn't do a sort in both cases. Offhand I think the sort cost estimate is pretty much independent of the data itself, so it should have come out with a cost near 715327 for sorting on A, so why's it using an indexscan that costs more than 4x as much? The indexscan cost estimate varies quite a bit depending on the estimated correlation (physical ordering) of the column, so seeing it do different things in the two cases isn't surprising in itself. But I think there's some relevant factor you've left out of the example. As for getting the estimates more in line with reality, you probably need to play with random_page_cost and/or effective_cache_size. > Any ideas? To me it looks like a bug in the planner. I can't think of > any logical reason not to use an existing index to retrieve a sorted > listing of the data. Sorry, but using a forced sort frequently *is* faster than a full-table indexscan. It all depends on how much locality of reference there is, ie how well the index order and physical table order match up. The planner's statistical correlation estimate and cost parameters may be far enough off to make it pick the wrong choice, but it's not a bug that it considers the options. regards, tom lane
At 16:26 04/05/2007, you wrote: >Paul Smith wrote: >>Why doesn't it use the other index? If use 'set enable_seqscan=0' >>then it does. > >Just a guess, but is the table clustered on column a? Maybe not >explicitly, but was it loaded from data that was sorted by a? I wouldn't have thought so - a is pretty 'random' as far as order of insertion goes. On the other hand 'b' (the one whose index doesn't get used) is probably pretty correlated - 'b' is the date when the entry was added to the table, so they would be added in order of 'b' (they also get deleted after a while, and I'm not sure how PGSQL re-uses deleted rows that have been vacuumed) >Analyzer calculates the correlation between physical order and each >column. The planner will favor index scans instead of sorting when >the correlation is strong, and it thinks the data doesn't fit in >memory. Otherwise an explicitly sort will result in less I/O and be >therefore more favorable. Ah, I see. >You can check the correlation stats with: >SELECT tablename, attname, correlation FROM pg_stats where tablename='x'; There I get x | a | 0.977819 x | b | 0.78292 This is a bit odd, because I'd have thought they'd be more correlated on 'b' than 'a'.. >>I tried using EXPLAIN ANALYZE to see how long it actually took: >>- seq scan - 75 secs >>- index scan - 13 secs >>- seq scan - 77 secs > >>(I tried the seq scan version after the index scan as well to see >>if disk caching was a factor, but it doesn't look like it) > >That won't flush the heap pages from cache... No, I know, but it would mean that if the pages were being loaded into disk cache by the first scan which would make the second scan quicker, it would probably make the third one quicker as well. >How much memory do you have and how large is the table? The table is about 300MB. I have 2GB RAM on my PC (but most of it is in use - the disk cache size is currently 600MB). >I suspect that the planner thinks it doesn't fit in memory, and >therefore favors the seqscan+sort plan which would require less random I/O, >but in reality it's in cache and the index scan is faster because it >doesn't need to sort. Have you set your effective_cache_size properly? I haven't set that at all - it's the default.. If I set this to 51200 (I think that means 400MB) then it does use the index scan method, so thanks for this bit of info. Paul VPOP3 - Internet Email Server/Gateway support@pscs.co.uk http://www.pscs.co.uk/
Hi,
Paul:
Quite like Tom, I too think that its the first query that is more intriguing than the second one. (The expected cost for the indexscan (A) query is 4x the expected time for the 'Sequential Scan' (B) query !!)
Could you provide with the (complete output of) EXPLAIN ANALYZE times for both of these queries ? That would tell how much time it actually took as compared to the expected times.
Tom:
There is one thing though, that I couldn't really understand. Considering that A's correlation in pg_stats being very high compared to B, isn't it 'a better candidate' for a sequential scan as compared to B in this scenario ? Or is it the other way around ?
Regards,
Robins Tharakan
--
Robins
Paul:
Quite like Tom, I too think that its the first query that is more intriguing than the second one. (The expected cost for the indexscan (A) query is 4x the expected time for the 'Sequential Scan' (B) query !!)
Could you provide with the (complete output of) EXPLAIN ANALYZE times for both of these queries ? That would tell how much time it actually took as compared to the expected times.
Tom:
There is one thing though, that I couldn't really understand. Considering that A's correlation in pg_stats being very high compared to B, isn't it 'a better candidate' for a sequential scan as compared to B in this scenario ? Or is it the other way around ?
Regards,
Robins Tharakan
On 5/4/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Paul Smith <paullocal@pscs.co.uk> writes:
> If I do
> EXPLAIN SELECT * FROM x ORDER BY a;
> it says
> Index Scan using y on x (cost=0.00..2903824.15 rows=1508057 width=152)
> That's what I'd expect
> However, if I do
> EXPLAIN SELECT * FROM x ORDER BY b;
> it says
> Sort (cost=711557.34..715327.48 rows=1508057
> width=152)
> Sort Key:
> b
> -> Seq Scan on x (cost=0.00..53203.57 rows=1508057 width=152)
> Why doesn't it use the other index?
You have the question backwards: given those cost estimates, I'd wonder
why it doesn't do a sort in both cases. Offhand I think the sort cost
estimate is pretty much independent of the data itself, so it should
have come out with a cost near 715327 for sorting on A, so why's it
using an indexscan that costs more than 4x as much?
The indexscan cost estimate varies quite a bit depending on the
estimated correlation (physical ordering) of the column, so seeing
it do different things in the two cases isn't surprising in itself.
But I think there's some relevant factor you've left out of the example.
As for getting the estimates more in line with reality, you probably
need to play with random_page_cost and/or effective_cache_size.
> Any ideas? To me it looks like a bug in the planner. I can't think of
> any logical reason not to use an existing index to retrieve a sorted
> listing of the data.
Sorry, but using a forced sort frequently *is* faster than a full-table
indexscan. It all depends on how much locality of reference there is,
ie how well the index order and physical table order match up. The
planner's statistical correlation estimate and cost parameters may be
far enough off to make it pick the wrong choice, but it's not a bug that
it considers the options.
regards, tom lane
---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
--
Robins
Robins <tharakan@gmail.com> writes: > There is one thing though, that I couldn't really understand. Considering > that A's correlation in pg_stats being very high compared to B, isn't it 'a > better candidate' for a sequential scan as compared to B in this scenario ? No, high correlation reduces the cost of an indexscan but doesn't do anything much for a seqscan-and-sort. (Actually, I suppose it could help by reducing the number of initial runs to be merged, but that's not an effect the planner knows about.) The interesting point is that Paul shows SELECT tablename, attname, correlation FROM pg_stats where tablename='x'; x | a | 0.977819 x | b | 0.78292 when his initial verbal description indicated that b should have the better correlation. So that's something else odd about this case. regards, tom lane