Re: Postgresql selecting strange index for simple query
От | Maxim Boguk |
---|---|
Тема | Re: Postgresql selecting strange index for simple query |
Дата | |
Msg-id | 49A705B3.9080408@masterhost.ru обсуждение исходный текст |
Ответ на | Re: Postgresql selecting strange index for simple query (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Postgresql selecting strange index for simple query
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-general |
Tom Lane wrote: > Maxim Boguk <mboguk@masterhost.ru> writes: >> So i have two theory (just waving hands ofcourse): >> 1)integer owerflow somewhere in cost calculation > > Costs are floats, and in any case you're not showing costs anywhere near > the integer overflow limit... > >> 2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09) > > The planner is intentionally set up to consider costs within a percent > or so of each other as being effectively equal. If the estimated costs > are that close then it doesn't surprise me if it sometimes picks the > "wrong" plan. The real question is why are the estimates so close? > They should not be, since AFAICS you are talking about a situation > where we'd have to scan all of the multicol index versus only about > a fifth of the single-col one. Ok i exploring more: just one thing: hh=# SHOW default_statistics_target ; default_statistics_target --------------------------- 10 (1 row) (btw increase statistic to 1000 do not fix situation). I try simplify test case and: Now use sequential user_id, and truncate last_change_time to date: SELECT nextval('test_seq') as user_id,last_change_time::date ,rpad('a',500,'b') as f1 into test_table from resume; ANALYZE test_table; SELECT count(*) from test_table; CREATE INDEX right_idx on test_table(last_change_time); CREATE INDEX wrong_idx on test_table(user_id, last_change_time); EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00'; Index Scan using wrong_idx on test_table (cost=0.00..182623.51 rows=316522 width=0) (actual time=0.056..534.620 rows=382671loops=1) Index Cond: (last_change_time > '2009-01-10'::date) DROP INDEX wrong_idx; EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00'; Index Scan using right_idx on test_table (cost=0.00..221765.19 rows=316522 width=0) (actual time=0.023..346.213 rows=382671loops=1) Index Cond: (last_change_time > '2009-01-10'::date) Full index scan over wrong index cost reasonable lower then 1/17 of single column index (182623 vs 221765)! So just last_change_time still cannot be generated... but: hh=# SELECT count(distinct last_change_time) from test_table; count ------- 2133 (1 row) And statistic values for last_change_time is: hh=# SELECT * from pg_stats where tablename='test_table' and attname='last_change_time'; -[ RECORD 1 ]-----+--------------------------------------------------------------------------------------------------------------------------- schemaname | public tablename | test_table attname | last_change_time null_frac | 0 avg_width | 4 n_distinct | 1211 most_common_vals | {2004-02-27,2009-01-26,2008-03-11,2009-01-27,2004-01-15,2008-10-13,2009-01-19,2009-01-22,2009-01-21,2008-09-29} most_common_freqs | {0.00833333,0.00766667,0.00733333,0.007,0.00666667,0.00533333,0.00533333,0.00533333,0.005,0.00466667} histogram_bounds | {2003-03-31,2005-08-26,2006-10-04,2007-04-29,2007-09-27,2008-01-24,2008-05-01,2008-07-29,2008-10-01,2008-11-27,2009-01-25} correlation | 0.261512 I think it is all what planner can use when choose plan... because user_id is unique sequential values. regargs, Maxim Boguk
В списке pgsql-general по дате отправления: