Re: Postgresql selecting strange index for simple query

Поиск
Список
Период
Сортировка
От Maxim Boguk
Тема Re: Postgresql selecting strange index for simple query
Дата
Msg-id 49A705B3.9080408@masterhost.ru
обсуждение исходный текст
Ответ на Re: Postgresql selecting strange index for simple query  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Postgresql selecting strange index for simple query  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
Tom Lane wrote:
> Maxim Boguk <mboguk@masterhost.ru> writes:
>> So i have two theory (just waving hands ofcourse):
>> 1)integer owerflow somewhere in cost calculation
>
> Costs are floats, and in any case you're not showing costs anywhere near
> the integer overflow limit...
>
>> 2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09)
>
> The planner is intentionally set up to consider costs within a percent
> or so of each other as being effectively equal.  If the estimated costs
> are that close then it doesn't surprise me if it sometimes picks the
> "wrong" plan.  The real question is why are the estimates so close?
> They should not be, since AFAICS you are talking about a situation
> where we'd have to scan all of the multicol index versus only about
> a fifth of the single-col one.

Ok i exploring more:

just one thing:

hh=# SHOW default_statistics_target ;
  default_statistics_target
---------------------------
  10
(1 row)
(btw increase statistic to 1000 do not fix situation).

I try simplify test case and:
Now use sequential user_id, and truncate last_change_time to date:

SELECT nextval('test_seq') as user_id,last_change_time::date ,rpad('a',500,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';

  Index Scan using wrong_idx on test_table  (cost=0.00..182623.51 rows=316522 width=0) (actual time=0.056..534.620
rows=382671loops=1) 
    Index Cond: (last_change_time > '2009-01-10'::date)

DROP INDEX wrong_idx;
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';

  Index Scan using right_idx on test_table  (cost=0.00..221765.19 rows=316522 width=0) (actual time=0.023..346.213
rows=382671loops=1) 
    Index Cond: (last_change_time > '2009-01-10'::date)

Full index scan over wrong index cost reasonable lower then 1/17 of single column index  (182623 vs 221765)!

So just last_change_time still cannot be generated... but:

hh=# SELECT count(distinct last_change_time) from test_table;
  count
-------
   2133
(1 row)

And statistic values for last_change_time is:

hh=# SELECT * from pg_stats where tablename='test_table' and attname='last_change_time';
-[ RECORD 1
]-----+---------------------------------------------------------------------------------------------------------------------------
schemaname        | public
tablename         | test_table
attname           | last_change_time
null_frac         | 0
avg_width         | 4
n_distinct        | 1211
most_common_vals  |
{2004-02-27,2009-01-26,2008-03-11,2009-01-27,2004-01-15,2008-10-13,2009-01-19,2009-01-22,2009-01-21,2008-09-29}
most_common_freqs |
{0.00833333,0.00766667,0.00733333,0.007,0.00666667,0.00533333,0.00533333,0.00533333,0.005,0.00466667}
histogram_bounds  |
{2003-03-31,2005-08-26,2006-10-04,2007-04-29,2007-09-27,2008-01-24,2008-05-01,2008-07-29,2008-10-01,2008-11-27,2009-01-25}
correlation       | 0.261512

I think it is all what planner can use when choose plan... because user_id is unique sequential values.

regargs, Maxim Boguk







В списке pgsql-general по дате отправления:

Предыдущее
От: Adrian Klaver
Дата:
Сообщение: Re: Connection refused (0x0000274D/10061).
Следующее
От: Tim Uckun
Дата:
Сообщение: Re: speaking of 8.4...