Re: Postgresql selecting strange index for simple query

Поиск
Список
Период
Сортировка
От Maxim Boguk
Тема Re: Postgresql selecting strange index for simple query
Дата
Msg-id 49A6E44A.8040106@masterhost.ru
обсуждение исходный текст
Ответ на Re: Postgresql selecting strange index for simple query  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Postgresql selecting strange index for simple query  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-general
> Maxim Boguk <mboguk@masterhost.ru> writes:
>> Tom Lane wrote:
>>> It does know better than that.  I'm wondering if the single-column index
>>> has become very bloated or something.  Have you compared the physical
>>> index sizes?
>
>> Table fresh loaded from dump on test server... So no index bloat for sure...
>> As for comparing physical sizes, right single column index indeed smaller then wrong one:
>
> Huh.  I get sane-looking choices when I try a similar case here.  Can
> you put together a self-contained test case?

Not full self-contained test case but some minimal setup data which can give you ideas whats going wrong:
Test confirm my theory wrong index selection linked with long rows in table.

My tests contain such queries:

Ok TEST1:
set random_page_cost=1;
drop TABLE if exists test_table ;
SELECT user_id,last_change_time,rpad('a',1,'b') as f1 into test_table from resume;
ANALYZE test_table;
SELECT count(*) from test_table;
CREATE INDEX right_idx on test_table(last_change_time);
CREATE INDEX wrong_idx on test_table(user_id, last_change_time);
EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00';
result:
  Index Scan using right_idx on test_table  (cost=0.00..42763.35 rows=388718 width=0) (actual time=0.020..342.653
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 368.699 ms



Ok TEST2 (but see: cost increased 4x times when real work time increased only by 30%):
same but rpad('a',200,'b') instead of rpad('a',1,'b') when populating test_table:
result:
  Index Scan using right_idx on test_table  (cost=0.00..179346.09 rows=392268 width=0) (actual time=0.089..422.439
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 448.717 ms


!!Not ok TEST3:!!
same but rpad('a',500,'b') instead of rpad('a',1,'b') when populating test_table:
Oops wrong index used:
  Index Scan using wrong_idx on test_table  (cost=0.00..254918.19 rows=392231 width=0) (actual time=0.067..730.097
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 757.930 ms
(3 rows)
(btw if drop wrong_idx query become works almost 2х faster:
drop INDEX wrong_idx;
  Index Scan using right_idx on test_table  (cost=0.00..259709.09 rows=392231 width=0) (actual time=0.019..416.108
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 442.790 ms
)


Again Ok TEST4 (here toast engine removed all long values from test_table):
same but rpad('a',2000,'b') instead of rpad('a',1,'b') when populating test_table:
  Index Scan using right_idx on test_table  (cost=0.00..64606.50 rows=393002 width=0) (actual time=0.058..371.723
rows=390370loops=1) 
    Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone)
  Total runtime: 397.929 ms


So i have two theory (just waving hands ofcourse):
1)integer owerflow somewhere in cost calculation
2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09)

PS: second issue is cost increasing with increasing lenght of rows in table complete indepenent with increase real work
time.
(probably i need tune other _cost parameters for get more exact cost values)

PPS: sorry for my poor english

Regards, Maxim Boguk

В списке pgsql-general по дате отправления:

Предыдущее
От: Andrew Gould
Дата:
Сообщение: Re: Off Topic: ICD-10 codes in a database table?
Следующее
От: Madison Kelly
Дата:
Сообщение: Returning null for joined tables when one column non existant