Re: Postgresql selecting strange index for simple query
От | Maxim Boguk |
---|---|
Тема | Re: Postgresql selecting strange index for simple query |
Дата | |
Msg-id | 49A6E44A.8040106@masterhost.ru обсуждение исходный текст |
Ответ на | Re: Postgresql selecting strange index for simple query (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Postgresql selecting strange index for simple query
(Tom Lane <tgl@sss.pgh.pa.us>)
|
Список | pgsql-general |
> Maxim Boguk <mboguk@masterhost.ru> writes: >> Tom Lane wrote: >>> It does know better than that. I'm wondering if the single-column index >>> has become very bloated or something. Have you compared the physical >>> index sizes? > >> Table fresh loaded from dump on test server... So no index bloat for sure... >> As for comparing physical sizes, right single column index indeed smaller then wrong one: > > Huh. I get sane-looking choices when I try a similar case here. Can > you put together a self-contained test case? Not full self-contained test case but some minimal setup data which can give you ideas whats going wrong: Test confirm my theory wrong index selection linked with long rows in table. My tests contain such queries: Ok TEST1: set random_page_cost=1; drop TABLE if exists test_table ; SELECT user_id,last_change_time,rpad('a',1,'b') as f1 into test_table from resume; ANALYZE test_table; SELECT count(*) from test_table; CREATE INDEX right_idx on test_table(last_change_time); CREATE INDEX wrong_idx on test_table(user_id, last_change_time); EXPLAIN ANALYZE SELECT 1 from test_table where last_change_time > '2009-01-10 00:00:00'; result: Index Scan using right_idx on test_table (cost=0.00..42763.35 rows=388718 width=0) (actual time=0.020..342.653 rows=390370loops=1) Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone) Total runtime: 368.699 ms Ok TEST2 (but see: cost increased 4x times when real work time increased only by 30%): same but rpad('a',200,'b') instead of rpad('a',1,'b') when populating test_table: result: Index Scan using right_idx on test_table (cost=0.00..179346.09 rows=392268 width=0) (actual time=0.089..422.439 rows=390370loops=1) Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone) Total runtime: 448.717 ms !!Not ok TEST3:!! same but rpad('a',500,'b') instead of rpad('a',1,'b') when populating test_table: Oops wrong index used: Index Scan using wrong_idx on test_table (cost=0.00..254918.19 rows=392231 width=0) (actual time=0.067..730.097 rows=390370loops=1) Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone) Total runtime: 757.930 ms (3 rows) (btw if drop wrong_idx query become works almost 2х faster: drop INDEX wrong_idx; Index Scan using right_idx on test_table (cost=0.00..259709.09 rows=392231 width=0) (actual time=0.019..416.108 rows=390370loops=1) Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone) Total runtime: 442.790 ms ) Again Ok TEST4 (here toast engine removed all long values from test_table): same but rpad('a',2000,'b') instead of rpad('a',1,'b') when populating test_table: Index Scan using right_idx on test_table (cost=0.00..64606.50 rows=393002 width=0) (actual time=0.058..371.723 rows=390370loops=1) Index Cond: (last_change_time > '2009-01-10 00:00:00'::timestamp without time zone) Total runtime: 397.929 ms So i have two theory (just waving hands ofcourse): 1)integer owerflow somewhere in cost calculation 2)floating rounding errors (because cost very close in wrong situations: 254918.19 vs 259709.09) PS: second issue is cost increasing with increasing lenght of rows in table complete indepenent with increase real work time. (probably i need tune other _cost parameters for get more exact cost values) PPS: sorry for my poor english Regards, Maxim Boguk
В списке pgsql-general по дате отправления:
Следующее
От: Madison KellyДата:
Сообщение: Returning null for joined tables when one column non existant