Re: Are statistics gathered on function indexes?
От | Tom Lane |
---|---|
Тема | Re: Are statistics gathered on function indexes? |
Дата | |
Msg-id | 3612.1025272862@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: Are statistics gathered on function indexes? (Ray Ontko <rayo@ontko.com>) |
Ответы |
Re: Are statistics gathered on function indexes?
(Ray Ontko <rayo@ontko.com>)
Re: Are statistics gathered on function indexes? (Bruce Momjian <pgman@candle.pha.pa.us>) |
Список | pgsql-admin |
Ray Ontko <rayo@ontko.com> writes: >> It's there already; what did you think was making the difference >> between W% and WI% ? > Yes, but the cost doesn't continue to decline if I make the LIKE > more and more restrictive by going from WI% to WIL% to WILL%, etc. Yes it does, if you have a large enough table. In most scenarios the selectivity drops off fast enough with larger strings that you hit the minimum estimate of 1 row pretty quickly; I suppose that's what's happening with your case. Here's an example using the 7.2 regression-test database: -- update stats regression=# analyze road; ANALYZE -- now force planner to think "road" is much larger than it really is, -- else we can't see the change in estimate beyond WI% regression=# update pg_class set relpages = relpages * 10000, regression-# reltuples = reltuples * 10000 where relname = 'road'; UPDATE 1 regression=# explain select * from only road where name like 'W%'; NOTICE: QUERY PLAN: Seq Scan on road (cost=0.00..1444625.00 rows=764903 width=89) EXPLAIN regression=# explain select * from only road where name like 'WI%'; NOTICE: QUERY PLAN: Index Scan using rix on road (cost=0.00..25007.80 rows=8406 width=89) EXPLAIN regression=# explain select * from only road where name like 'WIJ%'; NOTICE: QUERY PLAN: Index Scan using rix on road (cost=0.00..277.04 rows=92 width=89) EXPLAIN regression=# explain select * from only road where name like 'WIJK%'; NOTICE: QUERY PLAN: Index Scan using rix on road (cost=0.00..5.28 rows=1 width=89) EXPLAIN regression=# explain select * from only road where name like 'WIJKL%'; NOTICE: QUERY PLAN: Index Scan using rix on road (cost=0.00..5.23 rows=1 width=89) EXPLAIN regression=# explain select * from only road where name like 'WIJKLM%'; NOTICE: QUERY PLAN: Index Scan using rix on road (cost=0.00..5.23 rows=1 width=89) EXPLAIN regression=# As you can see, the estimate drops off by about a factor of 90 per added character. This is probably too much, but it's not that easy to determine what the ratio ought to be. The critical code involved in this is convert_string_to_scalar in backend/utils/adt/selfuncs.c; the ratio per character is essentially the same as the character range that it induces from the available values. Feel free to propose a better implementation if you can think of one. regards, tom lane
В списке pgsql-admin по дате отправления: