Re: Index of a table is not used (in any case)
От | Zeugswetter Andreas SB SD |
---|---|
Тема | Re: Index of a table is not used (in any case) |
Дата | |
Msg-id | 46C15C39FEB2C44BA555E356FBCD6FA41EB3D9@m0114.s-mxs.net обсуждение исходный текст |
Ответ на | Index of a table is not used (in any case) (Reiner Dassing <dassing@wettzell.ifag.de>) |
Список | pgsql-hackers |
Tom Lane writes: > "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes: > > Imho one of the biggest sources for problems is people creating new > > indexes on populated tables when the rest of the db/table has badly > > outdated statistics or even only default statistics in place. > > In this situation the optimizer is badly misguided, because it now > > sees completely inconsistent statistics to work on. > > (e.g. old indexes on that table may seem way too cheap compared > > to table scan) > > I don't think any of this is correct. We don't have per-index > statistics. The only stats updated by CREATE INDEX are the same ones > updated by plain VACUUM, viz the number-of-tuples and number-of-pages > counts in pg_class. 1. Have I said anything about other stats, than relpages and reltuples ? 2. There is only limited use in the most accurate pg_statistics if reltuples and relpages is completely off. In the current behavior you eg get: rel1: pages = 100000 -- updated from "create index" index1 pages = 2 -- outdated index2 pages = 2000 -- current rel2: pages = 1 -- outdated --> Optimizer will prefer join order: rel2, rel1 > I believe it's reasonable to update those stats > more often than the pg_statistic stats (in fact, if we could keep them > constantly up-to-date at a reasonable cost, we'd do so). There is a whole lot of difference between keeping them constantly up to date and modifying (part of) them in the "create index" command, so I do not counter your above sentence, but imho the conclusion is wrong. > The > pg_statistic stats are designed as much as possible to be independent > of the absolute number of rows in the table, so that it's okay if they > are out of sync with the pg_class stats. Independently, they can only be good for choosing whether to use an index or seq scan. They are not sufficient to choose a good join order. > The major reason why "you vacuumed but you never analyzed" is such a > killer is that in the absence of any pg_statistic data, the default > selectivity estimates are such that you may get either an index or seq > scan depending on how big the table is. The cost estimates are > nonlinear (correctly so, IMHO, though I wouldn't necessarily > defend the > exact shape of the curve) and ye olde default 0.01 will give you an > indexscan for a small table but not for a big one. In 7.2 I have > reduced the default selectivity estimate to 0.005, for a number of > reasons but mostly to get it out of the range where the decision will > flip-flop. Yes, the new selectivity is better, imho even still too high. Imho the strategy should be to assume a good selectivity of values in absence of pg_statistics evidence. If the index was not selective enough for an average query, the dba should not have created the index in the first place. > test71=# create table foo (f1 int); > test71=# create index fooi on foo(f1); > test71=# explain select * from foo where f1 = 42; > Index Scan using fooi on foo (cost=0.00..8.14 rows=10 width=4) > test71=# update pg_class set reltuples = 100000, relpages = > 1000 where relname = 'foo'; > Index Scan using fooi on foo (cost=0.00..1399.04 rows=1000 width=4) > test71=# update pg_class set reltuples = 1000000, relpages = > 10000 where relname = 'foo'; > Seq Scan on foo (cost=0.00..22500.00 rows=10000 width=4) > In current sources you keep getting an indexscan as you increase the > number of tuples... As you can see it toppeled at 10 Mio rows :-( Andreas
В списке pgsql-hackers по дате отправления:
Предыдущее
От: "Nicolas Verger"Дата:
Сообщение: Re: Index not used ! Why? + Little graphical client ...