Re: Index of a table is not used (in any case)

Поиск
Список
Период
Сортировка
От Zeugswetter Andreas SB SD
Тема Re: Index of a table is not used (in any case)
Дата
Msg-id 46C15C39FEB2C44BA555E356FBCD6FA41EB3D9@m0114.s-mxs.net
обсуждение исходный текст
Ответ на Index of a table is not used (in any case)  (Reiner Dassing <dassing@wettzell.ifag.de>)
Список pgsql-hackers
Tom Lane writes:
> "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> > Imho one of the biggest sources for problems is people creating new
> > indexes on populated tables when the rest of the db/table has badly
> > outdated statistics or even only default statistics in place.
> > In this situation the optimizer is badly misguided, because it now
> > sees completely inconsistent statistics to work on.
> > (e.g. old indexes on that table may seem way too cheap compared 
> > to table scan) 
> 
> I don't think any of this is correct.  We don't have per-index
> statistics.  The only stats updated by CREATE INDEX are the same ones
> updated by plain VACUUM, viz the number-of-tuples and number-of-pages
> counts in pg_class.

1. Have I said anything about other stats, than relpages and reltuples ?

2. There is only limited use in the most accurate pg_statistics if
reltuples
and relpages is completely off. In the current behavior you eg get:

rel1: pages = 100000    -- updated from "create index"
index1 pages = 2        -- outdated
index2 pages = 2000    -- current

rel2: pages = 1        -- outdated

--> Optimizer will prefer join order: rel2, rel1

> I believe it's reasonable to update those stats
> more often than the pg_statistic stats (in fact, if we could keep them
> constantly up-to-date at a reasonable cost, we'd do so).

There is a whole lot of difference between keeping them constantly up to

date and modifying (part of) them in the "create index" command, so I do

not counter your above sentence, but imho the conclusion is wrong.

> The
> pg_statistic stats are designed as much as possible to be independent
> of the absolute number of rows in the table, so that it's okay if they
> are out of sync with the pg_class stats.

Independently, they can only be good for choosing whether to use an 
index or seq scan. They are not sufficient to choose a good join order.

> The major reason why "you vacuumed but you never analyzed" is such a
> killer is that in the absence of any pg_statistic data, the default
> selectivity estimates are such that you may get either an index or seq
> scan depending on how big the table is.  The cost estimates are
> nonlinear (correctly so, IMHO, though I wouldn't necessarily 
> defend the
> exact shape of the curve) and ye olde default 0.01 will give you an
> indexscan for a small table but not for a big one.  In 7.2 I have
> reduced the default selectivity estimate to 0.005, for a number of
> reasons but mostly to get it out of the range where the decision will
> flip-flop.

Yes, the new selectivity is better, imho even still too high.
Imho the strategy should be to assume a good selectivity
of values in absence of pg_statistics evidence.
If the index was not selective enough for an average query, the
dba should not have created the index in the first place.

> test71=# create table foo (f1 int);
> test71=# create index fooi on foo(f1);
> test71=# explain select * from foo where f1 = 42;

> Index Scan using fooi on foo  (cost=0.00..8.14 rows=10 width=4)

> test71=# update pg_class set reltuples = 100000, relpages = 
> 1000 where relname = 'foo';
> Index Scan using fooi on foo  (cost=0.00..1399.04 rows=1000 width=4)

> test71=# update pg_class set reltuples = 1000000, relpages = 
> 10000 where relname = 'foo';

> Seq Scan on foo  (cost=0.00..22500.00 rows=10000 width=4)

> In current sources you keep getting an indexscan as you increase the
> number of tuples...

As you can see it toppeled at 10 Mio rows :-(

Andreas


В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Nicolas Verger"
Дата:
Сообщение: Re: Index not used ! Why? + Little graphical client ...
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Can't cast bigint to smallint?