Re: estimating the need for VACUUM FULL and REINDEX

Поиск
Список
Период
Сортировка
От Guillaume Cottenceau
Тема Re: estimating the need for VACUUM FULL and REINDEX
Дата
Msg-id 87ejlql5ij.fsf@meuh.mnc.lan
обсуждение исходный текст
Ответ на Re: estimating the need for VACUUM FULL and REINDEX  (Heikki Linnakangas <heikki@enterprisedb.com>)
Список pgsql-performance
Heikki Linnakangas <heikki 'at' enterprisedb.com> writes:

> Guillaume Cottenceau wrote:
> > According to documentation[1], VACUUM FULL's only benefit is
> > returning unused disk space to the operating system; am I correct
> > in assuming there's also the benefit of optimizing the
> > performance of scans, because rows are physically compacted on
> > the disk?
>
> That's right.

Ok. Then I think the documentation should probably be updated? It
seems to totally miss this benefit.

We've been hit by degrading performance, probably because of too
seldom VACUUM ANALYZE, and in this situation it seems that the
two solutions are either VACUUM FULL or dumping and recreating
the database. Maybe this situation should be described in the
documentation. In this list, everyone always say "you should
VACUUM ANALYZE frequently" but little is done to consider the
case when we have to deal with an existing database on which this
hasn't been done properly.

> > With that in mind, I've tried to estimate how much benefit would
> > be brought by running VACUUM FULL, with the output of VACUUM
> > VERBOSE. However, it seems that for example the "removable rows"
> > reported by each VACUUM VERBOSE run is actually reused by VACUUM,
> > so is not what I'm looking for.
>
> Take a look at contrib/pgstattuple. If a table has high percentage of
> free space, VACUUM FULL will compact that out.

Thanks a lot. I've followed this path and I think it should be
said that free_space must also be large compared to 8K -
free_percent can be large for tables with very few tuples even on
already compacted tables.

> > Then according to documentation[2], REINDEX has some benefit when
> > all but a few index keys on a page have been deleted, because the
> > page remains allocated (thus, I assume it improves index scan
> > performance, am I correct?). However, again I'm unable to
> > estimate the expected benefit. With a slightly modified version
> > of a query found in documentation[3] to see the pages used by a
> > relation[4], I'm able to see that the index data from a given
> > table...
>
> See pgstatindex, in the same contrib-module. The number you're looking
> for is avg_leaf_density. REINDEX will bring that to 90% (with default
> fill factor), so if it's much lower than that REINDEX will help.

Woops, seems that this was not availabe in pgstattuple of pg 7.4 :/

--
Guillaume Cottenceau, MNC Mobile News Channel SA, an Alcatel-Lucent Company
Av. de la Gare 10, 1003 Lausanne, Switzerland - direct +41 21 317 50 36

В списке pgsql-performance по дате отправления:

Предыдущее
От: david@lang.hm
Дата:
Сообщение: Re: Best OS for Postgres 8.2
Следующее
От: "Steinar H. Gunderson"
Дата:
Сообщение: Re: Best OS for Postgres 8.2