Re: Bloated tables and why is vacuum full the only option

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Bloated tables and why is vacuum full the only option
Дата
Msg-id 19897.1391974800@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Bloated tables and why is vacuum full the only option  (Claudio Freire <klaussfreire@gmail.com>)
Ответы Re: Bloated tables and why is vacuum full the only option
Список pgsql-performance
Claudio Freire <klaussfreire@gmail.com> writes:
>>> I also do routine reindexing to stop index bloat on its tracks, yet
>>> freshly-reindexed indexes get considerably reduced in size with vacuum
>>> full.

AFAIK there's no reason for vacuum full to produce a different result
from reindex.  Did you mean to say that the indexes get smaller than
what they had been after some normal operation?  If so it's worth noting
this comment from the btree index building code (nbtsort.c):

 * It is not wise to pack the pages entirely full, since then *any*
 * insertion would cause a split (and not only of the leaf page; the need
 * for a split would cascade right up the tree).  The steady-state load
 * factor for btrees is usually estimated at 70%.  We choose to pack leaf
 * pages to the user-controllable fill factor (default 90%) while upper pages
 * are always packed to 70%.  This gives us reasonable density (there aren't
 * many upper pages if the keys are reasonable-size) without risking a lot of
 * cascading splits during early insertions.

As the comment notes, the initial state of a freshly-built index is packed
more densely than what you can expect after a lot of insertions/updates
have occurred.  That's not a bug, it's just a fact of life.

Also, there are certain usage patterns that can result in btree indexes
having densities much lower than the conventional-wisdom 70%.  The main
one I've seen in practice is "decimation", where you delete say 99 out
of every 100 entries in index order.  This leaves just a few live entries
in each leaf page --- but our btree code doesn't reclaim an index page
for recycling until it's totally empty.  So you can end up with a very
low load factor after doing something like that, and a reindex is the
only good way to fix it.

            regards, tom lane


В списке pgsql-performance по дате отправления:

Предыдущее
От: Claudio Freire
Дата:
Сообщение: Re: Bloated tables and why is vacuum full the only option
Следующее
От: Claudio Freire
Дата:
Сообщение: Re: Bloated tables and why is vacuum full the only option