Обсуждение: index bloat on partial index 8.4

Поиск
Список
Период
Сортировка

index bloat on partial index 8.4

От
Kevin Ricords
Дата:
Per documentation
(http://www.postgresql.org/docs/8.4/static/routine-reindex.html):
In PostgreSQL 7.4 and later, index pages that have become completely
empty are reclaimed for re-use. There is still a possibility for
inefficient use of space: if all but a few index keys on a page have
been deleted, the page remains allocated. So a usage pattern in which
all but a few keys in each range are eventually deleted will see poor
use of space. For such usage patterns, periodic reindexing is recommended.

My implementation is a 150 million row table with a partial index on
newly created rows, where every row will be updated to not match the
index condition every few minutes.
The index size appears to grow proportional to the number of rows added
to the table, but doesn't shrink when rows are updated to no longer meet
the partial index condition.

 >select relname, indexrelname, pg_relation_size(indexrelid) from
pg_stat_user_indexes where relname = 't';
"t";"t_partial_idx";58064896

 >select count(1) from t where t_summarized=false;
34

Even if the 34 rows were each in different pages with a deep index
structure, the index size is not justified, if I understand the
documentation correctly.  I'd guess that 'completely empty' refers to
whether the row exists, rather than whether the row belongs in the index.

Any ideas on this pattern of index bloat other than reindex/rebuilding
periodically?  Has this been addressed in a newer release?

-Kevin

Re: index bloat on partial index 8.4

От
Tom Lane
Дата:
Kevin Ricords <kevin@silverback.com> writes:
> My implementation is a 150 million row table with a partial index on
> newly created rows, where every row will be updated to not match the
> index condition every few minutes.
> The index size appears to grow proportional to the number of rows added
> to the table, but doesn't shrink when rows are updated to no longer meet
> the partial index condition.

Well, a btree index is basically never going to shrink, short of a
rebuild (REINDEX).  The right administrative goal is to prevent it from
growing.  The key issues you need to deal with are (1) making sure it
gets vacuumed often enough; (2) making sure there are not long-lived
transactions that prevent VACUUM from removing recently-dead tuples.
You've not really provided enough data for anyone to guess whether the
problem is (1) or (2) or both.  What's the vacuuming configuration on
your installation?  Have you checked for applications failing to close
their transactions?

            regards, tom lane

Re: index bloat on partial index 8.4

От
Kevin Ricords
Дата:
Thank you for your response.  I believe I misunderstood "reclaimed for
re-use".  Using contrib/pg_freespace, I see empty index pages can be
reused elsewhere in the same index, but are not deallocated.  To keep my
index at the size I expected, I will have to vacuum more frequently or
reindex/rebuild regularly.

-Kevin

On 10/3/2011 11:53 PM, Tom Lane wrote:
> Kevin Ricords<kevin@silverback.com>  writes:
>> My implementation is a 150 million row table with a partial index on
>> newly created rows, where every row will be updated to not match the
>> index condition every few minutes.
>> The index size appears to grow proportional to the number of rows added
>> to the table, but doesn't shrink when rows are updated to no longer meet
>> the partial index condition.
> Well, a btree index is basically never going to shrink, short of a
> rebuild (REINDEX).  The right administrative goal is to prevent it from
> growing.  The key issues you need to deal with are (1) making sure it
> gets vacuumed often enough; (2) making sure there are not long-lived
> transactions that prevent VACUUM from removing recently-dead tuples.
> You've not really provided enough data for anyone to guess whether the
> problem is (1) or (2) or both.  What's the vacuuming configuration on
> your installation?  Have you checked for applications failing to close
> their transactions?
>
>             regards, tom lane
> .
>

Re: index bloat on partial index 8.4

От
Scott Marlowe
Дата:
On Tue, Oct 4, 2011 at 12:28 PM, Kevin Ricords <kevin@silverback.com> wrote:
> Thank you for your response.  I believe I misunderstood "reclaimed for
> re-use".  Using contrib/pg_freespace, I see empty index pages can be reused
> elsewhere in the same index, but are not deallocated.  To keep my index at
> the size I expected, I will have to vacuum more frequently or
> reindex/rebuild regularly.

You might need to adjust the various vacuum_* settings to be able to
vacuum aggressively enough to keep with your high update rate.  Lower
vacuum_cost_delay and / or autovacuum_vacuum_cost_delay to 1 to 5 ms,
or 0 if you've got an impressive enough IO subsystem.  Increate
vacuum_cost_limit to 2 to 10 times what the default is as well.  then
keep a close eye on how busy your IO subsystem gets during peak load.
If it goes up too high (stays at 100% for extended periods). then turn
delay back up / cost back down.

As Greg Smith often points out, the solution to autovacuum problems is
usually to make autovacuum more aggressive not less so.