Performance degradation, index bloat and planner estimates

From: Daniele Varrazzo
Date:
Hello,

we are experiencing some performance degradation on a database whose
main table is approaching 100M records. Along with the slowness of the
queries I notice these symptoms:

- size bloat of partial indexes
- very bad planning estimates

I'd appreciate any hints to get a better picture of what is going on
and to understand to what extent the symptoms are correlated.

The most noticeable problems are with queries such as:

   select * from foos where <condition>

where there is a very selective condition (about 10K records out of
100M) and a partial index on them. The index is correctly chosen for
the scan, but with an extremely wrong estimate and painful
performance, e.g.:

# explain select count(*), sum(x) from foos
    where rcon IS NULL AND is_settled = true;
                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Aggregate  (cost=4774842.01..4774842.02 rows=1 width=8)
   ->  Bitmap Heap Scan on foos  (cost=218211.50..4674496.17 rows=20069167 width=8)
         Recheck Cond: ((rcon IS NULL) AND is_settled)
         ->  Bitmap Index Scan on i_rcon3  (cost=0.00..213194.21 rows=20069167 width=0)

(I no longer have the EXPLAIN ANALYZE output for this, but the query
returned about 7K rows at the time.) This query used to run in
sub-second time: recently it started taking several minutes, or around
10 seconds if repeated shortly after a previous run.
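
For clarity, i_rcon3 is a partial index of roughly this shape (the
column name is made up here; the predicate is the real one, as shown
in the Recheck Cond above):

    create index i_rcon3 on foos (con_id)
        where rcon IS NULL AND is_settled = true;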

pg_stat_all_indexes showed a size of >400MB for this index: way too
much to index <10K records.
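
The figures above can be double-checked with something like:

    select pg_size_pretty(pg_relation_size('i_rcon3'));
    select count(*) from foos where rcon IS NULL AND is_settled = true;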

Trying to solve this bloat problem I've tried:

1. Manually running VACUUM on the table (autovacuum had not touched it
for a while; it seems to skip it, probably because other tables are
updated more often). The verbose output concerning the above index
was:

...
INFO:  scanned index "i_rcon3" to remove 22369332 row versions
DETAIL:  CPU 0.84s/5.20u sec elapsed 50.18 sec.
...
INFO:  "foos": removed 22369332 row versions in 1009710 pages
DETAIL:  CPU 34.38s/27.01u sec elapsed 2226.51 sec.
...
INFO:  scanned index "i_rcon3" to remove 15330597 row versions
DETAIL:  CPU 0.48s/2.14u sec elapsed 15.42 sec.
...
INFO:  "foos": removed 15330597 row versions in 569208 pages
DETAIL:  CPU 9.40s/8.42u sec elapsed 732.17 sec.
...
INFO:  index "i_rcon3" now contains 43206 row versions in 53495 pages
DETAIL:  9494602 index row versions were removed.
53060 index pages have been deleted, 20032 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
...
WARNING:  relation "public.foos" contains more than "max_fsm_pages"
pages with useful free space
HINT:  Consider using VACUUM FULL on this relation or increasing the
configuration parameter "max_fsm_pages".

Although the output looked promising, the planner estimate and index
size didn't change much (I assumed the performance hadn't changed
either, so I didn't run an EXPLAIN ANALYZE).

2. I rebuilt an index with exactly the same properties using CREATE
INDEX CONCURRENTLY: this produced an index with a much more reasonable
size (now, after a busy weekend of activity, it is about 12MB) and
solved the performance problem. It didn't fix the bad estimate anyway;
the commands for steps 2-4 are sketched after this list.

3. I increased the statistics target from the default 10 to 100 and
analyzed, expecting to see some change in the estimated number of
rows: apart from a small fluctuation, the estimate remained around
20M.

4. The index was not indexing a distinct field but rather an fkey with
no more than 4K distinct values and an extremely uneven distribution.
I created an index with the same condition but on the pkey; the
estimate didn't change: stable around 20M records even after
increasing the stats target to 100 for the pkey field too.
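
For reference, the commands for steps 2-4 looked more or less like
this ("con_id" here stands for the fkey and "id" for the pkey; the
real names don't matter):

    -- step 2: rebuild without locking, then swap the indexes
    create index concurrently i_rcon3_new on foos (con_id)
        where rcon IS NULL AND is_settled = true;
    drop index i_rcon3;  -- takes a short exclusive lock
    alter index i_rcon3_new rename to i_rcon3;

    -- step 3: raise the stats target and re-analyze
    alter table foos alter column con_id set statistics 100;
    analyze foos;

    -- step 4: same predicate, but indexing the pkey
    create index concurrently i_rcon3_pk on foos (id)
        where rcon IS NULL AND is_settled = true;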

Does anybody have some information about where the bloat is coming
from and what is the best way to get rid of it? Would a vacuum full
fix this kind of problem? Is there a way to fix it without taking the
system offline?

The indexed condition is a state in the life cycle of the records in
the table: many records assume that state for some time, then move to
a different state which is no longer indexed. Is the continuous
addition/deletion of records to the index causing the bloat (which
could then be considered limited to the indexes with a similar usage
pattern)? Is reindex/concurrent rebuild the best answer?

Any idea of where the 20M record estimate is coming from? Isn't the
size of the partial index taken into account in the estimate?
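
In case it's useful, these are the stats I've been staring at; as far
as I understand, the planner combines the fractions of the single
columns and doesn't look at the partial index at all for the row
estimate:

    select attname, null_frac, n_distinct, most_common_freqs
    from pg_stats
    where tablename = 'foos' and attname in ('rcon', 'is_settled');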

We are running PG 8.3 and planning a migration to new hardware and, at
the same time, to a newer PG version in the near future. Are the
problematic behaviours we're seeing known to be fixed in later
releases?

Thank you very much. Regards.

-- Daniele

Re: Performance degradation, index bloat and planner estimates

From: Craig Ringer
Date:
On 20/09/2010 7:59 PM, Daniele Varrazzo wrote:

> Does anybody have some information about where the bloat is coming
> from and what is the best way to get rid of it? Would a vacuum full
> fix this kind of problem? Is there a way to fix it without taking the
> system offline?

It's hard to know where the index bloat comes from. The usual cause I
see reported here is with regular VACUUM FULL use, which doesn't seem to
be a factor in your case.

A VACUUM FULL will not address index bloat; it's more likely to add to
it. You'd want to use CLUSTER instead, but that'll still require an
exclusive lock that takes the table offline for some time. Your current
solution - a concurrent reindex - is your best bet for a workaround
until you find out what's causing the bloat.
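
If you do try CLUSTER, remember that you can't cluster on a partial
index, so it'd have to be done using a full one, e.g. (assuming a pkey
index named foos_pkey):

    -- 8.3 syntax; rewrites the table and all its indexes while
    -- holding an exclusive lock
    CLUSTER foos USING foos_pkey;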

If the bloat issue were with relations rather than indexes I'd suspect
free space map issues as you're on 8.3.

http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html

My (poor) understanding is that index-only bloat probably won't be an
FSM issue.
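
That said, since your VACUUM VERBOSE output ended with the
max_fsm_pages warning, it's worth at least checking the current
setting (raising it means editing postgresql.conf and restarting the
server):

    SHOW max_fsm_pages;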

> The indexed condition is a state in the life cycle of the records in
> the table: many records assume that state for some time, then move to
> a different state which is no longer indexed. Is the continuous
> addition/deletion of records to the index causing the bloat (which
> could then be considered limited to the indexes with a similar usage
> pattern)?

Personally I don't know enough to answer that. I would've expected that
proper VACUUMing would address any resulting index bloat, but evidently
it hasn't been enough here.

> Any idea of where the 20M record estimate is coming from? Isn't the
> size of the partial index taken into account in the estimate?

It'd really help to have EXPLAIN ANALYZE output here.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: Performance degradation, index bloat and planner estimates

From: Tom Lane
Date:
Craig Ringer <craig@postnewspapers.com.au> writes:
> If the bloat issue were with relations rather than indexes I'd suspect
> free space map issues as you're on 8.3.

> http://www.postgresql.org/docs/8.3/interactive/runtime-config-resource.html

> My (poor) understanding is that index-only bloat probably won't be an
> FSM issue.

Lack of FSM space can hurt indexes too, although I agree that if *only*
indexes are bloating then it's probably not FSM to blame.

>> The indexed condition is a state in the life cycle of the records in
>> the table: many records assume that state for some time, then move to
>> a different state which is no longer indexed. Is the continuous
>> addition/deletion of records to the index causing the bloat (which
>> could then be considered limited to the indexes with a similar usage
>> pattern)?

> Personally I don't know enough to answer that. I would've expected that
> proper VACUUMing would address any resulting index bloat, but evidently
> it hasn't been enough here.

Maybe the index fencepost problem?  If your usage pattern involves
creating many records and then deleting most of them, but leaving behind
a few records that are regularly spaced in the index ordering, then you
can end up with a situation where many index pages have only a few
entries.  An example case is creating daily records indexed by date, and
then deleting all but the last-day-of-the-month entries later.  You end
up with index pages only about 1/30th full.  The index cannot be shrunk
because no page is completely empty, but it contains much unused space
--- which can never get re-used either, if you never insert any new keys
in those key ranges.
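
You can see the effect with a contrived example (throwaway table):

    -- fill an index, then delete all but every 30th key
    create table events (id integer primary key);
    insert into events select n from generate_series(1, 1000000) n;
    delete from events where id % 30 <> 0;
    vacuum events;
    -- no leaf page becomes completely empty, so the index stays near
    -- its original size even though only ~33K rows survive:
    select pg_size_pretty(pg_relation_size('events_pkey'));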

If this is the problem then reindexing is the only fix.

            regards, tom lane