Обсуждение: Resolving Index Bloat

Поиск

Список

Период

Сортировка

Resolving Index Bloat

От

Samuel Stearns

Дата:

20 ноября 2012 г., 01:00:13

Howdy,

Enviroment:

Postgres 8.4.14

Linux

We have a problem with index bloat on a couple of our tables even though we have applied more aggressive autovac/analyze settings in the schema:

ALTER TABLE billingitemrating SET (autovacuum_vacuum_scale_factor=0.001, autovacuum_analyze_scale_factor=0.0001);

ALTER TABLE importitem SET (autovacuum_vacuum_scale_factor=0.001, autovacuum_analyze_scale_factor=0.0001);

pg_stat_all_tables confirms the tables are being auto-vac’d.

This query shows the index bloat:

(postgres@[local]:5432) [smile] > SELECT "relation",

pg_size_pretty(size) as orig_size,

pg_size_pretty(pg_relation_size(C.oid)) new_size,

pg_size_pretty(pg_relation_size(C.oid) - size) as pretty_diff,

round((pg_relation_size(C.oid))::numeric / size * 100.0) || '%' as increase

FROM pg_class C

LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)

join tablesizes_20121113_1500 on (relation = nspname || '.' || relname)

left join pg_tablespace t on (c.reltablespace = t.oid)

WHERE nspname NOT IN ('pg_catalog', 'information_schema')

and size <> 0

order by pg_relation_size(C.oid) - size desc limit 20;

relation | orig_size | new_size | pretty_diff | increase

--------------------------------------------------+-----------+----------+-------------+----------

public.billingitemrating_tariff_idx | 56 MB | 210 MB | 154 MB | 375%

public.billingitemrating_itemdescription_idx | 56 MB | 209 MB | 153 MB | 374%

public.billingitemrating_pkey1 | 50 MB | 170 MB | 120 MB | 339%

public.billingitemrating_psi_idx | 50 MB | 145 MB | 95 MB | 289%

public.billingitemrating_bpid_idx | 45 MB | 129 MB | 84 MB | 289%

vendor.optuswholesalegatewaydataitem | 1290 MB | 1329 MB | 39 MB | 103%

public.billingitemrating | 179 MB | 213 MB | 34 MB | 119%

public.billingitem | 274 MB | 295 MB | 21 MB | 108%

public.importitem_pkey | 130 MB | 147 MB | 17 MB | 113%

public.importitem | 372 MB | 387 MB | 15 MB | 104%

public.importitem_status_ignored_idx | 182 MB | 196 MB | 14 MB | 108%

public.importitem_importitemgroup_status_ignored | 182 MB | 196 MB | 14 MB | 108%

public.importitem_subscriptionid_idx | 163 MB | 176 MB | 13 MB | 108%

public.eventbinding | 122 MB | 135 MB | 13 MB | 111%

public.idx_importitem_importitemgroup | 130 MB | 142 MB | 13 MB | 110%

public.idx_importitem_importitemgroup_status | 130 MB | 140 MB | 10 MB | 108%

public.idx_importitem_status | 130 MB | 140 MB | 10 MB | 108%

public.billingitemrating_biid_idx | 35 MB | 45 MB | 10 MB | 129%

public.billingitemrating_ebid_idx | 35 MB | 45 MB | 10 MB | 128%

vendor.optuswholesalegatewaycdrdescriminator | 254 MB | 263 MB | 9576 kB | 104%

(20 rows)

Time: 849.053 ms

(postgres@[local]:5432) [smile] >

Any ideas on how to resolve?

Thank you,

Samuel Stearns

Re: Resolving Index Bloat

От

Greg Williamson

Дата:

20 ноября 2012 г., 02:40:45

Samuel --



>________________________________
> From: Samuel Stearns <SStearns@internode.com.au>
>To: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org>
>Sent: Monday, November 19, 2012 1:59 PM
>Subject: [ADMIN] Resolving Index Bloat
>
>
><...>
>
>Any ideas on how to resolve?
> 


I have custody of one database that is fairly small but subjected to lots of updated, inserts and deletes, and that
databaseshows bloat that we have to knock down hourly or performance tanks. 

This is on postgres 9.1 so I am not sure how much translates to earlier versions.

We ran ""REINDEX" hourly for a while; we currently use a hand rolled script that reindexes each regular index and does
adance to reindex primary keys and constraints. I could email it to you but as I said, not sure how much would work. 

The straight REINDEX did work 99% of the time, with very occasional reports of failures, never repeated. The hand
rolledversion is a bit safer and a little less intrusive. 

HTH,

Greg Williamson

Re: Resolving Index Bloat

От

Tom Lane

Дата:

20 ноября 2012 г., 03:20:20

Samuel Stearns <SStearns@internode.com.au> writes:
> We have a problem with index bloat on a couple of our tables even though we have applied more aggressive
autovac/analyzesettings in the schema: 

Hard to tell much about this without knowing the baseline condition or
what's happened since the baseline.  It looks like your tables have
grown circa 2X (eg billingitemrating), but is that due to new data or
heavy update activity?

If the baseline condition is freshly-built-or-REINDEXed indexes, a fair
amount of "bloat" is to be expected.  The traditional rule of thumb
about btree indexes is that the steady-state load factor is about
two-thirds full.  By default, PG builds indexes tightly packed --- so
just allowing the index to reach steady state will incur 50% "bloat"
on average.  It's usually counterproductive to try to maintain a fill
factor better than that, unless the table receives only minimal
insert/update traffic.  (Indeed, usually the better policy for a
heavy-update table is to create the indexes with 66% fillfactor to begin
with.)

Your indexes on billingitemrating seem to have expanded a bit more than
what would be expected from the combination of these factors, but I'm
not sure they're enormously out of line.  You could delve a bit deeper
by using contrib/pgstattuple to measure the actual dead space in both
the tables and the indexes.  Also, it'd be useful to know the data types
of the columns being indexed.

            regards, tom lane

Re: Resolving Index Bloat

От

Samuel Stearns

Дата:

20 ноября 2012 г., 03:55:40

Thanks, Tom.

The database was dumped/restored on Monday, 12 October and the autovac settings applied right after.  The query showing
thebloat was issued on Monday, 19 October so a time period of 1 week elapsed since baseline.  Data types in the format
<index_name>- <data_type> as follows: 

billingitemrating_tariff_idx - integer
billingitemrating_itemdescription_idx - integer
billingitemrating_pkey1 - bigint
billingitemrating_psi_idx - integer
billingitemrating_bpid_idx - integer
importitem_pkey - integer
importitem_status_ignored_idx - multicolumn (integer, text)
importitem_subscriptionid_idx - text
idx_importitem_importitemgroup - integer
idx_importitem_importitemgroup_status - multicolumn (integer, integer)
idx_importitem_status - integer
billingitemrating_biid_idx - integer
billingitemrating_ebid_idx - integer

I'll have a look at pgstattuple

Sam


-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, 20 November 2012 10:50 AM
To: Samuel Stearns
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Resolving Index Bloat

Samuel Stearns <SStearns@internode.com.au> writes:
> We have a problem with index bloat on a couple of our tables even though we have applied more aggressive
autovac/analyzesettings in the schema: 

Hard to tell much about this without knowing the baseline condition or what's happened since the baseline.  It looks
likeyour tables have grown circa 2X (eg billingitemrating), but is that due to new data or heavy update activity? 

If the baseline condition is freshly-built-or-REINDEXed indexes, a fair amount of "bloat" is to be expected.  The
traditionalrule of thumb about btree indexes is that the steady-state load factor is about two-thirds full.  By
default,PG builds indexes tightly packed --- so just allowing the index to reach steady state will incur 50% "bloat" 
on average.  It's usually counterproductive to try to maintain a fill factor better than that, unless the table
receivesonly minimal insert/update traffic.  (Indeed, usually the better policy for a heavy-update table is to create
theindexes with 66% fillfactor to begin 
with.)

Your indexes on billingitemrating seem to have expanded a bit more than what would be expected from the combination of
thesefactors, but I'm not sure they're enormously out of line.  You could delve a bit deeper by using
contrib/pgstattupleto measure the actual dead space in both the tables and the indexes.  Also, it'd be useful to know
thedata types of the columns being indexed. 

            regards, tom lane

Re: Resolving Index Bloat

От

Samuel Stearns

Дата:

20 ноября 2012 г., 04:05:27

Er, that should read 'November', not 'October'.

-----Original Message-----
From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Samuel Stearns
Sent: Tuesday, 20 November 2012 11:26 AM
To: Tom Lane
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Resolving Index Bloat

Thanks, Tom.

The database was dumped/restored on Monday, 12 October and the autovac settings applied right after.  The query showing
thebloat was issued on Monday, 19 October so a time period of 1 week elapsed since baseline.  Data types in the format
<index_name>- <data_type> as follows: 

billingitemrating_tariff_idx - integer
billingitemrating_itemdescription_idx - integer
billingitemrating_pkey1 - bigint
billingitemrating_psi_idx - integer
billingitemrating_bpid_idx - integer
importitem_pkey - integer
importitem_status_ignored_idx - multicolumn (integer, text)
importitem_subscriptionid_idx - text
idx_importitem_importitemgroup - integer
idx_importitem_importitemgroup_status - multicolumn (integer, integer)
idx_importitem_status - integer
billingitemrating_biid_idx - integer
billingitemrating_ebid_idx - integer

I'll have a look at pgstattuple

Sam

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Tuesday, 20 November 2012 10:50 AM
To: Samuel Stearns
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Resolving Index Bloat

Samuel Stearns <SStearns@internode.com.au> writes:
> We have a problem with index bloat on a couple of our tables even though we have applied more aggressive
autovac/analyzesettings in the schema: 

Hard to tell much about this without knowing the baseline condition or what's happened since the baseline.  It looks
likeyour tables have grown circa 2X (eg billingitemrating), but is that due to new data or heavy update activity? 

If the baseline condition is freshly-built-or-REINDEXed indexes, a fair amount of "bloat" is to be expected.  The
traditionalrule of thumb about btree indexes is that the steady-state load factor is about two-thirds full.  By
default,PG builds indexes tightly packed --- so just allowing the index to reach steady state will incur 50% "bloat" 
on average.  It's usually counterproductive to try to maintain a fill factor better than that, unless the table
receivesonly minimal insert/update traffic.  (Indeed, usually the better policy for a heavy-update table is to create
theindexes with 66% fillfactor to begin 
with.)

Your indexes on billingitemrating seem to have expanded a bit more than what would be expected from the combination of
thesefactors, but I'm not sure they're enormously out of line.  You could delve a bit deeper by using
contrib/pgstattupleto measure the actual dead space in both the tables and the indexes.  Also, it'd be useful to know
thedata types of the columns being indexed. 

            regards, tom lane

--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Re: Resolving Index Bloat

От

Samuel Stearns

Дата:

21 ноября 2012 г., 01:03:16

Thanks, Greg.

I may look at running a scheduled REINDEX.

Sam


-----Original Message-----
From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Greg Williamson
Sent: Tuesday, 20 November 2012 10:11 AM
To: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] Resolving Index Bloat

Samuel --



>________________________________
> From: Samuel Stearns <SStearns@internode.com.au>
>To: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org>
>Sent: Monday, November 19, 2012 1:59 PM
>Subject: [ADMIN] Resolving Index Bloat
>
>
><...>
>
>Any ideas on how to resolve?
> 


I have custody of one database that is fairly small but subjected to lots of updated, inserts and deletes, and that
databaseshows bloat that we have to knock down hourly or performance tanks. 

This is on postgres 9.1 so I am not sure how much translates to earlier versions.

We ran ""REINDEX" hourly for a while; we currently use a hand rolled script that reindexes each regular index and does
adance to reindex primary keys and constraints. I could email it to you but as I said, not sure how much would work. 

The straight REINDEX did work 99% of the time, with very occasional reports of failures, never repeated. The hand
rolledversion is a bit safer and a little less intrusive. 

HTH,

Greg Williamson


--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Resolving Index Bloat

Resolving Index Bloat

Re: Resolving Index Bloat

Re: Resolving Index Bloat

Re: Resolving Index Bloat

Re: Resolving Index Bloat

Re: Resolving Index Bloat