Обсуждение: Resolving Index Bloat
Howdy,
Enviroment:
Postgres 8.4.14
Linux
We have a problem with index bloat on a couple of our tables even though we have applied more aggressive autovac/analyze settings in the schema:
ALTER TABLE billingitemrating SET (autovacuum_vacuum_scale_factor=0.001, autovacuum_analyze_scale_factor=0.0001);
ALTER TABLE importitem SET (autovacuum_vacuum_scale_factor=0.001, autovacuum_analyze_scale_factor=0.0001);
pg_stat_all_tables confirms the tables are being auto-vac’d.
This query shows the index bloat:
(postgres@[local]:5432) [smile] > SELECT "relation",
pg_size_pretty(size) as orig_size,
pg_size_pretty(pg_relation_size(C.oid)) new_size,
pg_size_pretty(pg_relation_size(C.oid) - size) as pretty_diff,
round((pg_relation_size(C.oid))::numeric / size * 100.0) || '%' as increase
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
join tablesizes_20121113_1500 on (relation = nspname || '.' || relname)
left join pg_tablespace t on (c.reltablespace = t.oid)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
and size <> 0
order by pg_relation_size(C.oid) - size desc limit 20;
relation | orig_size | new_size | pretty_diff | increase
--------------------------------------------------+-----------+----------+-------------+----------
public.billingitemrating_tariff_idx | 56 MB | 210 MB | 154 MB | 375%
public.billingitemrating_itemdescription_idx | 56 MB | 209 MB | 153 MB | 374%
public.billingitemrating_pkey1 | 50 MB | 170 MB | 120 MB | 339%
public.billingitemrating_psi_idx | 50 MB | 145 MB | 95 MB | 289%
public.billingitemrating_bpid_idx | 45 MB | 129 MB | 84 MB | 289%
vendor.optuswholesalegatewaydataitem | 1290 MB | 1329 MB | 39 MB | 103%
public.billingitemrating | 179 MB | 213 MB | 34 MB | 119%
public.billingitem | 274 MB | 295 MB | 21 MB | 108%
public.importitem_pkey | 130 MB | 147 MB | 17 MB | 113%
public.importitem | 372 MB | 387 MB | 15 MB | 104%
public.importitem_status_ignored_idx | 182 MB | 196 MB | 14 MB | 108%
public.importitem_importitemgroup_status_ignored | 182 MB | 196 MB | 14 MB | 108%
public.importitem_subscriptionid_idx | 163 MB | 176 MB | 13 MB | 108%
public.eventbinding | 122 MB | 135 MB | 13 MB | 111%
public.idx_importitem_importitemgroup | 130 MB | 142 MB | 13 MB | 110%
public.idx_importitem_importitemgroup_status | 130 MB | 140 MB | 10 MB | 108%
public.idx_importitem_status | 130 MB | 140 MB | 10 MB | 108%
public.billingitemrating_biid_idx | 35 MB | 45 MB | 10 MB | 129%
public.billingitemrating_ebid_idx | 35 MB | 45 MB | 10 MB | 128%
vendor.optuswholesalegatewaycdrdescriminator | 254 MB | 263 MB | 9576 kB | 104%
(20 rows)
Time: 849.053 ms
(postgres@[local]:5432) [smile] >
Any ideas on how to resolve?
Thank you,
Samuel Stearns
Samuel -- >________________________________ > From: Samuel Stearns <SStearns@internode.com.au> >To: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> >Sent: Monday, November 19, 2012 1:59 PM >Subject: [ADMIN] Resolving Index Bloat > > ><...> > >Any ideas on how to resolve? > I have custody of one database that is fairly small but subjected to lots of updated, inserts and deletes, and that databaseshows bloat that we have to knock down hourly or performance tanks. This is on postgres 9.1 so I am not sure how much translates to earlier versions. We ran ""REINDEX" hourly for a while; we currently use a hand rolled script that reindexes each regular index and does adance to reindex primary keys and constraints. I could email it to you but as I said, not sure how much would work. The straight REINDEX did work 99% of the time, with very occasional reports of failures, never repeated. The hand rolledversion is a bit safer and a little less intrusive. HTH, Greg Williamson
Samuel Stearns <SStearns@internode.com.au> writes: > We have a problem with index bloat on a couple of our tables even though we have applied more aggressive autovac/analyzesettings in the schema: Hard to tell much about this without knowing the baseline condition or what's happened since the baseline. It looks like your tables have grown circa 2X (eg billingitemrating), but is that due to new data or heavy update activity? If the baseline condition is freshly-built-or-REINDEXed indexes, a fair amount of "bloat" is to be expected. The traditional rule of thumb about btree indexes is that the steady-state load factor is about two-thirds full. By default, PG builds indexes tightly packed --- so just allowing the index to reach steady state will incur 50% "bloat" on average. It's usually counterproductive to try to maintain a fill factor better than that, unless the table receives only minimal insert/update traffic. (Indeed, usually the better policy for a heavy-update table is to create the indexes with 66% fillfactor to begin with.) Your indexes on billingitemrating seem to have expanded a bit more than what would be expected from the combination of these factors, but I'm not sure they're enormously out of line. You could delve a bit deeper by using contrib/pgstattuple to measure the actual dead space in both the tables and the indexes. Also, it'd be useful to know the data types of the columns being indexed. regards, tom lane
Thanks, Tom. The database was dumped/restored on Monday, 12 October and the autovac settings applied right after. The query showing thebloat was issued on Monday, 19 October so a time period of 1 week elapsed since baseline. Data types in the format <index_name>- <data_type> as follows: billingitemrating_tariff_idx - integer billingitemrating_itemdescription_idx - integer billingitemrating_pkey1 - bigint billingitemrating_psi_idx - integer billingitemrating_bpid_idx - integer importitem_pkey - integer importitem_status_ignored_idx - multicolumn (integer, text) importitem_subscriptionid_idx - text idx_importitem_importitemgroup - integer idx_importitem_importitemgroup_status - multicolumn (integer, integer) idx_importitem_status - integer billingitemrating_biid_idx - integer billingitemrating_ebid_idx - integer I'll have a look at pgstattuple Sam -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Tuesday, 20 November 2012 10:50 AM To: Samuel Stearns Cc: pgsql-admin@postgresql.org Subject: Re: [ADMIN] Resolving Index Bloat Samuel Stearns <SStearns@internode.com.au> writes: > We have a problem with index bloat on a couple of our tables even though we have applied more aggressive autovac/analyzesettings in the schema: Hard to tell much about this without knowing the baseline condition or what's happened since the baseline. It looks likeyour tables have grown circa 2X (eg billingitemrating), but is that due to new data or heavy update activity? If the baseline condition is freshly-built-or-REINDEXed indexes, a fair amount of "bloat" is to be expected. The traditionalrule of thumb about btree indexes is that the steady-state load factor is about two-thirds full. By default,PG builds indexes tightly packed --- so just allowing the index to reach steady state will incur 50% "bloat" on average. It's usually counterproductive to try to maintain a fill factor better than that, unless the table receivesonly minimal insert/update traffic. (Indeed, usually the better policy for a heavy-update table is to create theindexes with 66% fillfactor to begin with.) Your indexes on billingitemrating seem to have expanded a bit more than what would be expected from the combination of thesefactors, but I'm not sure they're enormously out of line. You could delve a bit deeper by using contrib/pgstattupleto measure the actual dead space in both the tables and the indexes. Also, it'd be useful to know thedata types of the columns being indexed. regards, tom lane
Er, that should read 'November', not 'October'. -----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Samuel Stearns Sent: Tuesday, 20 November 2012 11:26 AM To: Tom Lane Cc: pgsql-admin@postgresql.org Subject: Re: [ADMIN] Resolving Index Bloat Thanks, Tom. The database was dumped/restored on Monday, 12 October and the autovac settings applied right after. The query showing thebloat was issued on Monday, 19 October so a time period of 1 week elapsed since baseline. Data types in the format <index_name>- <data_type> as follows: billingitemrating_tariff_idx - integer billingitemrating_itemdescription_idx - integer billingitemrating_pkey1 - bigint billingitemrating_psi_idx - integer billingitemrating_bpid_idx - integer importitem_pkey - integer importitem_status_ignored_idx - multicolumn (integer, text) importitem_subscriptionid_idx - text idx_importitem_importitemgroup - integer idx_importitem_importitemgroup_status - multicolumn (integer, integer) idx_importitem_status - integer billingitemrating_biid_idx - integer billingitemrating_ebid_idx - integer I'll have a look at pgstattuple Sam -----Original Message----- From: Tom Lane [mailto:tgl@sss.pgh.pa.us] Sent: Tuesday, 20 November 2012 10:50 AM To: Samuel Stearns Cc: pgsql-admin@postgresql.org Subject: Re: [ADMIN] Resolving Index Bloat Samuel Stearns <SStearns@internode.com.au> writes: > We have a problem with index bloat on a couple of our tables even though we have applied more aggressive autovac/analyzesettings in the schema: Hard to tell much about this without knowing the baseline condition or what's happened since the baseline. It looks likeyour tables have grown circa 2X (eg billingitemrating), but is that due to new data or heavy update activity? If the baseline condition is freshly-built-or-REINDEXed indexes, a fair amount of "bloat" is to be expected. The traditionalrule of thumb about btree indexes is that the steady-state load factor is about two-thirds full. By default,PG builds indexes tightly packed --- so just allowing the index to reach steady state will incur 50% "bloat" on average. It's usually counterproductive to try to maintain a fill factor better than that, unless the table receivesonly minimal insert/update traffic. (Indeed, usually the better policy for a heavy-update table is to create theindexes with 66% fillfactor to begin with.) Your indexes on billingitemrating seem to have expanded a bit more than what would be expected from the combination of thesefactors, but I'm not sure they're enormously out of line. You could delve a bit deeper by using contrib/pgstattupleto measure the actual dead space in both the tables and the indexes. Also, it'd be useful to know thedata types of the columns being indexed. regards, tom lane -- Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin
Thanks, Greg. I may look at running a scheduled REINDEX. Sam -----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Greg Williamson Sent: Tuesday, 20 November 2012 10:11 AM To: pgsql-admin@postgresql.org Subject: Re: [ADMIN] Resolving Index Bloat Samuel -- >________________________________ > From: Samuel Stearns <SStearns@internode.com.au> >To: "pgsql-admin@postgresql.org" <pgsql-admin@postgresql.org> >Sent: Monday, November 19, 2012 1:59 PM >Subject: [ADMIN] Resolving Index Bloat > > ><...> > >Any ideas on how to resolve? > I have custody of one database that is fairly small but subjected to lots of updated, inserts and deletes, and that databaseshows bloat that we have to knock down hourly or performance tanks. This is on postgres 9.1 so I am not sure how much translates to earlier versions. We ran ""REINDEX" hourly for a while; we currently use a hand rolled script that reindexes each regular index and does adance to reindex primary keys and constraints. I could email it to you but as I said, not sure how much would work. The straight REINDEX did work 99% of the time, with very occasional reports of failures, never repeated. The hand rolledversion is a bit safer and a little less intrusive. HTH, Greg Williamson -- Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin