Обсуждение: Disk Utilization Increases And Time for Vacuum Increases.
========================================================= Disk Utilization Increases And Time for Vacuum Increases. ========================================================= As reported in some threads before the Disk Utilization of Postgres Database increases as time passes. I have taken all the steps that were suggested in the mailing list and still see that the disk utilization keeps growing. One action we have consiously not done is "REINDEX" on the table. We want to avoid that as far as possible. I have given my detailed setup and test plan and results. Any suggestions on how we can take care of this disk utilization would be appreciated. ********************************************* Setup: ********************************************* ====== schema ====== create table stressdb ( i integer not null, s1 varchar(1024) not null, s2 varchar(1024) not null, s3 varchar(1024) not null, b lo ); alter table stressdb add constraint pk_stressdb primary key (i); create index idx_stressdb_1 on stressdb (s1); create index idx_stressdb_2 on stressdb (s2); create index idx_stressdb_3 on stressdb (s3); create index idx_stressdb_4 on stressdb (i, s1, s2, s3); create index idx_stressdb_5 on stressdb (s1, s2, s3); ====== ===================== Postgres version: 7.2 ===================== =============== postgresql.conf =============== <--snip--> shared_buffers = 8192 max_fsm_relations = 200 max_fsm_pages = 1000000 sort_mem = 32768 vacuum_mem = 16384 <--snip--> ********************************************* Stress Test: ********************************************* loop { Add 100000 rows Delete 100000 rows Run Vacuum pg_unlink all orphaned items from pg_largeobjects } ********************************************* Results: ********************************************* 1: Time taken to run vacuum increases in each iteration. 2: disk utilization increases with each iteration. # df -k Filesystem 1K-blocks Used Available Use% Mounted on START STATS: /dev/sda6 1968620 110004 1838616 6% /var iteration 1: /dev/sda6 1968620 435436 1513184 23% /var iteration 2: /dev/sda6 1968620 547916 1400704 29% /var iteration 3: /dev/sda6 1968620 725872 1222748 38% /var iteration 4: /dev/sda6 1968620 985488 963132 51% /var 3: relpages for the indices increases with the first few iterations and then stays constant. This is determined by running: "SELECT RELNAME, RELPAGES, RELTUPLES FROM PG_CLASS" NOTE: Also the reltuples always stay at 0 (zero). I dont know how to interpret that. What does it mean. 1st iteration relname | relpages | reltuples -------------------------------+----------+----------- idx_stressdb_4 | 1697 | 0 idx_stressdb_5 | 1317 | 0 idx_stressdb_1 | 603 | 0 idx_stressdb_3 | 593 | 0 idx_stressdb_2 | 588 | 0 pk_stressdb | 306 | 0 2nd iteration relname | relpages | reltuples ---------------------------------+----------+----------- idx_stressdb_5 | 2639 | 0 idx_stressdb_4 | 1697 | 0 idx_stressdb_1 | 1196 | 0 idx_stressdb_3 | 1180 | 0 idx_stressdb_2 | 1178 | 0 pk_stressdb | 306 | 0 ** 5th ** iteration relname | relpages | reltuples ---------------------------------+----------+----------- idx_stressdb_5 | 3945 | 0 idx_stressdb_1 | 1785 | 0 idx_stressdb_2 | 1779 | 0 idx_stressdb_3 | 1760 | 0 idx_stressdb_4 | 1697 | 0 pk_stressdb | 306 | 0 __________________________________ Do you Yahoo!? Yahoo! SiteBuilder - Free, easy-to-use web site design software http://sitebuilder.yahoo.com
On Fri, 22 Aug 2003, Vikram D. Gaitonde wrote: > One action we have consiously not done is "REINDEX" on the > table. We want to avoid that as far as possible. Why do you want to avoid this? It may be the only option you have, other than a database dump and reload. > 3: relpages for the indices increases with the first few > iterations and then stays constant. > > This is determined by running: > "SELECT RELNAME, RELPAGES, RELTUPLES FROM PG_CLASS" This seems odd. I would have thought for sure that it was your indexes growing. Maybe you should try looking at the individual files in the data directory to see which ones are growing, and then use the filename to look up the item in pg_class via the relfilenode column to find out what your problem objects are. > NOTE: Also the reltuples always stay at 0 (zero). > I dont know how to interpret that. What does > it mean. It means you haven't done an ANALYZE after changing the size of the database. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.NetBSD.org Don't you know, in this new Dark Age, we're all light. --XTC
>One action we have consiously not done is "REINDEX" on the >table. We want to avoid that as far as possible. Why? It's usually a very painless step, unless the table has millions and millions of rows. We reindex tables with multiple indexes and several million rows on a weekly basis. The only downside is that the table seems to be quite "busy" during the process, which only takes a few minutes. AFAIK, VACUUM doesn't reclaim space taken up by indexes. In fact, the more deletes you do, the larger the index space gets, and the *slower* the index performs. A periodic REINDEX cleans up a lot of problems.
On Mon, 25 Aug 2003, Jeff Boes wrote: > >One action we have consiously not done is "REINDEX" on the > >table. We want to avoid that as far as possible. > > Why? It's usually a very painless step, unless the table has millions and > millions of rows. We reindex tables with multiple indexes and several million > rows on a weekly basis. The only downside is that the table seems to be quite > "busy" during the process, which only takes a few minutes. > > AFAIK, VACUUM doesn't reclaim space taken up by indexes. In fact, the more > deletes you do, the larger the index space gets, and the *slower* the index > performs. A periodic REINDEX cleans up a lot of problems. Note that in 7.4 the fix for this is in, so if you have a chance to test it out with your indexes and their growth problem please test it to see if it works right. I haven't tested 7.4 beta1 yet very hard, just on my workstation, with relatively low level stuff.
On Fri, Aug 22, 2003 at 12:09:00PM -0700, Vikram D. Gaitonde wrote: > One action we have consiously not done is "REINDEX" on the > table. We want to avoid that as far as possible. I'm afraid that's the problem, though. The code cannot recover freed btree pages which can't be recycled for others to use. There is no choice but to reindex certain indexes. You can use pgstattuple and oid2name to get some idea of what is taking all the space, and thereby be selective. A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
On Tue, 26 Aug 2003, Andrew Sullivan wrote: > I'm afraid that's the problem, though. The code cannot recover freed > btree pages which can't be recycled for others to use. There is no > choice but to reindex certain indexes. While we're at it, anybody got a clue on how to reindex system tables? Our unit testing framework (yes, we unit test all of our database code) has to do a lot of schema loads and drops, and the indexes for the system tables (particularly pg_attribute) tend to grow dramatically, slowing performance. (The total size of the pg_catalog starts out around a few megabytes, and doesn't take too long to grow to several hundred megabytes.) So far, the only way I've found to fix this is to do a complete dump and reload of the database, but that's a pain since we have tables with mutual constraints that can't be reloaded without modifying the dump file. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.NetBSD.org Don't you know, in this new Dark Age, we're all light. --XTC
scott.marlowe wrote: > On Mon, 25 Aug 2003, Jeff Boes wrote: > > > >One action we have consiously not done is "REINDEX" on the > > >table. We want to avoid that as far as possible. > > > > Why? It's usually a very painless step, unless the table has millions and > > millions of rows. We reindex tables with multiple indexes and several million > > rows on a weekly basis. The only downside is that the table seems to be quite > > "busy" during the process, which only takes a few minutes. > > > > AFAIK, VACUUM doesn't reclaim space taken up by indexes. In fact, the more > > deletes you do, the larger the index space gets, and the *slower* the index > > performs. A periodic REINDEX cleans up a lot of problems. > > Note that in 7.4 the fix for this is in, so if you have a chance to test > it out with your indexes and their growth problem please test it to see if > it works right. > > I haven't tested 7.4 beta1 yet very hard, just on my workstation, with > relatively low level stuff. I am not sure we have completely dealt with index growth in 7.4. What we have new in 7.4 is the ability for non-FULL VACUUM to collect info on free index pages and reuse them. However, VACUUM FULL does not shrink the index table unless those pages are the last pages of the file. (Could it shift free pages to the end and then truncate index?) Also, does VACUUM FULL on an index put the empty index pages in the FSM map? It doesn't do that for heap pages because there are none after vacuum, but there might be free index pages that we should record. Interesting I found a reference of doing an auto-reindex as part of VACUUM FULL: #ifdef NOT_USED /* * reindex in VACUUM is dangerous under WAL. ifdef out until it * becomes safe. */ if (reindex) { vac_close_indexes(nindexes, Irel); Irel = (Relation *) NULL; activate_indexes_of_a_table(onerel, false); } #endif /* NOT_USED */ -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Curt Sampson <cjs@cynic.net> writes: > While we're at it, anybody got a clue on how to reindex system tables? The procedure given in the REINDEX reference page doesn't work for you? regards, tom lane
On Wed, 27 Aug 2003, Tom Lane wrote: > Curt Sampson <cjs@cynic.net> writes: > > While we're at it, anybody got a clue on how to reindex system tables? > The procedure given in the REINDEX reference page doesn't work for you? You mean with the standalone backend? All the other developers get pissed at me when I take down the database cluster. :-) Heck, I get pissed at me because some other thing I was running in the background goes and blows up. I'm not actually sure why there would be a problem (aside from performance) in removing and rebuilding an index on a system table, unless there are cached query plans somewhere that would try to use the old or now-nonexistent index. cjs -- Curt Sampson <cjs@cynic.net> +81 90 7737 2974 http://www.NetBSD.org Don't you know, in this new Dark Age, we're all light. --XTC
Curt Sampson <cjs@cynic.net> writes: > I'm not actually sure why there would be a problem (aside from > performance) in removing and rebuilding an index on a system table, All the hard-coded low-level stuff that expects the index to be there blows up. This might possibly be made to work for indexes on inessential catalogs like pg_statistic, but I can't imagine it working for pg_attribute... regards, tom lane
Bruce Momjian wrote: > > > AFAIK, VACUUM doesn't reclaim space taken up by indexes. In fact, the more > > > deletes you do, the larger the index space gets, and the *slower* the index > > > performs. A periodic REINDEX cleans up a lot of problems. > > > > Note that in 7.4 the fix for this is in, so if you have a chance to test > > it out with your indexes and their growth problem please test it to see if > > it works right. > > > > I haven't tested 7.4 beta1 yet very hard, just on my workstation, with > > relatively low level stuff. > > I am not sure we have completely dealt with index growth in 7.4. What > we have new in 7.4 is the ability for non-FULL VACUUM to collect info on > free index pages and reuse them. > > However, VACUUM FULL does not shrink the index table unless those pages > are the last pages of the file. (Could it shift free pages to the end > and then truncate index?) Also, does VACUUM FULL on an index put the > empty index pages in the FSM map? It doesn't do that for heap pages > because there are none after vacuum, but there might be free index pages > that we should record. I haven't seen anyone comment on this. I think we need to address this for 7.4. (Collecting index free space in FSM is new in 7.4.) I am concerned that people who are only running VACUUM FULL at night will have problems with index growth. Also, why can't we move totally-empty index pages to the end and truncate the file? > Interesting I found a reference of doing an auto-reindex as part of > VACUUM FULL: > > #ifdef NOT_USED > > /* > * reindex in VACUUM is dangerous under WAL. ifdef out until it > * becomes safe. > */ > if (reindex) > { > vac_close_indexes(nindexes, Irel); > Irel = (Relation *) NULL; > activate_indexes_of_a_table(onerel, false); > } > #endif /* NOT_USED */ Will we ever reindex as part of VACUUM FULL? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
I have looked at the code and it seems VACUUM FULL does collect index free space into the FSM via index_vacuum_cleanup(). Is that true? --------------------------------------------------------------------------- Bruce Momjian wrote: > Bruce Momjian wrote: > > > > AFAIK, VACUUM doesn't reclaim space taken up by indexes. In fact, the more > > > > deletes you do, the larger the index space gets, and the *slower* the index > > > > performs. A periodic REINDEX cleans up a lot of problems. > > > > > > Note that in 7.4 the fix for this is in, so if you have a chance to test > > > it out with your indexes and their growth problem please test it to see if > > > it works right. > > > > > > I haven't tested 7.4 beta1 yet very hard, just on my workstation, with > > > relatively low level stuff. > > > > I am not sure we have completely dealt with index growth in 7.4. What > > we have new in 7.4 is the ability for non-FULL VACUUM to collect info on > > free index pages and reuse them. > > > > However, VACUUM FULL does not shrink the index table unless those pages > > are the last pages of the file. (Could it shift free pages to the end > > and then truncate index?) Also, does VACUUM FULL on an index put the > > empty index pages in the FSM map? It doesn't do that for heap pages > > because there are none after vacuum, but there might be free index pages > > that we should record. > > I haven't seen anyone comment on this. I think we need to address this > for 7.4. (Collecting index free space in FSM is new in 7.4.) I am > concerned that people who are only running VACUUM FULL at night will > have problems with index growth. Also, why can't we move totally-empty > index pages to the end and truncate the file? > > > Interesting I found a reference of doing an auto-reindex as part of > > VACUUM FULL: > > > > #ifdef NOT_USED > > > > /* > > * reindex in VACUUM is dangerous under WAL. ifdef out until it > > * becomes safe. > > */ > > if (reindex) > > { > > vac_close_indexes(nindexes, Irel); > > Irel = (Relation *) NULL; > > activate_indexes_of_a_table(onerel, false); > > } > > #endif /* NOT_USED */ > > Will we ever reindex as part of VACUUM FULL? > > -- > Bruce Momjian | http://candle.pha.pa.us > pgman@candle.pha.pa.us | (610) 359-1001 > + If your life is a hard drive, | 13 Roberts Road > + Christ can be your backup. | Newtown Square, Pennsylvania 19073 > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > I have looked at the code and it seems VACUUM FULL does collect index > free space into the FSM via index_vacuum_cleanup(). Is that true? Yes. regards, tom lane