Discussion: Excessive growth of pg_attribute and other system tables
I'm having trouble with physical growth of PostgreSQL system tables. The server is 7.4.6 and there are several databases in the cluster. The autovacuum daemon has been running since the data was restored after an upgrade a few months ago. Unfortunately my system tables are taking an unreasonable amount of space. For example, in one of the databases pg_attribute holds fewer than 10,000 records but is using more than 600 megabytes, and the associated indexes are huge, too. Reindexing dropped the total usage for that database from 3.2G to 2.5G, and a vacuum full (when I can do it off hours) will probably drop it to around 1.9G. In other words, one system table alone was accounting for around 40% of the storage used by that database.

That 1.9G still includes other oversized files like pg_index, whose table alone dropped from 48M to 78K with vacuum full. Vacuum full + reindex on a selection of other tables yielded savings of:

pg_depend: 200M
pg_type: 120M
pg_class: 50M

My autovacuum config is running and I do see regular periodic vacuums of these pg_ tables, but still they grow. Any ideas on why, in spite of autovacuum, these files are becoming so huge and, more importantly, the best way to keep them under control?

Cheers,
Steve
On Thursday 17 March 2005 3:15 pm, Steve Crawford wrote:
> I'm having trouble with physical growth of postgresql system
> tables....

Additional info. The most recent autovacuum entries for the
pg_attribute table are:

[2005...] Performing: VACUUM ANALYZE "pg_catalog"."pg_attribute"
[2005...]   table name: tati."pg_catalog"."pg_attribute"
[2005...]   relid: 1249; relisshared: 0
[2005...]   reltuples: 9334.000000; relpages: 82282
[2005...]   curr_analyze_count: 6647115; curr_vacuum_count: 861454
[2005...]   last_analyze_count: 6647115; last_vacuum_count: 861454
[2005...]   analyze_threshold: 9834; vacuum_threshold: 19668

and

[2005...] Performing: ANALYZE "pg_catalog"."pg_attribute"
[2005...]   table name: foo."pg_catalog"."pg_attribute"
[2005...]   relid: 1249; relisshared: 0
[2005...]   reltuples: 4843240.000000; relpages: 82284
[2005...]   curr_analyze_count: 6657041; curr_vacuum_count: 862897
[2005...]   last_analyze_count: 6657041; last_vacuum_count: 861454
[2005...]   analyze_threshold: 4843740; vacuum_threshold: 19668

(Both within past 1 day - dates truncated to avoid line-wrap.) The
table currently has just over 9,000 tuples and I have no reason to
believe that should have changed substantially. The thresholds and
counts seem way off - especially in the second pass.

Cheers,
Steve
Steve Crawford <scrawford@pinpointresearch.com> writes:
> My autovacuum config is running and I do see regular periodic vacuums
> of these pg_ tables but still they grow.

Do you have the FSM settings set large enough to account for all the
free space?

Also you might want to check for newer versions of autovacuum. I
recall that the earlier releases had bugs that sometimes made it skip
vacuuming tables that should be vacuumed.

			regards, tom lane
On Thursday 17 March 2005 3:51 pm, Tom Lane wrote:
> Steve Crawford <scrawford@pinpointresearch.com> writes:
> > My autovacuum config is running and I do see regular periodic
> > vacuums of these pg_ tables but still they grow.
>
> Do you have the FSM settings set large enough to account for all
> the free space?

max_fsm_pages = 20000
max_fsm_relations = 1000

I just poked through the cluster. I have 6 user databases plus
template0 and template1. The total in the cluster has reached
slightly over 1,000 user tables, plus all the system tables in the 8
databases. Does this indicate that I need to increase
max_fsm_relations to greater than the total number of tables in the
cluster? If so it's not a problem, as bumping it to 2000 will only
cost me ~50k of my 4G memory, plus a bit more since max_fsm_pages
would have to be increased to at least 32000.

> Also you might want to check for newer versions of autovacuum. I
> recall that the earlier releases had bugs that sometimes made it
> skip vacuuming tables that should be vacuumed.

pg_autovacuum is from the 7.4.6 release and is showing periodic
vacuums of the problem tables. I thought that bug was in some release
prior to 7.4.6. Does the bug allow it to show a vacuum taking place
but not do it?

Cheers,
Steve
Steve Crawford <scrawford@pinpointresearch.com> writes:
> On Thursday 17 March 2005 3:51 pm, Tom Lane wrote:
>> Do you have the FSM settings set large enough to account for all
>> the free space?

> max_fsm_pages = 20000
> max_fsm_relations = 1000

That doesn't sound like nearly enough pages for a 2G database.
20000 * 8K = 160MB, so if more than 10% of the pages in your DB are
dirty you are going to be leaking free space. You probably want
max_fsm_pages up around 250,000.

> I just poked through the cluster. I have 6 user databases plus
> template0 and template1. Total tables in the cluster has reached
> slightly over 1,000 user tables plus all the system tables in the 8
> databases. Does this indicate that I need to increase
> max_fsm_relations to greater than the total number of tables in the
> cluster?

That would be a good plan too.

> pg_autovacuum is from 7.4.6 release and is showing periodic vacuums of
> the problem tables. I thought that bug was in some release prior to
> 7.4.6. Does the bug allow it to show a vacuum taking place but not do
> it?

I don't recall the details of the changes; you could look at the CVS
logs to see what got fixed when. But in any case I think we see the
main problem: max_fsm_pages too small.

			regards, tom lane
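[Editorial note: the arithmetic behind the sizing advice above can be sketched as follows. This is a hedged illustration, not code from the thread; it assumes the default 8 KB PostgreSQL block size, and that one FSM page slot is needed per heap page whose free space you want tracked.]

```python
BLOCK_SIZE = 8192  # default PostgreSQL page size in bytes (assumption)

def fsm_coverage_mb(max_fsm_pages):
    """Heap space (MiB) whose free space max_fsm_pages slots can track,
    assuming one slot per page."""
    return max_fsm_pages * BLOCK_SIZE / 1024 / 1024

def slots_for_db(db_size_gb):
    """FSM page slots needed to cover every page of a database of the
    given size, the 'cover the whole database' rule of thumb."""
    return int(db_size_gb * 1024 ** 3 / BLOCK_SIZE)

# Steve's setting of 20,000 slots covers only ~156 MiB of heap pages,
# a small fraction of a 2G+ database:
print(fsm_coverage_mb(20000))  # 156.25

# Full coverage of a 2 GiB database needs on the order of Tom's
# suggested 250,000 slots:
print(slots_for_db(2))         # 262144
```

This matches Tom's rough "20000 * 8K = 160MB" figure and his max_fsm_pages recommendation.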
After takin a swig o' Arrakan spice grog, scrawford@pinpointresearch.com (Steve Crawford) belched out:
> On Thursday 17 March 2005 3:51 pm, Tom Lane wrote:
>> Steve Crawford <scrawford@pinpointresearch.com> writes:
>>> My autovacuum config is running and I do see regular periodic
>>> vacuums of these pg_ tables but still they grow.
>>
>> Do you have the FSM settings set large enough to account for all
>> the free space?
>
> max_fsm_pages = 20000
> max_fsm_relations = 1000

20000 is definitely way too low. It's not enough to track the dead
pages in pg_attribute alone, which looks to have the better part of
80K dead pages. I'd increase that to about 200000, straight off.

It seems curious that you have so many tuples getting killed off in
this table; are you generating a lot of temp tables continually?

By the way, you should be vacuuming pg_attribute _way_ more often; it
shouldn't have gotten this big if you had been. Doing a reindex and
(once!) a VACUUM FULL should help bring the size down; vacuuming the
table more often should keep the size down.
--
output = reverse("moc.liamg" "@" "enworbbc")
http://cbbrowne.com/info/postgresql.html
"To do is to be."   -- Aristotle
"To be is to do."   -- Socrates
"Do be do be do."   -- Sinatra
"Do be a do bee."   -- Miss Sally of Romper Room fame.
"Yabba dabba do."   -- Fred Flintstone
"DO...BEGIN..END"   -- Niklaus Wirth
Steve Crawford wrote:
> On Thursday 17 March 2005 3:15 pm, Steve Crawford wrote:
>> I'm having trouble with physical growth of postgresql system
>> tables....
>
> Additional info. The most recent autovacuum entries for the
> pg_attribute table are:
>
> [2005...] Performing: VACUUM ANALYZE "pg_catalog"."pg_attribute"
> [2005...]   table name: tati."pg_catalog"."pg_attribute"
> [2005...]   relid: 1249; relisshared: 0
> [2005...]   reltuples: 9334.000000; relpages: 82282
> [2005...]   curr_analyze_count: 6647115; curr_vacuum_count: 861454
> [2005...]   last_analyze_count: 6647115; last_vacuum_count: 861454
> [2005...]   analyze_threshold: 9834; vacuum_threshold: 19668
>
> and
>
> [2005...] Performing: ANALYZE "pg_catalog"."pg_attribute"
> [2005...]   table name: foo."pg_catalog"."pg_attribute"
> [2005...]   relid: 1249; relisshared: 0
> [2005...]   reltuples: 4843240.000000; relpages: 82284
> [2005...]   curr_analyze_count: 6657041; curr_vacuum_count: 862897
> [2005...]   last_analyze_count: 6657041; last_vacuum_count: 861454
> [2005...]   analyze_threshold: 4843740; vacuum_threshold: 19668
>
> (Both within past 1 day - dates truncated to avoid line-wrap.) The
> table currently has just over 9,000 tuples and I have no reason to
> believe that should have changed substantially. The thresholds and
> counts seem way off - especially in the second pass.

I believe this discrepancy has to do with the fact that ANALYZE can
return some very bogus values for reltuples, whereas vacuum always
returns an accurate count. I'm not sure how to best handle this.
Tom Lane wrote:
> Steve Crawford <scrawford@pinpointresearch.com> writes:
>> pg_autovacuum is from 7.4.6 release and is showing periodic vacuums of
>> the problem tables. I thought that bug was in some release prior to
>> 7.4.6. Does the bug allow it to show a vacuum taking place but not do
>> it?
>
> I don't recall the details of the changes; you could look at the CVS
> logs to see what got fixed when. But in any case I think we see the
> main problem: max_fsm_pages too small.

The relevant pg_autovacuum bugs were fixed prior to 7.4.6.
"Matthew T. O'Connor" <matthew@zeut.net> writes:
> I believe this discrepancy has to do with the fact that ANALYZE can
> return some very bogus values for reltuples, whereas vacuum always
> returns an accurate count. I'm not sure how to best handle this.

I think 8.0's ANALYZE will do a better estimation job ... at least,
Manfred Koizar rewrote the sampling algorithm in hopes of making it
more robust.

However, given that there are 9334 tuples in 82282 pages, I'd say
that autovacuum has already failed Steve rather badly :-(. There
shouldn't be more than a couple hundred pages given that number of
rows. Perhaps the FSM settings are too small?

			regards, tom lane
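[Editorial note: Tom's "couple hundred pages" figure follows from simple packing arithmetic. The sketch below is an illustration, not from the thread; the ~140-byte average pg_attribute row width and per-tuple overhead are assumed round numbers, not measured values.]

```python
import math

BLOCK_SIZE = 8192      # default page size (assumption)
PAGE_HEADER = 24       # approximate per-page header overhead (assumption)
TUPLE_OVERHEAD = 32    # approximate tuple header + item pointer (assumption)
ROW_WIDTH = 140        # assumed average pg_attribute row data width

def expected_pages(ntuples):
    """Pages needed for ntuples rows if they were tightly packed."""
    rows_per_page = (BLOCK_SIZE - PAGE_HEADER) // (ROW_WIDTH + TUPLE_OVERHEAD)
    return math.ceil(ntuples / rows_per_page)

tuples, actual_pages = 9334, 82282   # numbers reported in the thread
print(expected_pages(tuples))                 # ~200 pages, as Tom says
print(actual_pages / expected_pages(tuples))  # bloat factor: several hundred x
```

Under these assumptions the table is carrying roughly 400 times more pages than its live rows need, which is why VACUUM FULL recovers so much space.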
On Monday 21 March 2005 11:40 am, Tom Lane wrote:
> However, given that there are 9334 tuples in 82282 pages, I'd say
> that autovacuum has already failed Steve rather badly :-(. There
> shouldn't be more than a couple hundred pages given that number of
> rows. Perhaps the FSM settings are too small?

Yup, we've pretty well established that my fsm settings were way too
low. I've bumped them up:

max_fsm_relations from 1,000 to 3,000
max_fsm_pages from 20,000 to 1,000,000

The slight expenditure of a few meg of RAM on a 4G RAM machine will
hurt me far less than the incomplete vacuums. I have to schedule some
low-volume time to restart the server and vacuum full before I'll see
the result.

Just to make sure I'm understanding things correctly this time... I
originally (mis)understood these as settings related to resources used
_during_ vacuuming. My current understanding is that they are
basically pointers that track what space is available for reclamation
by vacuum, and that the amount of fsm resources required depends on
both the frequency of vacuums and the volume of updates/deletes.

Questions:

1) Is my revised understanding correct?

And if the answer to 1 is yes...

2) What happens with all that free-space information at server restart
(i.e. does a server restart lead to dead-tuple leakage)?

3) Is (or should) there be logging of the fact that a server has run
out of resources to track dead space?

4) Is there a way to query what proportion of the fsm resources are in
use, and would access to that info be useful to the autovacuum daemon
or a system tuner?

Cheers,
Steve
Steve Crawford <scrawford@pinpointresearch.com> writes:
> Just to make sure I'm understanding things correctly this time... I
> originally (mis)understood these as settings related to resources
> used _during_ vacuuming. My current understanding is that they are
> basically pointers that track what space is available for reclamation
> by vacuum and that the amount of fsm resources required depends on
> both frequency of vacuums and volume of updates/deletes.

The FSM is where VACUUM stores pointers to the free space it's found
(or created) in each table. Subsequent INSERTs/UPDATEs will use this
free space instead of appending to the file. So to prevent table
growth, you need enough FSM slots to remember enough free space to
satisfy all the INSERTs/UPDATEs between successive VACUUM runs.

In practice people tend to allocate enough FSM to cover all of their
database, instead of worrying about exactly which pages might contain
free space. In a low-update-volume situation you could probably get
away with less.

> 2) What happens with all that free-space information at server restart
> (ie. does a server restart lead to dead-tuple leakage)?

Assuming you had a normal database shutdown rather than a crash, it's
written out at shutdown and reloaded. In any case, a VACUUM recomputes
the info from scratch.

> 4) Is there a way to query what proportion of the fsm resources are in
> use and would access to that info be useful to the autovacuum daemon
> or a system tuner?

VACUUM VERBOSE will tell you about this.

			regards, tom lane
On Monday 21 March 2005 11:40 am, Tom Lane wrote:
> "Matthew T. O'Connor" <matthew@zeut.net> writes:
>> I believe this discrepancy has to do with the fact that ANALYZE
>> can return some very bogus values for reltuples, whereas vacuum
>> always returns an accurate count. I'm not sure how to best
>> handle this.
>
> I think 8.0's ANALYZE will do a better estimation job ... at least,
> Manfred Koizar rewrote the sampling algorithm in hopes of making it
> more robust.
>
> However, given that there are 9334 tuples in 82282 pages, I'd say
> that autovacuum has already failed Steve rather badly :-(. There
> shouldn't be more than a couple hundred pages given that number of
> rows. Perhaps the FSM settings are too small?

Results time. FSM settings were too small, but the real problem seems
to be that pg_autovacuum isn't getting the job done.

I ran VACUUM VERBOSE and set postgresql.conf as follows (roughly 2.5
times the amounts suggested by the VACUUM VERBOSE results):

max_fsm_pages = 500000
max_fsm_relations = 2500

I restarted the server last Friday night and manually did a vacuum
full and reindex of each user and system table in every database on
Monday. The zzz.pg_attribute table's file size dropped from over
600MB to less than 2MB.

It's now three days later and that table has already increased to
29MB. Processes accessing this database do create many temporary
tables, so the exact count in pg_attribute varies a bit from time to
time but always hovers around 9500. I just did a manual VACUUM FULL
on pg_attribute and it's back to 1.3MB.

Upon completion of the vacuum, I restarted the pg_autovacuum daemon.
Following are the autovacuum log entries related to zzz.pg_attribute.
Note the growth of reltuples from 9532.000000 to 184720.000000 in
that time (what is a millionth of a tuple, anyway?) and the lack of
any vacuums performed.
28 12:12 PM]   table name: zzz."pg_catalog"."pg_attribute"
28 12:12 PM]   relid: 1249; relisshared: 0
28 12:12 PM]   reltuples: 9532.000000; relpages: 157
28 12:12 PM]   curr_analyze_count: 176294; curr_vacuum_count: 15447
28 12:12 PM]   last_analyze_count: 176294; last_vacuum_count: 15447
28 12:12 PM]   analyze_threshold: 10032; vacuum_threshold: 20064
28 12:12 PM] added table: zzz."pg_catalog"."pg_attribute"
--
28 03:42 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
28 03:42 PM]   table name: zzz."pg_catalog"."pg_attribute"
28 03:42 PM]   relid: 1249; relisshared: 0
28 03:42 PM]   reltuples: 20270.000000; relpages: 336
28 03:42 PM]   curr_analyze_count: 186374; curr_vacuum_count: 16329
28 03:42 PM]   last_analyze_count: 186374; last_vacuum_count: 15447
28 03:42 PM]   analyze_threshold: 20770; vacuum_threshold: 20064
--
28 10:59 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
28 10:59 PM]   table name: zzz."pg_catalog"."pg_attribute"
28 10:59 PM]   relid: 1249; relisshared: 0
28 10:59 PM]   reltuples: 42591.000000; relpages: 706
28 10:59 PM]   curr_analyze_count: 207254; curr_vacuum_count: 18156
28 10:59 PM]   last_analyze_count: 207254; last_vacuum_count: 15447
28 10:59 PM]   analyze_threshold: 43091; vacuum_threshold: 20064
--
29 02:03 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
29 02:03 PM]   table name: zzz."pg_catalog"."pg_attribute"
29 02:03 PM]   relid: 1249; relisshared: 0
29 02:03 PM]   reltuples: 89464.000000; relpages: 1483
29 02:03 PM]   curr_analyze_count: 250664; curr_vacuum_count: 21999
29 02:03 PM]   last_analyze_count: 250664; last_vacuum_count: 15447
29 02:03 PM]   analyze_threshold: 89964; vacuum_threshold: 20064
--
30 09:20 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
30 09:20 PM]   table name: zzz."pg_catalog"."pg_attribute"
30 09:20 PM]   relid: 1249; relisshared: 0
30 09:20 PM]   reltuples: 184720.000000; relpages: 3062
30 09:20 PM]   curr_analyze_count: 340791; curr_vacuum_count: 29886
30 09:20 PM]   last_analyze_count: 340791; last_vacuum_count: 15447
30 09:20 PM]
analyze_threshold: 185220; vacuum_threshold: 20064

Should I abandon pg_autovacuum and just do periodic VACUUM ANALYZEs of
everything? Should I use settings other than the defaults for
pg_autovacuum? Are temporary tables evil and their use to be avoided?

Just checked, and zzz.pg_attribute is 50% larger than it was when I
did the VACUUM FULL at the start of this email.

Cheers,
Steve
Steve Crawford wrote:
> On Monday 21 March 2005 11:40 am, Tom Lane wrote:
>> "Matthew T. O'Connor" <matthew@zeut.net> writes:
>>> I believe this discrepancy has to do with the fact that ANALYZE
>>> can return some very bogus values for reltuples, whereas vacuum
>>> always returns an accurate count. I'm not sure how to best
>>> handle this.
>>
>> I think 8.0's ANALYZE will do a better estimation job ... at least,
>> Manfred Koizar rewrote the sampling algorithm in hopes of making it
>> more robust.
>>
>> However, given that there are 9334 tuples in 82282 pages, I'd say
>> that autovacuum has already failed Steve rather badly :-(. There
>> shouldn't be more than a couple hundred pages given that number of
>> rows. Perhaps the FSM settings are too small?
>
> Results time. FSM settings were too small but the real problem seems
> to be that pg_autovacuum isn't getting the job done.

Yeah, it certainly looks that way...

> I ran VACUUM VERBOSE and set postgresql.conf as follows (roughly 2.5
> times the amounts suggested by VACUUM VERBOSE results):
> max_fsm_pages = 500000
> max_fsm_relations = 2500
>
> I restarted the server last Friday night and manually did a vacuum
> full and reindex of each user and system table in every database on
> Monday. The zzz.pg_attribute table's file size dropped from over
> 600MB to less than 2MB.

Wow, that is some serious bloat.

> It's now three days later and that table has already increased to
> 29MB. Processes accessing this database do create many temporary
> tables so the exact count in pg_attribute varies a bit from time to
> time but always hovers around 9500. I just did a manual VACUUM FULL
> on pg_attribute and it's back to 1.3MB.

Depending on several factors, the "steady state" size of pg_attribute
may be several times larger than its size right after a vacuum full.
The problem is not that it may be 29M, but rather that it continues
to grow.

> Upon completion of the vacuum, I restarted the pg_autovacuum daemon.
> Following are the autovacuum log entries related to zzz.pg_attribute.
> Note the growth of reltuples from 9532.000000 to 184720.000000 in
> that time (what is a millionth of a tuple, anyway?) and the lack of
> any vacuums performed.

I assume that pg_attribute didn't actually grow to 184720 tuples?

> 28 12:12 PM]   table name: zzz."pg_catalog"."pg_attribute"
> 28 12:12 PM]   relid: 1249; relisshared: 0
> 28 12:12 PM]   reltuples: 9532.000000; relpages: 157
> 28 12:12 PM]   curr_analyze_count: 176294; curr_vacuum_count: 15447
> 28 12:12 PM]   last_analyze_count: 176294; last_vacuum_count: 15447
> 28 12:12 PM]   analyze_threshold: 10032; vacuum_threshold: 20064
> 28 12:12 PM] added table: zzz."pg_catalog"."pg_attribute"
> --
> 28 03:42 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
> 28 03:42 PM]   table name: zzz."pg_catalog"."pg_attribute"
> 28 03:42 PM]   relid: 1249; relisshared: 0
> 28 03:42 PM]   reltuples: 20270.000000; relpages: 336
> 28 03:42 PM]   curr_analyze_count: 186374; curr_vacuum_count: 16329
> 28 03:42 PM]   last_analyze_count: 186374; last_vacuum_count: 15447
> 28 03:42 PM]   analyze_threshold: 20770; vacuum_threshold: 20064
> --
> 28 10:59 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
> 28 10:59 PM]   table name: zzz."pg_catalog"."pg_attribute"
> 28 10:59 PM]   relid: 1249; relisshared: 0
> 28 10:59 PM]   reltuples: 42591.000000; relpages: 706
> 28 10:59 PM]   curr_analyze_count: 207254; curr_vacuum_count: 18156
> 28 10:59 PM]   last_analyze_count: 207254; last_vacuum_count: 15447
> 28 10:59 PM]   analyze_threshold: 43091; vacuum_threshold: 20064
> --
> 29 02:03 PM] Performing: ANALYZE "pg_catalog"."pg_attribute"
> 29 02:03 PM]   table name: zzz."pg_catalog"."pg_attribute"
> 29 02:03 PM]   relid: 1249; relisshared: 0
> 29 02:03 PM]   reltuples: 89464.000000; relpages: 1483
> 29 02:03 PM]   curr_analyze_count: 250664; curr_vacuum_count: 21999
> 29 02:03 PM]   last_analyze_count: 250664; last_vacuum_count: 15447
> 29 02:03 PM]   analyze_threshold: 89964; vacuum_threshold: 20064
> --
> 30 09:20 PM] Performing: ANALYZE
> "pg_catalog"."pg_attribute"
> 30 09:20 PM]   table name: zzz."pg_catalog"."pg_attribute"
> 30 09:20 PM]   relid: 1249; relisshared: 0
> 30 09:20 PM]   reltuples: 184720.000000; relpages: 3062
> 30 09:20 PM]   curr_analyze_count: 340791; curr_vacuum_count: 29886
> 30 09:20 PM]   last_analyze_count: 340791; last_vacuum_count: 15447
> 30 09:20 PM]   analyze_threshold: 185220; vacuum_threshold: 20064
>
> Should I abandon pg_autovacuum and just do periodic VACUUM ANALYZEs
> of everything? Should I use settings other than the defaults for
> pg_autovacuum? Are temporary tables evil and their use to be avoided?
>
> Just checked and zzz.pg_attribute is 50% larger than it was when I
> did the VACUUM FULL at the start of this email.

I don't think you should abandon pg_autovacuum, but I do think you
should run periodic vacuum commands from cron until this is resolved.

The first thing you should do is change the pg_autovacuum settings.
The default values are very (far too) conservative for a lot of
people. Most people seem to have success with settings like:

pg_autovacuum -v 300 -V 0.1 -a 200 -A 0.1

Give that a try and see if it helps.

From pg_autovacuum's perspective, from 3/28 12:12 PM to 3/30 09:20 PM
there have been 29886 - 15447 = 14439 delete + update commands against
the pg_attribute table. Once it sees 20064 deletes + updates it will
perform a vacuum. If there have been more during that time, we need to
figure out why pg_autovacuum is not seeing them.

The thing I don't understand in your numbers is why reltuples is
constantly increasing. When pg_autovacuum is reporting a reltuples of
184720, how many tuples does pg_attribute really have? This might be
related to the new reltuples estimation code that Tom was talking
about recently; if so, 8.0.2 might help. Anyone else have some insight
here?

Matt
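[Editorial note: the thresholds in the logs are consistent with pg_autovacuum's simple formula of base + scale * reltuples, where base and scale correspond to the -v/-V (vacuum) and -a/-A (analyze) options Matthew mentions. This sketch reproduces the logged numbers; the exact default values (1000/2 for vacuum, 500/1 for analyze) are inferred from the log output above, not stated in the thread.]

```python
def threshold(reltuples, base, scale):
    # pg_autovacuum decides to act once the count of updates + deletes
    # since its last run exceeds base + scale * reltuples
    return base + scale * reltuples

reltuples = 9532  # pg_attribute row count from the first log entry

# Inferred defaults reproduce the logged thresholds exactly:
print(threshold(reltuples, 1000, 2))   # 20064 -> vacuum_threshold in the log
print(threshold(reltuples, 500, 1))    # 10032 -> analyze_threshold in the log

# Matthew's suggested "-v 300 -V 0.1" would trigger a vacuum roughly
# 16x sooner (at about 1253 updates + deletes instead of 20064):
print(threshold(reltuples, 300, 0.1))
```

This also shows why the table keeps growing: as ANALYZE inflates reltuples, the vacuum and analyze thresholds climb with it, pushing the next vacuum ever further away.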
I just wondered if there is a way to change the variable sign from : to $, like

\set avar 32
select :avar ;  --->  select $avar ;

so I can copy and paste my perl code directly to psql when debugging, assuming I do a proper \set.

TIA.

Ben Kim
Developer
College of Education
Texas A&M University