Thread: Recovering real disk space
We have a system that archives data to a postgres database. The raw data is
confined to one table. Each record in the table is generally 1500 bytes.
Each record is also associated with a volume name. During normal operations,
many millions of rows are written to this table. After some time the disk
gets close to being full.

We have a program that allows the user to offload rows from the table by
volume name to a CD. This is done by selecting the rows, creating a memento
record for each one, and persisting it to the CD. Once the rows are written
to the CD, the off-loaded rows are deleted from the table:

    select * from packets where volume = 'abc';

    ... process to write to CD ...

    delete from packets where volume = 'abc';

We perform a vacuum full after each mass delete. This cycle can happen many
times over a couple of weeks. We are in a test lab environment and are
generating a lot of data.

One of the problems we have is that the vacuum on the table can take up to
10 hours. We also expect to see the physical disk space go down, but this
does not happen. If we accidentally fill up the disk, then all bets are off
and we are unable to recover. A vacuum never seems to finish (several days).

How can we physically recover "real" disk space for the rows that were
deleted from the table? I've heard about the free space map, but am not
really sure how it works. Are there configuration items that can be tweaked
to help with vacuuming large tables?

Thanks!!!!
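For scale, a rough back-of-envelope for the table described above (the 10-million row count is a hypothetical example; 1500 bytes/row comes from the post, and 8 KB is the default Postgres block size, ignoring per-row and per-page overhead):

```python
# Rough size estimate for the packets table described above.
# ROWS is an assumed example figure, not from the post.
ROW_BYTES = 1500
PAGE_BYTES = 8192           # default Postgres block size
ROWS = 10_000_000           # hypothetical row count

rows_per_page = PAGE_BYTES // ROW_BYTES     # ~5 rows fit per 8 KB page
pages = -(-ROWS // rows_per_page)           # ceiling division
table_gb = pages * PAGE_BYTES / 1024**3

print(rows_per_page)        # 5
print(pages)                # 2000000
print(f"{table_gb:.1f} GB") # 15.3 GB
```

The point of the arithmetic: a mass delete by volume can free space across millions of pages at once, and every one of those pages has to be tracked by the free space map before its space can be reused.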
On 3/30/05 12:09 PM, "Adam Siegel" <adam@sycamore.us> wrote:

> How can we physically recover "real" disk space for the rows that were
> deleted from the table?

vacuum full

Wes
Adam Siegel wrote:
> We have a system that archives data to a postgres database. The raw
> data is confined to one table. Each record in the table is generally
> 1500 bytes. Each record is also associated with a volume name.
> During normal operations, many millions of rows are written to this
> table. After some time the disk gets close to being full.

[...]

> One of the problems we have is that the vacuum on the table can take
> up to 10 hours. We also expect to see the physical disk space go
> down, but this does not happen. If we accidentally fill up the disk,
> then all bets are off and we are unable to recover. A vacuum never
> seems to finish (several days).
>
> How can we physically recover "real" disk space for the rows that were
> deleted from the table? I've heard about free space buffers, but am
> not really sure how they work. Are there configuration items that
> can be tweaked to help with vacuuming large tables?

You mention millions of rows. You realize that attempting to delete all
those rows will use up massive amounts of transaction log space, right?

We have a data collection system that stores about 2 million rows a day.
To avoid this issue, we have separate tables for each month. We want to
keep N months available online, so when a new month starts, we save off
the oldest table to backup and then just drop it.

You may want to consider something like this: if you don't have too many
volumes, have separate tables by volume by month.

-- 
Guy Rouillier
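Guy's rotation scheme might look like the following (table names, columns, and the backup database name are all illustrative, not from the thread):

    -- One table per month; columns are a guess at the packet schema.
    CREATE TABLE packets_2005_03 (
        volume   text,
        payload  bytea,        -- the ~1500-byte record
        captured timestamptz
    );

    -- When a new month starts: back up the oldest table from the shell,
    --   pg_dump -t packets_2004_12 archive_db > packets_2004_12.sql
    -- then drop it. DROP TABLE returns the disk space immediately,
    -- with no VACUUM pass and no per-row transaction log traffic.
    DROP TABLE packets_2004_12;

The design point is that dropping a table is a metadata operation plus a file unlink, so reclaiming a month of data costs seconds rather than a ten-hour vacuum.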
On Wed, Mar 30, 2005 at 13:09:33 -0500,
  Adam Siegel <adam@sycamore.us> wrote:
>
> We perform a vacuum full after each mass delete. This cycle can happen
> many times over a couple of weeks. We are in a test lab environment
> and are generating a lot of data.
>
> One of the problems we have is that the vacuum on the table can take up
> to 10 hours. We also expect to see the physical disk space go down, but
> this does not happen. If we accidentally fill up the disk, then all bets
> are off and we are unable to recover. A vacuum never seems to finish
> (several days).

This may mean that there are open transactions pinning the records you
have deleted, so that they aren't being removed by the vacuum. Also,
under some circumstances a CLUSTER can be faster than a VACUUM FULL.
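A sketch of both checks Bruno suggests (the index name is hypothetical; the pg_stat_activity column names and the `CLUSTER index ON table` syntax match 2005-era releases):

    -- Look for long-lived sessions: an open transaction that predates
    -- the mass delete keeps VACUUM from removing the deleted rows.
    SELECT procpid, usename, current_query, query_start
    FROM pg_stat_activity
    ORDER BY query_start;

    -- CLUSTER rewrites the table in index order and can beat VACUUM FULL
    -- on a heavily-deleted table. It takes an exclusive lock and needs
    -- enough free disk for the new copy. Assumes an index named
    -- packets_volume_idx exists on packets(volume).
    CLUSTER packets_volume_idx ON packets;

Note the disk requirement: if the disk is already nearly full, CLUSTER's temporary second copy of the table may not fit, which is one reason to reclaim space before the disk fills rather than after.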
Isn't this also a symptom of inappropriate FSM settings? Try running a
VACUUM VERBOSE and check the FSM numbers reported at the end.

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
Strategic Open Source — Open Your i™
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Apr 3, 2005, at 8:21 AM, Bruno Wolff III wrote:

> On Wed, Mar 30, 2005 at 13:09:33 -0500,
> Adam Siegel <adam@sycamore.us> wrote:
>>
>> We perform a vacuum full after each mass delete. This cycle can happen
>> many times over a couple of weeks. We are in a test lab environment
>> and are generating a lot of data.
>>
>> One of the problems we have is that the vacuum on the table can take up
>> to 10 hours. We also expect to see the physical disk space go down, but
>> this does not happen. If we accidentally fill up the disk, then all bets
>> are off and we are unable to recover. A vacuum never seems to finish
>> (several days).
>
> This may mean that there are open transactions pinning the records you
> have deleted so that they aren't being removed by the vacuum.
> Also, under some circumstances a CLUSTER can be faster than a VACUUM
> FULL.
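The FSM knobs Thomas refers to live in postgresql.conf. The values below are illustrative only, sized on the assumption that a mass delete can free on the order of two million pages (these parameters existed in 7.x/8.0-era releases and were removed later; changing them requires a restart):

    # postgresql.conf -- example values, not recommendations
    max_fsm_pages = 2000000      # free-space pages tracked cluster-wide
    max_fsm_relations = 1000     # number of relations tracked

If max_fsm_pages is smaller than the number of pages with reusable space, plain VACUUM loses track of the excess and the table keeps growing even though rows were deleted. The summary lines at the end of VACUUM VERBOSE output compare the pages needed against the configured maximum, which is how you tell whether the setting is too low.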
In article <424AEB5D.2090204@sycamore.us>, Adam Siegel
<adam@sycamore.us> wrote:

[...]

% We perform a vacuum full after each mass delete. This cycle can happen
% many times over a couple of weeks. We are in a test lab
% environment and are generating a lot of data.
%
% One of the problems we have is that the vacuum on the table can take up
% to 10 hours. We also expect to see the physical disk space go down, but
% this does not happen.

Try vacuum full verbose next time to see what it's doing. Try reindexing
after the vacuum is done.

You may find an ordinary vacuum is faster and just as useful as vacuum
full, assuming you're filling and deleting from the same table all the
time. It won't free up space, but it will allow you to maintain a
high-water mark.

Look at the relpages column in pg_class to see which relations are using
up the most space. If you're clearing out all the data for a set of
tables, drop them and recreate them.

--
Patrick TJ McPhee
North York  Canada
ptjm@interlog.com
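Patrick's pg_class check, written out as a query (the MB conversion assumes the default 8 KB block size; `packets` is the table from the original post):

    -- Largest relations by page count. relpages is only refreshed by
    -- VACUUM and ANALYZE, so it can lag the on-disk reality.
    SELECT relname, relpages, relpages * 8 / 1024 AS approx_mb
    FROM pg_class
    ORDER BY relpages DESC
    LIMIT 10;

    -- Rebuild bloated indexes once the vacuum has finished; VACUUM FULL
    -- shrinks the heap but tends to leave indexes bloated.
    REINDEX TABLE packets;

Running the query before and after a vacuum also shows directly whether the cleanup actually shrank the relation or merely marked space reusable.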