Обсуждение: Recovering real disk space

Поиск
Список
Период
Сортировка

Recovering real disk space

От
Adam Siegel
Дата:
We have a system that archives data to a postgres database.  The raw
data is confined to one table.  Each record in the table is generally
1500 bytes.  Each record is also associated with a volume name.  During
normal operations, many millions of rows are written to this table.
After sometime the disk gets close to being full.

We have a program that allows the user to offload rows from a table
based upon volume name to a CD.  This is done by doing a select on the
table and then creating a memento record for each row and persisting it
to the CD.  Once the rows are written to the CD a delete is performed on
the table for the off-loaded rows.

select * from packets where volume = 'abc';

... process to write to CD ...

delete from packets where volume = 'abc';

We perform a vacuum full after each mass delete.  This cycle can happen
many times during over a couple of weeks.  We are in a test lab
environment and are generating a lot of data.

One of the problems we have is that the vacuum on the table can take up
to 10 hours.  We also expect to see the physical disk space go down, but
this does not happen.  If we accidently fill up the disk, then all bets
are off and we are unable to recover.  A vacuum never seems to finish
(several days).

How can we physically recover "real" disk space for the rows that were
deleted from the table?  I've heard about free space buffers, but am not
really sure how they work.  Are there configuration items that can be
tweaked to help with vacuuming large tables?

Thanks!!!!



Re: Recovering real disk space

От
Wes
Дата:
On 3/30/05 12:09 PM, "Adam Siegel" <adam@sycamore.us> wrote:

> How can we physically recover "real" disk space for the rows that were
> deleted from the table?

vacuum full

Wes



Re: Recovering real disk space

От
"Guy Rouillier"
Дата:
Adam Siegel wrote:
> We have a system that archives data to a postgres database.  The raw
> data is confined to one table.  Each record in the table is generally
> 1500 bytes.  Each record is also associated with a volume name.
> During normal operations, many millions of rows are written to this
> table. After sometime the disk gets close to being full.
>
> We have a program that allows the user to offload rows from a table
> based upon volume name to a CD.  This is done by doing a select on the
> table and then creating a memento record for each row and persisting
> it to the CD.  Once the rows are written to the CD a delete is
> performed on the table for the off-loaded rows.
>
> select * from packets where volume = 'abc';
>
> ... process to write to CD ...
>
> delete from packets where volume = 'abc';
>
> We perform a vacuum full after each mass delete.  This cycle can
> happen many times during over a couple of weeks.  We are in a test lab
> environment and are generating a lot of data.
>
> One of the problems we have is that the vacuum on the table can take
> up to 10 hours.  We also expect to see the physical disk space go
> down, but this does not happen.  If we accidently fill up the disk,
> then all bets are off and we are unable to recover.  A vacuum never
> seems to finish (several days).
>
> How can we physically recover "real" disk space for the rows that were
> deleted from the table?  I've heard about free space buffers, but am
> not really sure how they work.  Are there configuration items that
> can be tweaked to help with vacuuming large tables?

You mention millions of rows.  You realize that attempting to delete all
those rows will use up massive amounts of transaction log space, right?
We have a data collection system that stores about 2 million rows a day.
To avoid this issue, we have separate tables for each month.  We want to
keep N months available online.  So when a new month starts, we save off
table N-1 to backup and then just drop the table.  You may want to
consider something like this; if you don't have too many volumes, have
separate tables by volume by month.

>
> Thanks!!!!
>
>
>
> ---------------------------(end of
> broadcast)--------------------------- TIP 2: you can get off all
>     lists at once with the unregister command (send "unregister
> YourEmailAddressHere" to majordomo@postgresql.org)



--
Guy Rouillier


Re: Recovering real disk space

От
Bruno Wolff III
Дата:
On Wed, Mar 30, 2005 at 13:09:33 -0500,
  Adam Siegel <adam@sycamore.us> wrote:
>
> We perform a vacuum full after each mass delete.  This cycle can happen
> many times during over a couple of weeks.  We are in a test lab
> environment and are generating a lot of data.
>
> One of the problems we have is that the vacuum on the table can take up
> to 10 hours.  We also expect to see the physical disk space go down, but
> this does not happen.  If we accidently fill up the disk, then all bets
> are off and we are unable to recover.  A vacuum never seems to finish
> (several days).

This may mean that there are open transactions pinning the records you
have deleted so that they aren't being removed by the vacuum.
Also, under some circumstances a CLUSTER can be faster than a VACUUM FULL.

Re: Recovering real disk space

От
Thomas F.O'Connell
Дата:
Isn't this also a symptom of inappropriate FSM settings?

Try running a VACUUM VERBOSE and check the FSM settings at the end.

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC

Strategic Open Source — Open Your i™

http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Apr 3, 2005, at 8:21 AM, Bruno Wolff III wrote:

> On Wed, Mar 30, 2005 at 13:09:33 -0500,
>   Adam Siegel <adam@sycamore.us> wrote:
>>
>> We perform a vacuum full after each mass delete.  This cycle can
>> happen
>> many times during over a couple of weeks.  We are in a test lab
>> environment and are generating a lot of data.
>>
>> One of the problems we have is that the vacuum on the table can take
>> up
>> to 10 hours.  We also expect to see the physical disk space go down,
>> but
>> this does not happen.  If we accidently fill up the disk, then all
>> bets
>> are off and we are unable to recover.  A vacuum never seems to finish
>> (several days).
>
> This may mean that there are open transactions pinning the records you
> have deleted so that they aren't being removed by the vacuum.
> Also, under some circumstances a CLUSTER can be faster than a VACUUM
> FULL.

Re: Recovering real disk space

От
ptjm@interlog.com (Patrick TJ McPhee)
Дата:
In article <424AEB5D.2090204@sycamore.us>,
Adam Siegel <adam@sycamore.us> wrote:

[...]

% We perform a vacuum full after each mass delete.  This cycle can happen
% many times during over a couple of weeks.  We are in a test lab
% environment and are generating a lot of data.
%
% One of the problems we have is that the vacuum on the table can take up
% to 10 hours.  We also expect to see the physical disk space go down, but
% this does not happen.

Try vacuum full verbose next time to see what it's doing.

Try reindexing after the vacuum is done.

You may find an ordinary vacuuum is faster and just as useful as vacuum full
assuming you're filling and deleting from the same table all the time. It
won't free up space, but it will allow you to maintain a high-water mark.

Look at the relpages column in pg_class to see which relations are using
up the most space.

If you're clearing out all the data for a set of tables, drop them and
recreate them.
--

Patrick TJ McPhee
North York  Canada
ptjm@interlog.com