Обсуждение: Giant Postgresql Database Cluster Folder

Поиск
Список
Период
Сортировка

Giant Postgresql Database Cluster Folder

От
Kiruba suthan
Дата:


 

We are using Postgresql Database server to host DB for an analytical tool I am working on. Basically this DB has metrics about files. Row count of few tables are more than 18 million. But the content of the tables are basic data types like numbers, date & string. No binary data is stored. The size of the DB is around 1 GB when taken a full dump.

 

We query the DB using complex views to get reports and most of the result set of these queries are quite huge (row count in hundreds of thousand or in million).

 

The size of the DB Cluster Folder varies between 400GB to 600GB which is unreasonably huge for the actual data. It is eating up all disk-space in the server. When I create a fresh DB from the dump in a new server the size of the DB cluster folder is around 2.3 GB which is very reasonable to me.


Experts,

Could you help me how to clean up DB Cluster folder and reclaim disk space please? And please give me some insight into how data is organized in DB Cluster and what should I do to avoid this happening again?

 

Size of directories under DB Cluster Folder is mentioned below

[user@server DB_CLUSER_DATA]$ du -ksh *
407G    base
316K    global
49M     pg_clog
4.0K    pg_hba.conf
4.0K    pg_ident.conf
120K    pg_multixact
12K     pg_notify
32K     pg_stat_tmp
88K     pg_subtrans
4.0K    pg_tblspc
4.0K    pg_twophase
4.0K    PG_VERSION
129M    pg_xlog
20K     postgresql.conf
4.0K    postmaster.opts
4.0K    postmaster.pid


Thank you very much!

 

Regards,

Kiruba

Re: Giant Postgresql Database Cluster Folder

От
raghu ram
Дата:


On Tue, Apr 10, 2012 at 11:21 AM, Kiruba suthan <kirubasuthan@gmail.com> wrote:


 

We are using Postgresql Database server to host DB for an analytical tool I am working on. Basically this DB has metrics about files. Row count of few tables are more than 18 million. But the content of the tables are basic data types like numbers, date & string. No binary data is stored. The size of the DB is around 1 GB when taken a full dump.

 

We query the DB using complex views to get reports and most of the result set of these queries are quite huge (row count in hundreds of thousand or in million).

 

The size of the DB Cluster Folder varies between 400GB to 600GB which is unreasonably huge for the actual data. It is eating up all disk-space in the server. When I create a fresh DB from the dump in a new server the size of the DB cluster folder is around 2.3 GB which is very reasonable to me.


Experts,

Could you help me how to clean up DB Cluster folder and reclaim disk space please? And please give me some insight into how data is organized in DB Cluster and what should I do to avoid this happening again?

 

Size of directories under DB Cluster Folder is mentioned below

[user@server DB_CLUSER_DATA]$ du -ksh *
407G    base
316K    global
49M     pg_clog
4.0K    pg_hba.conf
4.0K    pg_ident.conf
120K    pg_multixact
12K     pg_notify
32K     pg_stat_tmp
88K     pg_subtrans
4.0K    pg_tblspc
4.0K    pg_twophase
4.0K    PG_VERSION
129M    pg_xlog
20K     postgresql.conf
4.0K    postmaster.opts
4.0K    postmaster.pid




Perform VACUUM FULL on entire cluster to reclaim space. This operation will put Database Objects in Exclusive lock mode,so requesting you to do this activity in non-peak hours.

Once you completed above maintenance activity,then schedule manual VACUUM ANALYZE on every day once,so that dead rows space will be reused while inserting new records into a table.

--Raghu

Re: Giant Postgresql Database Cluster Folder

От
robin
Дата:
Given your data set is small so a dump/restore cycle is painless, you
should also consider upgrading to the latest stable release.

I agree with Raghu that you need to vacuum your database periodically
to keep it's footprint under control, but rather than doing the VACUUM
commands via scheduled activity you should use the auto vacuum daemon
unless you have specific needs.  You probably do need to do the first
vacuum by hand with no users connected (or do a dump/restore cycle which
is perhaps quicker with a system as bloated as yours).

http://www.postgresql.org/docs/8.2/static/routine-vacuuming.html#AUTOVACUUM

Given you missed the need to vacuum I would read the whole routine
maintenance section in case there are other activities that you should
be performing to keep your data safe and available:

http://www.postgresql.org/docs/8.2/static/maintenance.html

Hope this helps,

Robin


On Tue, 10 Apr 2012 11:44:08 +0530, raghu ram wrote:
> On Tue, Apr 10, 2012 at 11:21 AM, Kiruba suthan  wrote:
>
>> Could you help me how to clean up DB Cluster folder and reclaim disk
>> space please? And please give me some insight into how data is
>> organized in DB Cluster and what should I do to avoid this happening
>> again?