Data files became huge with no apparent reason

From: Dario Fumagalli
Date: Wed, 28 Aug 2002 08:45:39 +0200
[(Please disregard my previous post, this one is more up to date)]

Hello to all,

I've just returned from a two-week vacation and I'm now facing a problem
that is effectively blocking a critical database server.

I have a main database (proj_store) that holds 34 objects (tables and
sequences). Only one table has more than a few hundred records (the
products table, totalling 975 records). The backend is PostgreSQL 7.1.1 on
Caldera Linux OpenServer, compiled from source.

A full, uncompressed SQL dump (with full inserts) is about 3.4 MB.

Each day, a cron-driven program I wrote a year ago updates the database
from an MS-SQL 7.0 server (inserts and updates only, in the range of 2-5
per day).

So this database server (hosted on its own standalone, fairly high-end
corporate machine, a Compaq ML-370 PIII PC with RAID, 1 GB RAM, etc.)
should run very smoothly. And in fact it did so for at least a year.

But this is no longer the case. The database has grown abnormally, and the
"du -h" command now reports the following:


1.6M    ./data/base/1
1.5M    ./data/base/18719
470M    ./data/base/242014
11M     ./data/base/46821
1.7M    ./data/base/197097
2.3M    ./data/base/279236
488M    ./data/base
750k    ./data/global
16M     ./data/pg_xlog
505M    ./data

And "ls -l" reports:

drwx------   2 postgres database     2048 Jul 17 16:42 1
drwx------   2 postgres database     2048 May 22  2001 18719
drwx------   2 postgres database     2048 Feb 14  2002 197097
drwx------   2 postgres database     3072 Jul  3 08:35 242014
drwx------   2 postgres database     2048 Jul 17 16:16 279236
drwx------   2 postgres database     3072 Jan 29  2002 46821


That is, this DB is now 500+ MB!

And to think that I perform VACUUM ANALYZE at least weekly!
The only period I didn't VACUUM the database was during the
aforementioned two-week vacation.

The database filled its partition and the backend crashed.
This morning, when I was told of the malfunction, I tried to restart the
daemon with:

/etc/rc.d/init.d/postgres start

as always, and it blew up.
Then I tried starting it manually and got a FATAL 2 error (searching the
archives revealed that this means "disk full", and from that I understood
that the database had grown too large).

So I freed some disk space (52 MB) and the backend finally restarted.
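
(For the record: "df -k" on the data partition plus a sorted "du" over
data/base make it easy to see where the space went. The /usr/local/pgsql
path below is just my layout:)

df -k /usr/local/pgsql/data                     # free space left on the partition
du -sk /usr/local/pgsql/data/base/* | sort -n   # per-database directories, largest last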


By the way, I have a mirrored development machine, last updated just
before I left for vacation.

There, the "du -h" command reports the following "reassuring" stats:

1.6M    ./1
1.5M    ./18719
7.3M    ./250600
11M     ./46821
1.7M    ./197097
2.2M    ./259865
26M     .

(Some directories have different names, but I assure you the contents are
the same.)

Now, the questions are:
- How is this exceptional growth possible (N.B. the database server has
all the default values in its configuration files, i.e. it was not
"optimized")?
- More urgently: why does it now take 30 seconds to perform a select
count(*) on 900 records (no other clients connected), while in the past
it was almost instantaneous? All database operations are now dog slow.
And I have to live with these 52 MB until the technician comes with a
new disk (1 week... he is on vacation now).
- Why does the backend crash immediately if I try to VACUUM (ANALYZE)
it? I think the backend claims additional disk space for this operation
and fills the disk again, but I'm not sure.
- And last, but not least... is it possible to recover from the situation
without losing data (my backup is 3 weeks old)? I'm able to start the
daemon and perform SQL operations, but I don't know how to make
PostgreSQL release the disk space after I dump the database in order
to reload it. And I fear an initdb would destroy db users and their
privileges.


Thanks in advance,
Dario Fumagalli


Re: Data files became huge with no apparent reason

From: Martijn van Oosterhout
Date:
On Wed, Aug 28, 2002 at 08:45:39AM +0200, Dario Fumagalli wrote:
> [(Please disregard my previous post, this one is more up to date)]
>
> Hello to all,
>
> I've just returned from a two-week vacation and I'm now facing a problem
> that is effectively blocking a critical database server.

[snip]

> That is, this DB is now 500+ MB!
>
> And to think that I perform VACUUM ANALYZE at least weekly!
> The only period I didn't VACUUM the database was during the
> aforementioned two-week vacation.

Weekly?!? Daily at the very least. Put it in a cronjob.
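
For example, a line like this in the postgres user's crontab (the path and
database name are just examples, adjust to your install):

# VACUUM ANALYZE proj_store every night at 03:00
0 3 * * * /usr/local/pgsql/bin/vacuumdb --analyze proj_store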

> The database filled its partition and the backend crashed.
> This morning, when I was told of the malfunction, I tried to restart the
> daemon with:
>
> /etc/rc.d/init.d/postgres start
>
> as always, and it blew up.
> Then I tried starting it manually and got a FATAL 2 error (searching the
> archives revealed that this means "disk full", and from that I understood
> that the database had grown too large).

Well, all sorts of things go funny when the disk is full. You didn't mention
what version you're running.

> Now, the questions are:
> - How is this exceptional growth possible (N.B. the database server has
> all the default values in its configuration files, i.e. it was not
> "optimized")?

VACUUM more often. Every UPDATE or DELETE leaves dead row versions behind
that only VACUUM reclaims, so the more you do between vacuums, the bigger
your database gets.

> - More urgently: why does it now take 30 seconds to perform a select
> count(*) on 900 records (no other clients connected), while in the past
> it was almost instantaneous? All database operations are now dog slow.
> And I have to live with these 52 MB until the technician comes with a
> new disk (1 week... he is on vacation now).

Just do a vacuum on the largest table. Do an ls -l in the largest directory
and find the largest table. Do a VACUUM VERBOSE on just that table and post
the results.
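
The numeric file names are the tables' OIDs; you can map them back to names
and get a rough size ranking with something like this (the page counts are
only as fresh as the last vacuum/analyze, so treat them as a hint):

-- biggest relations first, sizes in 8 kB pages
SELECT relname, relpages FROM pg_class ORDER BY relpages DESC LIMIT 10;

-- then vacuum just the offender ("products" is a placeholder)
VACUUM VERBOSE products;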

> - Why does the backend crash immediately if I try to VACUUM (ANALYZE)
> it? I think the backend claims additional disk space for this operation
> and fills the disk again, but I'm not sure.

Please paste log output.

> - And last, but not least... is it possible to recover from the situation
> without losing data (my backup is 3 weeks old)? I'm able to start the
> daemon and perform SQL operations, but I don't know how to make
> PostgreSQL release the disk space after I dump the database in order
> to reload it. And I fear an initdb would destroy db users and their
> privileges.

If you want to do a complete restart, do a pg_dumpall to a file (check the
result). Drop all the databases and reload from the backup. That will free
everything.
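
Roughly like this (file locations are just examples; make sure the dump
goes to a disk with free space, and note that pg_dumpall also saves users
and privileges, so no initdb is needed):

pg_dumpall > /mnt/spare/all.sql        # dump everything, then sanity-check the file
dropdb proj_store                      # repeat for each user database
psql -f /mnt/spare/all.sql template1   # recreate and reload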

Hope this helps,

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.

Re: Data files became huge with no apparent reason

From: "Mario Weilguni"
Date:
> I've just returned from a two-week vacation and I'm now facing a problem
> that is effectively blocking a critical database server.
>
[snip]
> - And last, but not least... is it possible to recover from the situation
> without losing data (my backup is 3 weeks old)? I'm able to start the

Not having a daily backup for a critical database server is not a good
idea, no matter how stable the database is. Think of a hardware crash.

Some ideas:
1. First, you should try to make a filesystem backup of your database. That
means: stop the database, make a tar file of your data directory, and
restart the database. That way you'll have a chance to recover in case your
rescue efforts make things worse.
2. Use "vacuum verbose" and note the location where vacuum crashes. This
will make tracking down the error easier.
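
For both steps, something along these lines (paths and file names are just
examples, adjust to your install):

# 1. filesystem-level safety copy, taken with the database stopped
/etc/rc.d/init.d/postgres stop
tar czf /mnt/spare/pgdata-backup.tar.gz /usr/local/pgsql/data
/etc/rc.d/init.d/postgres start

# 2. vacuum with the client output captured, so the crash point is on record
psql proj_store -c "VACUUM VERBOSE;" 2>&1 | tee vacuum.log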