Discussion: Cannot vacuum! Stops on first table, pg_type
Hi!

This is really strange. Starting from the beginning: I was running postmaster with its logs redirected to a file (on a different filesystem). This had been going OK for quite some time. Suddenly, it filled up quickly, and I had to rm it and restart the server this Friday. Today it happened again; the log file area was filled up completely by postgres' 700 MByte log file. I threw it away immediately, so unfortunately I don't have these logs handy (I think I can find some on tape, though... umm...). I have set up this machine to use rotatelogs(8) (from apache) now, like on most of my newer machines :)

Also, when this happened, two postgres processes ran amok, filling up the data directory with pg_temp.$$* files: about 25000 files (more or less empty, if memory serves me) before I realized this had happened (maybe a minute or two...?). The server ran out of file descriptors; it was really hot for a while :)

I shut down postmaster and removed them, since nothing worked anymore while they were there (I guess postgres couldn't create more tempfiles in that dir :)

Now, $PGDATA/pg_log is about 480 MB. Hardly normal, eh?

    # ls -l /usr/local/pgsql/data/pg_log
    -rw-------  1 pgsql  pgsql  471556096 Aug 29 02:09 /usr/local/pgsql/data/pg_log

But:

    # df -k
    Filesystem   1K-blocks     Used    Avail  Capacity  Mounted on
    ...
    /dev/da0s1h    1986495   120813  1706763        7%  /usr/local/pgsql/data

Used 120 MB, but the file is 480 MB? One of those files with "holes" in them, I presume?

Anyway, please note that the PGDATA file system was never filled up, only the log FS.

Long story, sorry, but I thought it would be best if no details were left out...

The problem now: after the last incident, I cannot vacuum one database (the big one)... other DBs are all right, but that doesn't help me...
:-/

Vacuum stops on the very first table to vacuum, pg_type:

    StartTransactionCommand
    query: vacuum;
    ProcessUtility: vacuum pg_type;
    DEBUG:  --Relation pg_type--
    DEBUG:  Pages 6: Changed 0, Reapped 2, Empty 0, New 0; Tup 354: Vac 22, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 105, MaxLen 109; Re-using: Free/Avail. Space 8160/176; EndEmpty/Avail. Pages 0/1. Elapsed 0/0 sec.

Here it can sit for hours (believe me, I've tried!) eating CPU time (top(1) reports 95-100%).

Killing the vacuuming psql off (kill -TERM) gives:

    proc_exit(0) [#0]
    shmem_exit(0) [#0]
    exit(0)

It seems the database and its application are working OK, but my guess is that some indices are probably broken, so things may fail that I cannot see immediately, and I can't vacuum to fix this...

Can I do anything besides pg_dump; dropdb; createdb; psql < dump.sql? This server is live, and I'd rather not take it down for more than seconds, if possible. Also, this should really be recoverable, right? What about the large pg_log file?

I use this combo:

    FreeBSD-3.5 RELEASE
    Postgres 6.5.2

-- 
Palle
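[Editor's note: the 480 MB file on a filesystem reporting only ~120 MB used is consistent with a sparse file, as the poster guesses. The effect is easy to demonstrate with a self-contained shell sketch (the demo filename is made up; the exact allocated size depends on the filesystem's block size):

```shell
# Create a sparse file: seek 10 MB into an empty file and write a single
# byte. The skipped-over "hole" is never allocated on disk, so the
# apparent size (ls -l) far exceeds the allocated size (du -k / df).
dd if=/dev/zero of=/tmp/sparse.demo bs=1 count=1 seek=10485760 2>/dev/null

ls -l /tmp/sparse.demo   # apparent size: 10485761 bytes (~10 MB)
du -k /tmp/sparse.demo   # allocated size: only a few KB

rm /tmp/sparse.demo
```

This is why `ls -l` on pg_log and `df -k` on its filesystem can disagree so dramatically without anything being corrupt.]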
check out the REINDEX command:

    http://www.postgresql.org/docs/user/sql-reindex.htm

that may/may not help ...

On 29 Aug 2000, Palle Girgensohn wrote:
> [snip]

Marc G. Fournier                   ICQ#7615664               IRC Nick: Scrappy
Systems Administrator @ hub.org
primary: scrappy@hub.org           secondary: scrappy@{freebsd|postgresql}.org
(Sorry for double posting, not used to this mail prog yet)

The Hermit Hacker <scrappy@hub.org> writes:
> check out the REINDEX command:
>
>     http://www.postgresql.org/docs/user/sql-reindex.htm
>
> that may/may not help ...

REINDEX is not available in 6.5.2, unfortunately...

BTW, will dump/restore get everything back, even with the gigantic pg_log?

-- 
Palle
Hello again,

OK, I dumped && restored (had to fix a couple of duplicates that had been hidden by indices, not too bad), and updated to 6.5.3 + patches, so I guess that makes me happy for now. I never did like 6.5.2 anyway... ;-)

-- 
Palle
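[Editor's note: the cycle just described is the one the poster asked about earlier (pg_dump; dropdb; createdb; psql < dump.sql). A minimal operational sketch follows, for the archives; the database name "mydb" and the dump path are hypothetical, and this is untested against a 6.5-era server, so adjust for your installation and stop client traffic first:

```shell
# Dump/drop/recreate/restore cycle for one database, run as the
# postgres superuser. dropdb destroys the old database, so verify the
# dump file is complete and readable before proceeding!
pg_dump mydb > /backup/mydb.sql   # dump schema + data as SQL
dropdb mydb                       # destructive: removes the database
createdb mydb                     # recreate it empty
psql mydb < /backup/mydb.sql      # replay the dump

# Indices are rebuilt from scratch during the replay, which is how
# rows previously hidden by broken indices surface as duplicates.
```
]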
On 29 Aug 2000, Palle Girgensohn wrote:
> OK, I dumped && restored (had to fix a couple of duplicates that had
> been hidden by indices, not too bad), and updated to 6.5.3 + patches,
> so I guess that makes me happy for now. I never did like 6.5.2
> anyway... ;-)

would highly recommend moving to v7.x ...
The Hermit Hacker wrote:
> would highly recommend moving to v7.x ...

mmm... We are not through testing our application just yet. There are some distinct SQL differences, like select distinct *on*, for example. But yes, we use pg7 for all development, and I really love it. This is a "release" machine, though, and it isn't easily "tampered with"... need thorough tests first... :(

-- 
Palle