Обсуждение: orphaned PGDATA/base/ subdirectories
setup: postgresql 8.1.8, x86_64, gnu/linux
I recently used createdb -T to copy a large database and the process
failed due to disk full, but the space wasn't freed after that failure.
That led me to wonder how much of my disk usage was from failed or
interrupted operations.
In the 8.1.8 manual, Chapter 50, I read:
For each database in the cluster there is a subdirectory within
PGDATA/base, named after the database's OID in pg_database. This
subdirectory is the default location for the database's files;
in particular, its system catalogs are stored there.
So, in pg_database I see:
rkh@csb-dev=> select oid,datname from pg_database order by 1;
oid | datname
---------+-----------------------------
1 | template1
10792 | template0
10793 | postgres
1235174 | csb_02_02_2007
1379984 | csb-dev
2121234 | csb
2296808 | csb-dev-snapshot-2007-03-07
2422198 | csb-dev-snapshot-2007-03-08
2558026 | csb-dev-snapshot-2007-03-13
(9 rows)
And in PGDATA/base:
postgres@csb:/srv/postgresql/base> ls -l
total 207
drwx------ 3 postgres postgres 2640 2007-03-12 22:07 1
drwx------ 2 postgres postgres 2576 2007-02-16 17:21 10792
drwx------ 3 postgres postgres 2640 2007-03-12 20:48 10793
drwx------ 3 postgres Bioinfo 21472 2007-03-12 20:47 1235174
drwx------ 2 postgres Bioinfo 3872 2006-10-19 21:25 1363637 ?
drwx------ 2 postgres Bioinfo 1888 2006-10-20 08:32 1371793 ?
drwx------ 3 postgres Bioinfo 31336 2007-03-12 20:48 1379984
drwx------ 3 postgres Bioinfo 21336 2007-03-12 20:51 2121234
drwx------ 3 postgres Bioinfo 30720 2007-03-13 08:57 2296808
drwx------ 3 postgres Bioinfo 30680 2007-03-12 20:47 2422198
drwx------ 2 postgres Bioinfo 21304 2007-03-12 18:50 2549963 ?
drwx------ 2 postgres Bioinfo 20504 2007-03-12 20:47 2549977 ?
drwx------ 2 postgres Bioinfo 4320 2007-03-12 20:47 2550014 ?
drwx------ 2 postgres Bioinfo 18448 2007-03-13 03:57 2558026
I've put ? after the base/ subdirs that have no correspondent in
pg_database.
The manual clearly says that a each row in pg_database will have a
corresponding subdirectory in base/. Is the converse also true, ie,
should every subdir in base/ also have a row in pg_database? If so, am
I safe to rm the unreferenced subdirectories?
FWIW, I know that I've interrupted (^C) a few createdb -T operations
during the course of this installation, and I recently had the disk full
error. My best guess is that these orphaned directories result from
those interruptions and failure.
Thanks,
Reece
--
Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0
./universe -G 6.672e-11 -e 1.602e-19 -protonmass 1.673e-27 -uspres bush
kernel warning: universe consuming too many resources. Killing.
universe killed due to catastrophic leadership. Try -uspres carter.
Reece Hart <reece@harts.net> writes:
> The manual clearly says that a each row in pg_database will have a
> corresponding subdirectory in base/. Is the converse also true, ie,
> should every subdir in base/ also have a row in pg_database? If so, am
> I safe to rm the unreferenced subdirectories?
Yes.
> FWIW, I know that I've interrupted (^C) a few createdb -T operations
> during the course of this installation, and I recently had the disk full
> error. My best guess is that these orphaned directories result from
> those interruptions and failure.
Hm ... 8.1 and up contain code that tries to remove the copied files if a
CREATE DATABASE fails midway through, but it seems it's not working for
you.
It strikes me that if a control-C arrives while the thing is already
doing remove_dbtablespaces(), that would kick it out of that code,
allowing some files to go unreclaimed --- are you in the habit of
banging on ^C repeatedly? I couldn't reproduce such a problem in a
small amount of testing though.
regards, tom lane
On Tue, 2007-03-13 at 17:44 -0400, Tom Lane wrote:
You mean C-c doesn't work like elevator buttons?
It wouldn't surprise me that I hit repeatedly. I certainly remember at least one case where it seemed like I was being ignored and hit C-c repeatedly. It's also quite possible that I logged out from ssh (with ~.), and I have no idea what signal (signals?) that ultimately sends. And, last night I sent KILL when I was convinced that I was being ignored and was about to restart the cluster anyway, for which I'd expect remnants. However, I'm certain of the observation that disk full led to remnants of a partial copy. That job was running in the background when I wasn't logged in.
Thanks,
Reece
are you in the habit of
banging on ^C repeatedly? I couldn't reproduce such a problem in a
small amount of testing though.
You mean C-c doesn't work like elevator buttons?
It wouldn't surprise me that I hit repeatedly. I certainly remember at least one case where it seemed like I was being ignored and hit C-c repeatedly. It's also quite possible that I logged out from ssh (with ~.), and I have no idea what signal (signals?) that ultimately sends. And, last night I sent KILL when I was convinced that I was being ignored and was about to restart the cluster anyway, for which I'd expect remnants. However, I'm certain of the observation that disk full led to remnants of a partial copy. That job was running in the background when I wasn't logged in.
Thanks,
Reece
-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0 ./universe -G 6.672e-11 -e 1.602e-19 -protonmass 1.673e-27 -uspres bush kernel warning: universe consuming too many resources. Killing. universe killed due to catastrophic leadership. Try -uspres carter. |
Reece Hart wrote: > On Tue, 2007-03-13 at 17:44 -0400, Tom Lane wrote: > > > are you in the habit of > > banging on ^C repeatedly? I couldn't reproduce such a problem in a > > small amount of testing though. > > You mean C-c doesn't work like elevator buttons? No, it's much better because you can actually get results quicker if you press it several times. On the other hand it would be scary to have the elevator "change its mind" on the middle of carrying you to 20th and suddenly opening the floor for dumping you to the basement instead. Which one would think is pretty much what must have happened here ... -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Tue, 2007-03-13 at 18:16 -0400, Alvaro Herrera wrote:
Actually, I got off at the floor 7-and-a-half and I discovered a great view through Tom Lane's eyeballs. Here's what it looks like:
Jeez, I have the postgresql source memorized (all versions)... my hands move rapidly between 5 keyboards, I can watch 5 monitors at the same time, and I can respond to 15 emails while coding on 3 branches and running regression tests... this is great!
-Reece
On the other hand it would be scary to have the elevator "change its
mind" on the middle of carrying you to 20th and suddenly opening the
floor for dumping you to the basement instead. Which one would think is
pretty much what must have happened here ...
Actually, I got off at the floor 7-and-a-half and I discovered a great view through Tom Lane's eyeballs. Here's what it looks like:
Jeez, I have the postgresql source memorized (all versions)... my hands move rapidly between 5 keyboards, I can watch 5 monitors at the same time, and I can respond to 15 emails while coding on 3 branches and running regression tests... this is great!
-Reece
-- Reece Hart, http://harts.net/reece/, GPG:0x25EC91A0 ./universe -G 6.672e-11 -e 1.602e-19 -protonmass 1.673e-27 -uspres bush kernel warning: universe consuming too many resources. Killing. universe killed due to catastrophic leadership. Try -uspres carter. |
Reece Hart wrote:
> On Tue, 2007-03-13 at 18:16 -0400, Alvaro Herrera wrote:
>
>> On the other hand it would be scary to have the elevator "change its
>> mind" on the middle of carrying you to 20th and suddenly opening the
>> floor for dumping you to the basement instead. Which one would think
>> is
>> pretty much what must have happened here ...
>
>
> Actually, I got off at the floor 7-and-a-half and I discovered a great
> view through Tom Lane's eyeballs. Here's what it looks like:
>
> Jeez, I have the postgresql source memorized (all versions)... my hands
> move rapidly between 5 keyboards, I can watch 5 monitors at the same
> time, and I can respond to 15 emails while coding on 3 branches and
> running regression tests... this is great!
You insulting buffoon! That is 6 keyboards, 7 monitors and 31 emails!
Joshua D. Drake
>
> -Reece
>
--
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/