Обсуждение: Unable to determine what has a particular OID
			
				Hi,
		
	We currently have a DB (8.3.7) in production that seem to be taking up more space on the HDD than was anticipated.  
After looking into the the data directory a particular file/oid is around 21 GB (the oid has 21 files), this OID can then be found in the pg_class table, the explanation for the size being that the table has around 12.5 Million rows.  During the inspection of the data directory I observed another OID that is around 8.5GB, but I cannot locate the corresponding entry in the pg_class table (even after dumping the table into a file and grep-ing for the OID in question).
Any other ideas for finding what the mystery OID is ?
Thanks in advance,
Andy
			
		Andy Dale <andy.dale@gmail.com> writes: > After looking into the the data directory a particular file/oid is around 21 > GB (the oid has 21 files), this OID can then be found in the pg_class table, > the explanation for the size being that the table has around 12.5 Million > rows. During the inspection of the data directory I observed another OID > that is around 8.5GB, but I cannot locate the corresponding entry in the > pg_class table (even after dumping the table into a file and grep-ing for > the OID in question). You should be looking at pg_class.relfilenode, not OID. See http://www.postgresql.org/docs/8.3/static/storage.html regards, tom lane
On 18 June 2010 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Dale <andy.dale@gmail.com> writes:You should be looking at pg_class.relfilenode, not OID. See
> After looking into the the data directory a particular file/oid is around 21
> GB (the oid has 21 files), this OID can then be found in the pg_class table,
> the explanation for the size being that the table has around 12.5 Million
> rows. During the inspection of the data directory I observed another OID
> that is around 8.5GB, but I cannot locate the corresponding entry in the
> pg_class table (even after dumping the table into a file and grep-ing for
> the OID in question).
http://www.postgresql.org/docs/8.3/static/storage.html
regards, tom lane
Ok, but when I have dumped the pg_class table into a file like so:
-[ RECORD 1 ]--+----------------------------------------------------------------
relname        | pg_type
relnamespace   | 11
reltype        | 71
relowner       | 10
relam          | 0
relfilenode    | 1247
reltablespace  | 0
relpages       | 19
reltuples      | 796
reltoastrelid  | 0
reltoastidxid  | 0
relhasindex    | t
relisshared    | f
relkind        | r
relnatts       | 26
relchecks      | 0
reltriggers    | 0
relukeys       | 0
relfkeys       | 0
relrefs        | 0
relhasoids     | t
relhaspkey     | f
relhasrules    | f
relhassubclass | f
relfrozenxid   | 379
relacl         | {=r/postgres}
reloptions     | 
-[ RECORD 2 ]--+----------------------------------------------------------------
relname        | triggered_update_columns
relnamespace   | 11313
reltype        | 11477
relowner       | 10
relam          | 0
relfilenode    | 11476
reltablespace  | 0
relpages       | 0
reltuples      | 0
reltoastrelid  | 0
Grep-ing  for the correct oid/filenode (110660 in my case), nothing is returned.
Cheers,
Andy 
			
		On Friday 18 June 2010 7:15:48 am Andy Dale wrote: > On 18 June 2010 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Andy Dale <andy.dale@gmail.com> writes: > > > After looking into the the data directory a particular file/oid is > > > around > > > > 21 > > > > > GB (the oid has 21 files), this OID can then be found in the pg_class > > > > table, > > > > > the explanation for the size being that the table has around 12.5 > > > Million rows. During the inspection of the data directory I observed > > > another OID that is around 8.5GB, but I cannot locate the corresponding > > > entry in the pg_class table (even after dumping the table into a file > > > and grep-ing for the OID in question). > > > > You should be looking at pg_class.relfilenode, not OID. See > > http://www.postgresql.org/docs/8.3/static/storage.html > > > > regards, tom lane > > Ok, but when I have dumped the pg_class table into a file like so: > > -[ RECORD 1 > ]--+---------------------------------------------------------------- > relname | pg_type > relnamespace | 11 > reltype | 71 > relowner | 10 > relam | 0 > relfilenode | 1247 > reltablespace | 0 > relpages | 19 > reltuples | 796 > reltoastrelid | 0 > reltoastidxid | 0 > relhasindex | t > relisshared | f > relkind | r > relnatts | 26 > relchecks | 0 > reltriggers | 0 > relukeys | 0 > relfkeys | 0 > relrefs | 0 > relhasoids | t > relhaspkey | f > relhasrules | f > relhassubclass | f > relfrozenxid | 379 > relacl | {=r/postgres} > reloptions | > -[ RECORD 2 > ]--+---------------------------------------------------------------- > relname | triggered_update_columns > relnamespace | 11313 > reltype | 11477 > relowner | 10 > relam | 0 > relfilenode | 11476 > reltablespace | 0 > relpages | 0 > reltuples | 0 > reltoastrelid | 0 > > Grep-ing for the correct oid/filenode (110660 in my case), nothing is > returned. > > Cheers, > > Andy Try: SELECT oid,* from pg_database ; I suspect the relfilnode you are looking at is another database in the cluster. -- Adrian Klaver adrian.klaver@gmail.com
Andy Dale <andy.dale@gmail.com> writes: > On 18 June 2010 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> You should be looking at pg_class.relfilenode, not OID. See >> http://www.postgresql.org/docs/8.3/static/storage.html > Ok, but when I have dumped the pg_class table into a file like so: > ... > Grep-ing for the correct oid/filenode (110660 in my case), nothing is > returned. Hm. You're sure you're looking at pg_class in the right database? If there really isn't a pg_class entry matching that relfilenode, then the files are orphans and can be removed. There are some scenarios where orphan files can be left behind, but they generally involve database crashes ... have you had any of those? regards, tom lane
Try:
SELECT oid,* from pg_database ;
I suspect the relfilnode you are looking at is another database in the cluster.
--
Adrian Klaver
adrian.klaver@gmail.com
Maybe I am possibly doing that.  I did a du -h --max-depth=1 in the base/ within the data directory.  This showed a single folder that contained ~34 GB.  On performing a ls in this, it showed 2 oids/filenodes that had more than 1 GB (multiple files with .<number> at the end).  The large table I mentioned in the original post was in this folder, so I assumed that the rest of the files in the same directory belonged to the same DB.
Have I made an incorrect assumption ? 
			
		On 18 June 2010 16:37, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Dale <andy.dale@gmail.com> writes:
> On 18 June 2010 16:06, Tom Lane <tgl@sss.pgh.pa.us> wrote:>> You should be looking at pg_class.relfilenode, not OID. See
>> http://www.postgresql.org/docs/8.3/static/storage.html> Ok, but when I have dumped the pg_class table into a file like so:> ...> Grep-ing for the correct oid/filenode (110660 in my case), nothing isHm. You're sure you're looking at pg_class in the right database?
> returned.
If there really isn't a pg_class entry matching that relfilenode,
then the files are orphans and can be removed. There are some scenarios
where orphan files can be left behind, but they generally involve
database crashes ... have you had any of those?
regards, tom lane
To the best of my knowledge we have not had any DB crashes as yet.  I am also replicating the DB using slony, could this be the cause of these weird orphans ? (it is neither of the sl_log_ tables)
			
		Andy Dale <andy.dale@gmail.com> writes:
>> Try:
>> SELECT oid,* from pg_database ;
>>
>> I suspect the relfilnode you are looking at is another database in the
>> cluster.
> Maybe I am possibly doing that.
There isn't any "maybe" involved here.  Follow Adrian's advice and
determine for sure exactly which database that subdirectory represents.
The subdirectory name will match the OID in the pg_database row.
It's possible for the same table OID to be in use in more than one
database, so the fact that you got a match to the other table doesn't
in itself prove you're looking at the right database.
            regards, tom lane
			
		To the best of my knowledge we have not had any DB crashes as yet. I am also replicating the DB using slony, could this be the cause of these weird orphans ? (it is neither of the sl_log_ tables)
Also looking at the timestamps of the files, they all seem to have been created within a very short period of time on the 6th of April
-rw------- 1 postgres postgres      57344 2010-04-06 03:10 110653
-rw------- 1 postgres postgres          0 2010-04-06 03:08 110654
-rw------- 1 postgres postgres      40960 2010-04-06 03:08 110657
-rw------- 1 postgres postgres       8192 2010-04-06 03:08 110658
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:10 110660
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:12 110660.1
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:13 110660.2
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:15 110660.3
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:16 110660.4
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:18 110660.5
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:20 110660.6
-rw------- 1 postgres postgres 1073741824 2010-04-06 03:21 110660.7
-rw------- 1 postgres postgres   26566656 2010-04-06 03:21 110660.8
-rw------- 1 postgres postgres     180224 2010-04-06 03:21 110661
-rw------- 1 postgres postgres       8192 2010-04-06 03:08 110662
-rw------- 1 postgres postgres       8192 2010-04-06 03:08 110663
-rw------- 1 postgres postgres      16384 2010-04-06 03:21 110664
-rw------- 1 postgres postgres  113254400 2010-04-06 03:25 110763
-rw------- 1 postgres postgres          0 2010-04-06 03:25 110765
-rw------- 1 postgres postgres 1073741824 2010-06-03 22:17 112806
On 18 June 2010 17:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Dale <andy.dale@gmail.com> writes:>> Try:
>> SELECT oid,* from pg_database ;
>>
>> I suspect the relfilnode you are looking at is another database in the
>> cluster.> Maybe I am possibly doing that.There isn't any "maybe" involved here. Follow Adrian's advice and
determine for sure exactly which database that subdirectory represents.
The subdirectory name will match the OID in the pg_database row.
It's possible for the same table OID to be in use in more than one
database, so the fact that you got a match to the other table doesn't
in itself prove you're looking at the right database.
regards, tom lane
OK, I have just run the query, and both are in the same database. 
Andy Dale <andy.dale@gmail.com> writes:
> OK, I have just run the query, and both are in the same database.
OK.  Given the narrow range of timestamps on the files, the most
likely bet here is that you're looking at the leftovers from a
"CREATE TABLE AS" or similar command that failed partway through and
for some reason didn't clean up the files it'd created.  As I mentioned,
this'd be unsurprising if there'd been a database crash or kill -9
or something similar involved.  Anyway I think you've done enough to
establish that "rm'ing" those files will be safe.  (But if you're
paranoid you might want to save them someplace first.)
            regards, tom lane
			
		On 18 June 2010 17:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andy Dale <andy.dale@gmail.com> writes:> OK, I have just run the query, and both are in the same database.OK. Given the narrow range of timestamps on the files, the most
likely bet here is that you're looking at the leftovers from a
"CREATE TABLE AS" or similar command that failed partway through and
for some reason didn't clean up the files it'd created. As I mentioned,
this'd be unsurprising if there'd been a database crash or kill -9
or something similar involved. Anyway I think you've done enough to
establish that "rm'ing" those files will be safe. (But if you're
paranoid you might want to save them someplace first.)
regards, tom lane
Thanks for confirming this, I will however save/backup the files before removing them (just in case anything goes wrong).
			
		Andy Dale wrote: > I will however save/backup the files before removing them (just in > case anything goes wrong). If you do go through with this, I would recommend doing a complete database dump afterwards using pg_dump/pg_dumpall. That should make it immediately obvious if you broke something with the change. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us
Greg Smith <greg@2ndquadrant.com> writes:
> Andy Dale wrote:
>> I will however save/backup the files before removing them (just in
>> case anything goes wrong).
> If you do go through with this, I would recommend doing a complete
> database dump afterwards using pg_dump/pg_dumpall.  That should make it
> immediately obvious if you broke something with the change.
Doing pg_dump would expose the mistake if you'd removed an actual
table's files.  But I'm not sure it would expose it if you removed
an index ...
            regards, tom lane
			
		Tom Lane wrote: > Doing pg_dump would expose the mistake if you'd removed an actual > table's files. But I'm not sure it would expose it if you removed > an index ... > Right, but in theory if you screwed up and accidentally deleted a file holding an index, you could recover from that in the possibly distant future by rebuilding it, with some pain but no expected loss. Whereas if you deleted some data by removing a file, you really want to know that's what you did immediately, so you can put it back before you forget where it all was at. Sometimes people who have fully setup replication for backup purposes ask me if they should continue saving pg_dump output somewhere. I think it's reasonable to generate a dump using it periodically whether or not you intend to save the result permanently, just as a paranoid sanity check that you can still read everything. I don't trust disks and filesystems that much. (If you're reading this and feel the need to write a pro-ZFS essay at this point, consider yourself trolled) -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support greg@2ndQuadrant.com www.2ndQuadrant.us