Andy Osborne <andy@sift.co.uk> writes:
> One of our databases crashed yesterday with a bug that looks
> a lot like the non superuser vacuum issue that 7.2.3 was
> intended to fix, although we do our vacuum with a user that
> has usesuper=t in pg_user so I guess it's not that simple.
> FATAL 2: open of /u0/pgdata/pg_clog/0726 failed: No such file or directory

What range of file names do you actually see in pg_clog?

The fixes in 7.2.3 were for cases that would try to access
already-removed clog segments (file numbers less than what's present).
In this case the accessed file name is large enough that I'm thinking
the problem is due to a garbage transaction number being passed to the
transaction-status-check code. So my bet is on physical data corruption
in the table that was causing the problem. It turns out that the first
detectable symptom of a trashed tuple header is often a failure like
this :-(.
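To make the file-number reasoning concrete, here is a rough Python sketch of how a transaction number maps to a clog segment name. It assumes the default 8 kB block size, two status bits per transaction, 32 pages per segment, and four-hex-digit file names; check those against clog.c for your build before relying on the numbers. The function names and the sample directory listing are made up for illustration.

```python
# Assumed layout constants; verify against your build's clog.c.
BLCKSZ = 8192                 # default block size in bytes
XACTS_PER_BYTE = 4            # two status bits per transaction
PAGES_PER_SEGMENT = 32        # clog pages per segment file
XACTS_PER_SEGMENT = BLCKSZ * XACTS_PER_BYTE * PAGES_PER_SEGMENT  # 1,048,576


def segment_name(xid):
    """Name of the clog segment file holding the status of this xid."""
    return "%04X" % (xid // XACTS_PER_SEGMENT)


def xid_range(name):
    """First and last transaction numbers covered by a segment file."""
    seg = int(name, 16)
    first = seg * XACTS_PER_SEGMENT
    return first, first + XACTS_PER_SEGMENT - 1


def classify(requested, present):
    """Compare a requested segment against the files seen in pg_clog."""
    req = int(requested, 16)
    nums = sorted(int(n, 16) for n in present)
    if req in nums:
        return "present"
    if not nums:
        return "unknown"
    if req < nums[0]:
        return "below range (possibly an already-removed segment)"
    if req > nums[-1]:
        return "above range (possibly a garbage transaction number)"
    return "gap inside range"


if __name__ == "__main__":
    print(xid_range("0726"))
    # Hypothetical listing; substitute the names you actually see in pg_clog.
    print(classify("0726", ["0000", "0001", "0002"]))
```

Under those assumptions, file 0726 would cover transaction numbers from about 1.9 billion upward, so unless the installation has really run that many transactions, the number being looked up is most likely garbage.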
You didn't happen to make a physical copy of the news table before
dropping it, did you? It'd be interesting to examine the remains.

So far, the cases I have seen like this all seem to be due to hardware
faults, but we've seen it just often enough to make me wonder if there
is a software issue too.

			regards, tom lane