Discussion: vacuum can't find clog

vacuum can't find clog

From: "Gregory S. Williamson"
Date:
This is on a Linux box, postgres 8.0.3 (yes, I know it's out of date, but it was a test setup which seems to have
become live and I have not had time to upgrade the thing) which runs a daily process to crunch some logs and do
some postGIS processing on them.

Tonight I got this message from my program:

Skipping the copy to disk
Done at Sat Jul  1 02:32:37 2006
Starting vacuum at Sat Jul  1 02:32:41 2006
Can't do the final vacuum and stats! ERROR:  could not access status of transaction 1081340514
SQL is <VACUUM ANALYZE;>

Running a vacuum verbose in the afflicted database got this complaint:
DETAIL:  0 dead row versions cannot be removed yet.
There were 19 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  analyzing "pg_catalog.pg_operator"
INFO:  "pg_operator": scanned 14 of 14 pages, containing 651 live rows and 0 dead rows; 651 rows in sample, 651
estimatedtotal rows 
INFO:  vacuuming "public.weekly_log_data"
ERROR:  could not access status of transaction 1081340514
DETAIL:  could not open file "/data/postgres/psql8_test/pg_clog/0407": No such file or directory

Sure enough, there is no such beast:
[postgres@pogo cmnd_logs]$ ls -lt /data/postgres/psql8_test/pg_clog/
total 1716
-rw-------    1 postgres postgres   155648 Jul  1 02:36 0007
-rw-------    1 postgres postgres   262144 Jun 15 02:09 0006
-rw-------    1 postgres postgres   262144 Apr 12 05:27 0005
-rw-------    1 postgres postgres   262144 Mar 12 02:41 0004
-rw-------    1 postgres postgres   262144 Feb 13 00:05 0003
-rw-------    1 postgres postgres   262144 Dec 28  2005 0002
-rw-------    1 postgres postgres   262144 Dec 18  2005 0001

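Each pg_clog segment is 256 kB and, at two status bits per transaction, covers 1,048,576 transaction IDs, so the
file a given xid maps to is just the xid divided by 1,048,576, written in hex. A rough check in psql (plain
arithmetic using only the built-in to_hex() and lpad() functions, nothing installation-specific) shows why that
xid asks for a segment far beyond anything that exists here:

SELECT lpad(to_hex(1081340514 / 1048576), 4, '0');  -- '0407', the missing segment
SELECT 8 * 1048576 - 1;                              -- 8388607, the highest xid segment 0007 can cover

So the xid in the error is roughly a billion transactions past anything this cluster has actually used.
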
=========================

I can reload the data with some tedium but not much effort, but I am wondering if I might have run across a known
bug in 8.0.3? Are there any things I should look at? Or should I just start over on my week's logs on a newer, more
up-to-date version? (It's less than 5 hours of computer time to crunch one day of this stuff, and only a few
minutes of operator time, so I can easily cut my losses and run...)

TIA,

Greg Williamson
DBA
GlobeXplorer LLC

Re: vacuum can't find clog

From: Tom Lane
Date:
"Gregory S. Williamson" <gsw@globexplorer.com> writes:
> INFO:  vacuuming "public.weekly_log_data"
> ERROR:  could not access status of transaction 1081340514
> DETAIL:  could not open file "/data/postgres/psql8_test/pg_clog/0407": No such file or directory

This is a data-corruption problem, ie, something has stomped on the xmin
or xmax field of a tuple.  (It may or may not be relevant that the high
order half of that number looks like ASCII "s@" ...)  Since xmin/xmax
are the first fields looked at that can be checked with any specificity,
"could not access status of transaction" is a common symptom even when
most of a page has been trashed :-(
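
(The same thing is easy to see from psql using only the built-in to_hex() and chr() functions: 1081340514 is
0x4073F262, and the two high-order bytes are 0x73 and 0x40, i.e. 's' and '@'.)

SELECT to_hex(1081340514);   -- '4073f262'
SELECT chr(115) || chr(64);  -- 0x73 is 's', 0x40 is '@'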

I don't recall whether we've fixed any bugs post-8.0.3 that might
explain this.  My experience is that if you inquire in any detail
you tend to find symptoms that look more like OS- or hardware-level
problems than anything that looks like it could have come from a
Postgres bug ... for instance, I recall finding a block of mail-message
text smack in the middle of a Postgres file in one case.

If you care to investigate exactly what happened, you could break out
pg_filedump and other implements of destruction and see if you can
determine exactly what's bogus in that table.  You could probably
also recover whatever data hasn't been overwritten, but the extent
of damage is impossible to guess at this point.
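
For what it's worth, locating the underlying heap file to feed to pg_filedump takes only a couple of ordinary
catalog lookups; the table name below is the one from the error message, everything else is stock:

-- oid of the current database, and relfilenode of the suspect table
SELECT d.oid AS dboid, c.relfilenode
  FROM pg_database d, pg_class c
 WHERE d.datname = current_database()
   AND c.relname = 'weekly_log_data';
-- the table's heap is then $PGDATA/base/<dboid>/<relfilenode>
-- (with .1, .2, ... extensions if it has grown past 1 GB)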

Otherwise, you might just consider this a wake-up call to update your
Postgres and kernel, and maybe run some memory and disk testing.

            regards, tom lane