Discussion: PANIC: block 463 unfound during REDO after out of disk space failure during VACUUM
Hi everyone

Was running a VACUUM on a database on a partition which was running out of disk space. During VACUUM the server process died and failed to restart.

Running PostgreSQL 8.1.4

I basically want to get the system back up and running ASAP with as little data loss as possible. All and any help is greatly appreciated.

Here is output from error log:

Jan 11 15:02:32 marshall postgres[71515]: [2-1] WARNING: terminating connection because of crash of another server process
Jan 11 15:02:32 marshall postgres[71515]: [2-2] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server
Jan 11 15:02:32 marshall postgres[71515]: [2-3] process exited abnormally and possibly corrupted shared memory.
Jan 11 15:02:32 marshall postgres[71515]: [2-4] HINT: In a moment you should be able to reconnect to the database and repeat your command.
Jan 11 15:02:32 marshall postgres[67977]: [4-1] LOG: all server processes terminated; reinitializing
Jan 11 15:02:32 marshall postgres[73888]: [5-1] LOG: database system was interrupted at 2007-01-11 15:02:22 WST
Jan 11 15:02:32 marshall postgres[73888]: [6-1] LOG: checkpoint record is at 4D/AA7B784
Jan 11 15:02:32 marshall postgres[73888]: [7-1] LOG: redo record is at 4D/AA7B784; undo record is at 0/0; shutdown FALSE
Jan 11 15:02:32 marshall postgres[73888]: [8-1] LOG: next transaction ID: 376382676; next OID: 2891876
Jan 11 15:02:32 marshall postgres[73888]: [9-1] LOG: next MultiXactId: 44140; next MultiXactOffset: 91044
Jan 11 15:02:32 marshall postgres[73888]: [10-1] LOG: database system was not properly shut down; automatic recovery in progress
Jan 11 15:02:32 marshall postgres[73888]: [11-1] LOG: redo starts at 4D/AA7B7C8
Jan 11 15:02:32 marshall postgres[73889]: [5-1] FATAL: the database system is starting up
Jan 11 15:02:32 marshall postgres[73892]: [5-1] FATAL: the database system is starting up
Jan 11 15:02:39 marshall postgres[73909]: [5-1] FATAL: the database system is starting up
Jan 11 15:02:40 marshall postgres[73888]: [12-1] PANIC: block 463 unfound
Jan 11 15:02:41 marshall postgres[67977]: [5-1] LOG: startup process (PID 73888) was terminated by signal 6
Jan 11 15:02:41 marshall postgres[67977]: [6-1] LOG: aborting startup due to startup process failure

Thanks in advance

--
Warren Guy
System Administrator
CalorieKing - Australia
Tel: +618.9389.8777
Fax: +618.9389.8444
warren.guy@calorieking.com
www.calorieking.com
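For readers triaging a log like the one above, the key recovery facts (checkpoint location, where redo started, and the PANIC that killed the startup process) can be pulled out mechanically. The following is only a minimal sketch: the regular expression assumes the syslog layout shown in this post, and the names `LINE_RE`, `parse` and `summarize` are hypothetical helpers, not part of PostgreSQL.

```python
import re

# Assumed layout, taken from the log in this post:
#   "<date> <host> postgres[<pid>]: [<seq>-<part>] <LEVEL>: <message>"
LINE_RE = re.compile(
    r"postgres\[(?P<pid>\d+)\]: \[(?P<seq>\d+)-(?P<part>\d+)\] "
    r"(?P<level>[A-Z]+): +(?P<msg>.*)"
)


def parse(lines):
    """Yield (pid, level, message) for each line that carries a severity tag.

    Continuation lines such as "[2-3] process exited abnormally ..." have no
    severity tag and are skipped.
    """
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            yield int(m["pid"]), m["level"], m["msg"].strip()


def summarize(lines):
    """Collect the checkpoint location, redo start and PANIC message."""
    summary = {}
    for pid, level, msg in parse(lines):
        if level == "PANIC":
            summary["panic"] = msg
            summary["panic_pid"] = pid
        elif msg.startswith("checkpoint record is at "):
            summary["checkpoint"] = msg.rsplit(" ", 1)[1]
        elif msg.startswith("redo starts at "):
            summary["redo_start"] = msg.rsplit(" ", 1)[1]
    return summary


# Four lines copied from the log above.
SAMPLE = [
    "Jan 11 15:02:32 marshall postgres[73888]: [6-1] LOG: checkpoint record is at 4D/AA7B784",
    "Jan 11 15:02:32 marshall postgres[73888]: [11-1] LOG: redo starts at 4D/AA7B7C8",
    "Jan 11 15:02:40 marshall postgres[73888]: [12-1] PANIC: block 463 unfound",
    "Jan 11 15:02:41 marshall postgres[67977]: [5-1] LOG: startup process (PID 73888) was terminated by signal 6",
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The PANIC comes from PID 73888, the same process that logged "redo starts at", which is consistent with the failure happening during WAL replay rather than in a normal backend.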
Re: PANIC: block 463 unfound during REDO after out of disk space failure during VACUUM
From: "Christopher Kings-Lynne"
I'd just like to point out that Warren is a mate of mine :)

I recall a time when a related issue occurred years ago:

http://groups-beta.google.com/group/comp.databases.postgresql.hackers/browse_thread/thread/c97c853f640b9ac1/d6bc3c75eed6c2a4?q=could+not+access+status+of+transaction#d6bc3c75eed6c2a4

Not sure if it's a similar problem?

Chris

On 1/11/07, Warren Guy <warren.guy@calorieking.com> wrote:
> Hi everyone
>
> Was running a VACUUM on a database on a partition which was running out
> of disk space. During VACUUM the server process died and failed to restart.
>
> Running PostgreSQL 8.1.4
>
> I basically want to get the system back up and running ASAP with as
> little data loss as possible. All and any help is greatly appreciated.
>
> Here is output from error log:
>
> Jan 11 15:02:32 marshall postgres[71515]: [2-1] WARNING: terminating
> connection because of crash of another server process
> Jan 11 15:02:32 marshall postgres[71515]: [2-2] DETAIL: The postmaster
> has commanded this server process to roll back the current transaction
> and exit, because another server
> Jan 11 15:02:32 marshall postgres[71515]: [2-3] process exited
> abnormally and possibly corrupted shared memory.
> Jan 11 15:02:32 marshall postgres[71515]: [2-4] HINT: In a moment you
> should be able to reconnect to the database and repeat your command.
> Jan 11 15:02:32 marshall postgres[67977]: [4-1] LOG: all server
> processes terminated; reinitializing
> Jan 11 15:02:32 marshall postgres[73888]: [5-1] LOG: database system
> was interrupted at 2007-01-11 15:02:22 WST
> Jan 11 15:02:32 marshall postgres[73888]: [6-1] LOG: checkpoint record
> is at 4D/AA7B784
> Jan 11 15:02:32 marshall postgres[73888]: [7-1] LOG: redo record is at
> 4D/AA7B784; undo record is at 0/0; shutdown FALSE
> Jan 11 15:02:32 marshall postgres[73888]: [8-1] LOG: next transaction
> ID: 376382676; next OID: 2891876
> Jan 11 15:02:32 marshall postgres[73888]: [9-1] LOG: next MultiXactId:
> 44140; next MultiXactOffset: 91044
> Jan 11 15:02:32 marshall postgres[73888]: [10-1] LOG: database system
> was not properly shut down; automatic recovery in progress
> Jan 11 15:02:32 marshall postgres[73888]: [11-1] LOG: redo starts at
> 4D/AA7B7C8
> Jan 11 15:02:32 marshall postgres[73889]: [5-1] FATAL: the database
> system is starting up
> Jan 11 15:02:32 marshall postgres[73892]: [5-1] FATAL: the database
> system is starting up
> Jan 11 15:02:39 marshall postgres[73909]: [5-1] FATAL: the database
> system is starting up
> Jan 11 15:02:40 marshall postgres[73888]: [12-1] PANIC: block 463 unfound
> Jan 11 15:02:41 marshall postgres[67977]: [5-1] LOG: startup process
> (PID 73888) was terminated by signal 6
> Jan 11 15:02:41 marshall postgres[67977]: [6-1] LOG: aborting startup
> due to startup process failure
>
>
> Thanks in advance
>
> --
> Warren Guy
>
> System Administrator
> CalorieKing - Australia
> Tel: +618.9389.8777
> Fax: +618.9389.8444
> warren.guy@calorieking.com
> www.calorieking.com

--
Chris Kings-Lynne
Director
KKL Pty. Ltd.
Biz: +61 8 9328 4780
Mob: +61 (0)409 294078
Web: www.kkl.com.au
Warren Guy wrote:
> Hi everyone
>
> Was running a VACUUM on a database on a partition which was running out
> of disk space. During VACUUM the server process died and failed to restart.
>
> Running PostgreSQL 8.1.4

...

> Jan 11 15:02:39 marshall postgres[73909]: [5-1] FATAL: the database
> system is starting up
> Jan 11 15:02:40 marshall postgres[73888]: [12-1] PANIC: block 463 unfound
> Jan 11 15:02:41 marshall postgres[67977]: [5-1] LOG: startup process
> (PID 73888) was terminated by signal 6
> Jan 11 15:02:41 marshall postgres[67977]: [6-1] LOG: aborting startup
> due to startup process failure

You say "was running out of disk space" - does that mean it did run out of disk space? I don't see the error that caused this, just the results. That would suggest to me that something unusual caused this (or you clipped the log fragment too far down :-)

In any case, the first thing I'd try is to make your on-disk backups and set it up as though it's PITR recovery you're doing. That way you can stop the recovery before block 463 causes the failure. Oh, assuming you've got the space you need on your partition of course.

HTH
--
Richard Huxton
Archonet Ltd
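The two steps suggested above (take a file-level copy of the cluster first, then set it up as if for PITR with a stop point) could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested recovery procedure: the helper names `backup_data_dir` and `write_recovery_conf` are hypothetical, every path, the restore command and the target time are placeholders, and the server must be stopped before copying. `restore_command`, `recovery_target_time` and `recovery_target_inclusive` are recovery.conf settings documented for the 8.x series; whether replay actually stops before the record that references block 463 depends on where that record sits relative to the chosen target, so treat the stop point as something to experiment with on the copy.

```python
import shutil
from pathlib import Path


def backup_data_dir(data_dir, backup_dir):
    """Copy the whole (stopped) cluster directory before touching it.

    backup_dir should live on a partition with enough free space.
    symlinks=True copies symbolic links as links, so anything they point
    to outside data_dir (tablespaces, for example) is NOT copied and
    would need its own backup.
    """
    src, dst = Path(data_dir), Path(backup_dir)
    if dst.exists():
        raise FileExistsError(f"refusing to overwrite {dst}")
    shutil.copytree(src, dst, symlinks=True)
    return dst


def write_recovery_conf(data_dir, restore_command, target_time):
    """Write a recovery.conf asking recovery to stop at target_time."""
    lines = [
        f"restore_command = '{restore_command}'",
        f"recovery_target_time = '{target_time}'",
        # 'false' asks recovery to stop just before the target
        "recovery_target_inclusive = 'false'",
    ]
    path = Path(data_dir) / "recovery.conf"
    path.write_text("\n".join(lines) + "\n")
    return path


if __name__ == "__main__":
    # Placeholder values only; adjust to the real installation.
    backup_data_dir("/usr/local/pgsql/data", "/mnt/spare/pgdata-copy")
    write_recovery_conf(
        "/usr/local/pgsql/data",
        "cp /mnt/spare/wal_archive/%f %p",
        "2007-01-11 15:02:00 WST",
    )
```

Working against a copy keeps the original files intact, so a failed attempt with one stop point can be retried with an earlier one.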
Btw - "unfound"?? I think the English there might need to be improved :)

Chris

On 1/11/07, Richard Huxton <dev@archonet.com> wrote:
> Warren Guy wrote:
> > Hi everyone
> >
> > Was running a VACUUM on a database on a partition which was running out
> > of disk space. During VACUUM the server process died and failed to restart.
> >
> > Running PostgreSQL 8.1.4
>
> ...
>
> > Jan 11 15:02:39 marshall postgres[73909]: [5-1] FATAL: the database
> > system is starting up
> > Jan 11 15:02:40 marshall postgres[73888]: [12-1] PANIC: block 463 unfound
> > Jan 11 15:02:41 marshall postgres[67977]: [5-1] LOG: startup process
> > (PID 73888) was terminated by signal 6
> > Jan 11 15:02:41 marshall postgres[67977]: [6-1] LOG: aborting startup
> > due to startup process failure
>
> You say "was running out of disk space" - does that mean it did run out
> of disk space? I don't see the error that caused this, just the results.
> That would suggest to me that something unusual caused this (or you
> clipped the log fragment too far down :-)
>
> In any case, the first thing I'd try is to make your on-disk backups and
> set it up as though it's PITR recovery you're doing. That way you can
> stop the recovery before block 463 causes the failure. Oh, assuming
> you've got the space you need on your partition of course.
>
> HTH
> --
> Richard Huxton
> Archonet Ltd

--
Chris Kings-Lynne
Director
KKL Pty. Ltd.
Biz: +61 8 9328 4780
Mob: +61 (0)409 294078
Web: www.kkl.com.au