Thread: Possible database corruption - urgent

Possible database corruption - urgent

From

"Benjamin Krajmalnik"

Date:

January 7, 2013, 21:22:29

I have a situation where pg_xlog started growing until it filled up the disk drive.

I got alerted to the error and started investigating.

Checked the logs and I am seeing the following entry repeatedly:

2013-01-07 01:49:12 GMT ERROR: could not open file "base/16748/181979366_fsm": No such file or directory

2013-01-07 01:49:12 GMT CONTEXT: writing block 1 of relation base/16748/181979366_fsm

2013-01-07 01:49:12 GMT WARNING: could not write block 1 of base/16748/181979366_fsm

I checked the actual file system, and that file is indeed missing. The main relation file, 181979366, does exist.
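
The manual check described above can be sketched in Python. This is only an illustration, not part of the original report: the function name `list_forks` and the example directory are assumptions, and the sketch ignores the extra segment files (`.1`, `.2`, ...) that large relations have. It relies on the PostgreSQL 9.0 file layout, where a relation's main file carries no suffix and the free space map and visibility map use the `_fsm` and `_vm` suffixes.

```python
import os

# Fork suffixes in the PostgreSQL 9.0 on-disk layout: the main data file has
# no suffix, "_fsm" is the free space map, "_vm" is the visibility map.
FORK_SUFFIXES = {"main": "", "fsm": "_fsm", "vm": "_vm"}


def list_forks(db_dir, filenode):
    """Return {fork_name: exists} for one relfilenode in a database directory.

    db_dir is the per-database directory (e.g. <data dir>/base/16748);
    filenode is the numeric file name (e.g. 181979366).
    """
    return {
        name: os.path.isfile(os.path.join(db_dir, f"{filenode}{suffix}"))
        for name, suffix in FORK_SUFFIXES.items()
    }


# Example call (the data directory path is hypothetical):
#   list_forks("/path/to/data/base/16748", 181979366)
```

For the situation reported here, the result would be expected to show the main fork present and the `fsm` fork absent.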

Is there a way to get the system back up and running?

I stopped the postmaster and am moving the pg_xlog directory to a partition which has room left in it, but I need to resolve this missing file problem.
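
A common way to relocate pg_xlog is to move the directory while the server is stopped and leave a symbolic link at the old location. The sketch below shows that idea in Python under stated assumptions: the helper name `relocate_with_symlink` and the example paths are hypothetical, the server must already be shut down, and the sketch does nothing about the missing `_fsm` file.

```python
import os
import shutil


def relocate_with_symlink(src, dst):
    """Move directory src to dst and leave a symlink at src pointing to dst.

    Assumes nothing is writing to src (i.e. the postmaster is stopped).
    """
    # Refuse to act if src is already a symlink or is not a directory.
    if os.path.islink(src) or not os.path.isdir(src):
        raise ValueError(f"{src} must be a real directory")
    # Refuse to overwrite or nest inside an existing destination.
    if os.path.exists(dst):
        raise FileExistsError(f"{dst} already exists")
    shutil.move(src, dst)
    os.symlink(os.path.abspath(dst), src)


# Example call (both paths are hypothetical):
#   relocate_with_symlink("/path/to/data/pg_xlog", "/bigdisk/pg_xlog")
```

Ownership and permissions of the new location still have to match what the server expects; the sketch does not check them.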

Re: Possible database corruption - urgent

From

"Benjamin Krajmalnik"

Date:

January 7, 2013, 21:31:34

I forgot to mention – PostgreSQL 9.0 – my apologies.

Can I just recreate the file using touch so it exists and then restart PostgreSQL?

The system core dumped and was attempting to go into recovery mode.

2013-01-07 01:49:12 GMT ERROR: could not open file "base/16748/181979366_fsm": No such file or directory

2013-01-07 01:49:12 GMT CONTEXT: writing block 1 of relation base/16748/181979366_fsm

2013-01-07 01:49:12 GMT WARNING: could not write block 1 of base/16748/181979366_fsm

2013-01-07 01:49:12 GMT ERROR: could not open file "base/16748/181979366_fsm": No such file or directory

2013-01-07 01:49:12 GMT CONTEXT: writing block 1 of relation base/16748/181979366_fsm

2013-01-07 01:49:12 GMT WARNING: could not write block 1 of base/16748/181979366_fsm

From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Benjamin Krajmalnik
Sent: Monday, January 07, 2013 2:22 PM
To: pgsql-admin@postgresql.org
Subject: [ADMIN] Possible database corruption - urgent


Re: Possible database corruption - urgent

From

"Benjamin Krajmalnik"

Date:

January 7, 2013, 21:35:54

Sorry for the cut and paste error.

This is the log entry when the pg_xlog partition ran out of space:

2013-01-07 20:50:22 GMT [local]PANIC: could not write to file "pg_xlog/xlogtemp.49680": No space left on device

2013-01-07 20:50:22 GMT [local]STATEMENT: INSERT INTO tbltmptests (testhash, testtime, statusid, replytxt, replyval, groupid) V

2013-01-07 20:50:23 GMT LOG: server process (PID 49680) was terminated by signal 6: Abort trap

2013-01-07 20:50:23 GMT LOG: terminating any other active server processes

2013-01-07 20:50:23 GMT [local]WARNING: terminating connection because of crash of another server process

2013-01-07 20:50:23 GMT [local]DETAIL: The postmaster has commanded this server process to roll back the current transaction an

2013-01-07 20:50:23 GMT [local]HINT: In a moment you should be able to reconnect to the database and repeat your command.

2013-01-07 20:50:23 GMT [local]FATAL: the database system is in recovery mode

2013-01-07 20:50:23 GMT LOG: all server processes terminated; reinitializing

2013-01-07 20:50:24 GMT LOG: database system was interrupted; last known up at 2013-01-07 00:31:02 GMT

2013-01-07 20:50:24 GMT LOG: database system was not properly shut down; automatic recovery in progress

2013-01-07 20:50:24 GMT LOG: consistent recovery state reached at 52F/8CE57490

2013-01-07 20:50:24 GMT LOG: redo starts at 52F/7BABC118

2013-01-07 20:50:38 GMT [local]FATAL: the database system is in recovery mode

2013-01-07 20:50:53 GMT [local]FATAL: the database system is in recovery mode

2013-01-07 20:51:08 GMT [local]FATAL: the database system is in recovery mode

2013-01-07 20:51:24 GMT [local]FATAL: the database system is in recovery mode

2013-01-07 20:51:39 GMT [local]FATAL: the database system is in recovery mode

2013-01-07 20:51:54 GMT [local]FATAL: the database system is in recovery mode
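
When an excerpt like the one above runs to many lines, a per-severity count makes it easier to see what dominates. The following is a minimal sketch, assuming the line format shown in this thread (timestamp, `GMT`, an optional `[local]` tag, then the severity and a colon); the function name `count_severities` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   2013-01-07 20:50:23 GMT LOG: terminating any other active server processes
#   2013-01-07 20:50:38 GMT [local]FATAL: the database system is in recovery mode
LINE_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} GMT (?:\[local\])?([A-Z]+):"
)


def count_severities(lines):
    """Count log lines per severity label; lines in another format are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Example call (the log file name is hypothetical):
#   with open("postgresql.log") as f:
#       print(count_severities(f))
```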
