Re: Corrupted disk

From: Kevin Grittner
Subject: Re: Corrupted disk
Msg-id: 4D5F8509020000250003AD64@gw.wicourts.gov
In reply to: Corrupted disk (Tony Nelson <tnelson@starpoint.com>)
List: pgsql-admin
Tony Nelson wrote:

> This morning my server (ubuntu 10.04 LTS, pg 8.4 from apt)
> experienced a disk corruption which caused errors like this:
>
> Feb 18 08:50:07 ihdb1 postgres[13317]: [2-1]
> user=postgres,db=instihire ERROR: invalid page header in block 10
> of relation base/16384/8082305

Do you know why?  If not, that machine is not to be trusted.

> I am ok, because I have a dump from last night, and wal since then.

I hope that "dump" is a base backup taken according to the PITR
instructions, not output from pg_dump or pg_dumpall.  WAL can only
be replayed on top of a file-system-level base backup.

> As a test, I tried doing a dump like this from the broken server:
>
> pg_dump -Fc mydb > busted.dump
>
> I restored this dump on a test server, and all of the data looks
> ok, and my application is able to run the operation that was
> failing in production just fine.
>
> My question is simple, is my dump good?

Maybe; maybe not.  I would restore from last night's backup plus
the WAL, run pg_dump on the restored database, and compare the two
files which came out of pg_dump.  Or just use last night's backup
if you can restore it and you're sure it's up to date.
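
The comparison could be sketched roughly as follows.  This is only an
illustration under assumptions not in the original message: both dumps
are taken in plain SQL format (pg_dump's default, rather than -Fc,
whose archives are compressed), both fit in memory, and the function
name dump_line_diff and the file names are hypothetical.  It ignores
line order, so rows that merely come out in a different order are not
reported as differences.

```python
from collections import Counter
from pathlib import Path


def dump_line_diff(path_a, path_b):
    """Compare two plain-text dumps line by line, ignoring line order.

    Returns (only_in_a, only_in_b): Counters holding the lines that
    appear more often in one file than in the other.
    """
    def load(path):
        # errors="replace" keeps the sketch from failing on odd bytes;
        # it reads the whole file, so it suits modest dump sizes only.
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        return Counter(text.splitlines())

    a, b = load(path_a), load(path_b)
    # Counter subtraction keeps only positive counts.
    return a - b, b - a


# Hypothetical usage:
#   only_a, only_b = dump_line_diff("busted.sql", "restored.sql")
#   if not only_a and not only_b:
#       print("dumps contain the same lines")
```

Two empty results suggest the dumps hold the same data; anything left
over points at the tables worth inspecting by hand.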

> Is it possible that pg_dump was able to correctly save the data
> that "select .. " and "update.. " couldn't.

pg_dump essentially just does a SELECT from each table.  If the
corrupted relation was an index, pg_dump might have dodged the
invalid page header error, since it reads table data and does not
dump index contents.  But if a page in that index got corrupted,
why assume damage was localized to that one place?
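
To find out whether the damaged file is a table or an index, the path
in the error message can be mapped back to a catalog entry.  A minimal
sketch, assuming the path has the form base/<database oid>/<relfilenode>
as in the log line quoted above; the helper name relation_lookup_sql is
hypothetical, and the query it builds would be run by hand in the
affected database:

```python
import re

# Matches the "base/<database oid>/<relfilenode>" path in the error text.
PATH_RE = re.compile(r"relation base/(\d+)/(\d+)")


def relation_lookup_sql(log_line):
    """Return (db_oid, relfilenode, sql) for an 'invalid page header' line.

    The SQL looks the file up in pg_class; relkind 'r' means an
    ordinary table and 'i' means an index.
    """
    m = PATH_RE.search(log_line)
    if m is None:
        raise ValueError("no base/<db oid>/<relfilenode> path found")
    db_oid, relfilenode = int(m.group(1)), int(m.group(2))
    sql = (
        "SELECT relname, relkind FROM pg_class "
        f"WHERE relfilenode = {relfilenode};"
    )
    return db_oid, relfilenode, sql


# Hypothetical usage with the message from the original report:
#   line = ("ERROR: invalid page header in block 10 "
#           "of relation base/16384/8082305")
#   db_oid, relfilenode, sql = relation_lookup_sql(line)
#   print(sql)
```

The first number is the database OID, which can be checked against
pg_database to confirm which database to connect to before running
the query.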

You really have two problems to address here: how best to recover
your data, and how to prevent a recurrence.

What version of PostgreSQL is this exactly?  (It's best to SELECT
version(); and copy/paste the results.)

-Kevin
