Thread: "failed to re-find parent key" question

"failed to re-find parent key" question

From
"Roger Pan"
Date:
Hi!
 
   We use PostgreSQL 8.1.2 on Solaris 9 to maintain very important business data. The database was interrupted and now fails to start:
 
> more postgresql-2007-03-05_210154.log
LOG:  could not bind socket for statistics collector: Cannot assign requested address
LOG:  database system was interrupted while in recovery at 2007-03-05 20:26:30 CST
HINT:  This probably means that some data is corrupted and you will have to use the last backup for recovery.
LOG:  checkpoint record is at 114/FDB86500
LOG:  redo record is at 114/FDB2B0F8; undo record is at 0/0; shutdown FALSE
LOG:  next transaction ID: 8817742; next OID: 106734149
LOG:  next MultiXactId: 60550; next MultiXactOffset: 14674685
LOG:  database system was not properly shut down; automatic recovery in progress
LOG:  redo starts at 114/FDB2B0F8
LOG:  record with zero length at 115/249891E8
LOG:  redo done at 115/249891B8
PANIC:  failed to re-find parent key in "1560660"
LOG:  startup process (PID 14266) was terminated by signal 6
LOG:  aborting startup due to startup process failure
LOG:  logger shutting down

 I know the "failed to re-find parent key" problem has been fixed in a newer release. My question is how we can recover the data in this case. The difficulty is that the disk holding the PostgreSQL data is full.

Many thanks!

Roger

Re: "failed to re-find parent key" question

From
Tom Lane
Date:
"Roger Pan" <roger.pan@nexustelecom.com> writes:
>    We use postgreSQL 8.1.2 in Solaris 9 platform  to maintain very
> important business data.

If it's as important as all that, you should make more of an effort to
keep up-to-date with PG minor releases...

> PANIC:  failed to re-find parent key in "1560660"
>  I know  the problem " failed to re-find parent key" has been fixed in
> the newer release. My question is how we can recover the data in this
> case? The difficult is the disk with postgres data system is full.

A quick and dirty solution would be to do pg_resetxlog, but the problem
is that it's difficult to predict how much corruption or data loss would
result.  If the data is really worth an effort to save, you might consider
making a hacked-up build in which this PANIC is reduced to a WARNING,
which you use just long enough to boot up and shut down.  I think it'd
work to change (in src/backend/access/nbtree/nbtinsert.c)

        /* Check for error only after writing children */
        if (pbuf == InvalidBuffer)
            elog(ERROR, "failed to re-find parent key in \"%s\"",
                 RelationGetRelationName(rel));

        /* Recursively update the parent */
        _bt_insertonpg(rel, pbuf, stack->bts_parent,
                       0, NULL, new_item, stack->bts_offset,
                       is_only);

to

        /* Check for error only after writing children */
        if (pbuf == InvalidBuffer)
            elog(WARNING, "failed to re-find parent key in \"%s\"",
                 RelationGetRelationName(rel));
        else
            /* Recursively update the parent */
            _bt_insertonpg(rel, pbuf, stack->bts_parent,
                           0, NULL, new_item, stack->bts_offset,
                           is_only);

After that, reboot into a standard postmaster, and reindex
the index(es) identified by the warning messages.
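
The added "else" in the change above matters because elog(ERROR) never returns to its caller, while elog(WARNING) logs the message and returns. Without the "else", execution would fall through and pass InvalidBuffer to the recursive insert. The following is only a minimal stand-alone sketch of that control flow, not PostgreSQL code: log_warning(), update_parent(), insert_parent() and the two counters are hypothetical stand-ins, and Buffer/InvalidBuffer are redefined locally for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Local stand-ins for illustration only; not the PostgreSQL definitions. */
typedef int Buffer;
#define InvalidBuffer 0

static int warnings_logged = 0;   /* how many warnings were emitted */
static int parent_updates = 0;    /* how many parent updates were attempted */

/* Behaves like elog(WARNING, ...): logs the message and returns. */
static void log_warning(const char *relname)
{
    fprintf(stderr, "WARNING:  failed to re-find parent key in \"%s\"\n",
            relname);
    warnings_logged++;
}

/* Stand-in for the recursive parent insert; it must never get InvalidBuffer. */
static void update_parent(Buffer pbuf)
{
    assert(pbuf != InvalidBuffer);
    parent_updates++;
}

/* Same shape as the modified code: warn and skip, otherwise update. */
static void insert_parent(Buffer pbuf, const char *relname)
{
    if (pbuf == InvalidBuffer)
        log_warning(relname);
    else
        update_parent(pbuf);
}
```

Calling insert_parent(InvalidBuffer, "1560660") here logs one warning and skips the parent update. Dropping the "else" would trip the assertion in update_parent() instead.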

After that, think about an update ;-)

            regards, tom lane

PS: if you try this, I'd *strongly* suggest first making a
filesystem-level backup of all of the $PGDATA tree, so that you aren't
any worse off if it doesn't work.