On Wed, Jun 25, 2008 at 2:58 AM, Heikki Linnakangas <
heikki@enterprisedb.com> wrote:
Andrew Hammond wrote:
I found this error message in my log files repeatedly:
Error: failed to re-find parent key in "ledgerdetail_2008_03_idx2" for
deletion target page 64767
I though "hmm, that index looks broken. I'd better re-create it." So, I
dropped the index and then tried to create a new one to replace it. Which
completely locked up the backend that was running the CREATE TABLE. I ran
truss against the backend in question and it didn't register anything
(except signals 2 and 15 when I tried to cancel the query and kill the
backend respectively). I eventually had to restart the database to get the
CREATE INDEX process to go away (well, to release that big nasty lock).
What kind of an index is it? Does "SELECT COUNT(*) from <table>" work?
After the restart I did a count(*) and it worked. A little under 13m rows. So, sequential scans seem to work.
posting here in case there's interest in gathering some forensic data or a
clever suggetion about how I can recover this situation or even some ideas
about what's causing it.
Anyway, the current plan is to drop the table and reload it from backup. I'm
Yes, please take a filesystem-level backup right away to retain the evidence.
Well, I've already burned our downtime allowance for this month, but we do a regular PITR type backup which hopefully will be sufficient to replicate the problem.
Could you connect to the hung backend with gdb and get a stacktrace?
The backend is no longer hung (two restarts later). I'll try to reproduce this problem on my workstation (same binary, same OS, libraries etc) using the PITR dump.
Andrew