Re: BUG #10432: failed to re-find parent key in index
From | Heikki Linnakangas |
---|---|
Subject | Re: BUG #10432: failed to re-find parent key in index |
Date | |
Msg-id | 538F0BBB.5000308@vmware.com |
In response to | Re: BUG #10432: failed to re-find parent key in index (Greg Stark <stark@mit.edu>) |
List | pgsql-bugs |
On 06/04/2014 02:14 PM, Greg Stark wrote:
> Ok, I made some progress. It turns out this was a pre-existing problem
> in the master. They've been getting "failed to re-find parent" errors
> for weeks. Far longer than I have any WAL or backups for.

Bummer. You don't have logs reaching back to when the first error happened either, I presume. The most likely cause for originally failing to insert a downlink is running out of disk space when trying to split the parent page. (Even that's pretty unlikely, though.)

> What I did find that was interesting is that this error basically made
> the backups worthless. I could build a hot standby and connect and
> query it. But as soon as recovery finished it would try to clean up
> the incomplete split and fail. Because it had noticed the incomplete
> split it had skipped every restartpoint and the next time I tried to
> start it it insisted on restarting recovery from the beginning. If we
> had been lucky enough not to do any page splits in the broken index
> while the backup was being taken all would have been fine. But that
> doesn't seem to have happened so all the backups were unrecoverable.

It's only *incomplete* splits that happen while the backup is taken that matter. i.e. a page is split but inserting the downlink to the parent fails for some reason. If a page is split and the downlink is inserted successfully, that's OK.

> So a few thoughts on how to improve things:
>
> 1) Failed to re-find parent should perhaps not be FATAL to recovery.
> In fact any index replay error would really be nice not to have to
> crash on. All crashing does is prevent the user from being able to
> bring up their database and REINDEX the btree. This may be another use
> case for the machinery that would protect against corrupt hash indexes
> or user-defined indexes -- if we could mark the index invalid and
> proceed (perhaps ignoring subsequent records for it) that would be
> great.

Yeah, that would be nice as an option.
> 2) When we see an abort record we could check for any cleanup actions
> triggered by that transaction and run them right away. I think the
> checkpoints (and maybe hot standby snapshots or vacuum cleanup
> records?) also include information about the oldest xid running, they
> would also let us prune the cleanup actions sooner. That would at
> least find the error sooner. In conjunction with (1) it would also
> mean subsequent restartpoints would be effective instead of
> suppressing restartpoints right to the end of recovery.

You can't do any cleanup actions until you've recovered all the WAL. A cleanup action means inserting a tuple to the parent page, and you can't do that in the middle of recovery. But we could detect the case and give a warning.

> 3) The lack of logs around an error during recovery makes it hard to
> decipher what's going on. It would be nice to see "Beginning Xlog
> cleanup (1 incomplete splits to replay)" and when it crashed "Last
> safe point to restart recovery is 324/ABCDEF". As it was it was a
> pretty big mystery why the database crashed, the logs made it appear
> as if it had started up fine. And it was unclear why restarting it
> caused it to replay from the beginning, I thought maybe something was
> wrong with our scripts

Yeah. All of this is gone in 9.4 anyway, but if you can come up with something non-invasive that can be back-patched...

- Heikki
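The restartpoint behaviour discussed in this thread can be illustrated with a toy model. The Python sketch below is not PostgreSQL code: `RecoveryState`, `replay`, `run`, and the record tuples are hypothetical names made up for illustration. It assumes, as the messages above describe for pre-9.4 releases, that replay remembers each page split until the matching downlink insertion is seen, skips restartpoints while any split is still pending, and leaves whatever remains for cleanup at the end of recovery.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveryState:
    """Toy model of incomplete-split tracking during WAL replay (illustrative only)."""

    # Pages that were split but whose parent downlink has not been replayed yet.
    pending_splits: set = field(default_factory=set)
    # Position of the last restartpoint that was actually taken (0 = none).
    last_restartpoint: int = 0

    def replay(self, lsn, record):
        kind, page = record
        if kind == "split":
            # Remember the split until its downlink shows up.
            self.pending_splits.add(page)
        elif kind == "insert_downlink":
            # The split is now complete; nothing left to clean up for it.
            self.pending_splits.discard(page)
        elif kind == "checkpoint":
            # A restartpoint is only taken when no split is waiting for a downlink.
            if not self.pending_splits:
                self.last_restartpoint = lsn

    def finish(self):
        """End of recovery: the splits still pending are the ones needing cleanup."""
        return sorted(self.pending_splits)


def run(records):
    """Replay a list of (kind, page) records; positions are numbered from 1."""
    state = RecoveryState()
    for lsn, record in enumerate(records, start=1):
        state.replay(lsn, record)
    return state.last_restartpoint, state.finish()


# Every split gets its downlink: restartpoints keep advancing.
healthy = [
    ("split", 7), ("insert_downlink", 7), ("checkpoint", None),
    ("split", 9), ("insert_downlink", 9), ("checkpoint", None),
]

# The split of page 7 never gets its downlink: later restartpoints are skipped.
broken = [
    ("checkpoint", None), ("split", 7), ("checkpoint", None),
    ("split", 9), ("insert_downlink", 9), ("checkpoint", None),
]
```

In this model `run(healthy)` returns `(6, [])`, while `run(broken)` returns `(1, [7])`: only the checkpoint before the incomplete split produced a restartpoint, and page 7 is left for end-of-recovery cleanup, which is the step that failed in the report above.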