post-recovery amcheck expectations

From Noah Misch
Subject post-recovery amcheck expectations
Date
Msg-id 20231005025232.c7.nmisch@google.com
Responses Re: post-recovery amcheck expectations  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Suppose we start with this nbtree (subset of a diagram from verify_nbtree.c):

 *               1
 *           /       \
 *        2     <->     3

We're deleting 2, the leftmost leaf under a leftmost internal page.  After the
MARK_PAGE_HALFDEAD record, the first downlink from 1 will lead to 3, which
still has a btpo_prev pointing to 2.  bt_index_parent_check() complains here:

        /* The first page we visit at the level should be leftmost */
        if (first && !BlockNumberIsValid(state->prevrightlink) && !P_LEFTMOST(opaque))
            ereport(ERROR,
                    (errcode(ERRCODE_INDEX_CORRUPTED),
                     errmsg("the first child of leftmost target page is not leftmost of its level in index \"%s\"",
                            RelationGetRelationName(state->rel)),
                     errdetail_internal("Target block=%u child block=%u target page lsn=%X/%X.",
                                        state->targetblock, blkno,
                                        LSN_FORMAT_ARGS(state->targetlsn))));

One can encounter this if recovery ends between a MARK_PAGE_HALFDEAD record
and its corresponding UNLINK_PAGE record.  See the attached test case.  The
index is actually fine in such a state, right?  I lean toward fixing this by
having amcheck scan left; if left links reach only half-dead or deleted pages,
that's as good as the present child block being P_LEFTMOST.  There's a
different error from bt_index_check(), and I've not yet studied how to fix
that:

  ERROR:  left link/right link pair in index "not_leftmost_pk" not in agreement
  DETAIL:  Block=0 left block=0 left link from block=4294967295.
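
To make the "scan left" idea for bt_index_parent_check() concrete, here is a
minimal, self-contained sketch over a toy in-memory model.  It is not amcheck
code: ToyPage, the TOY_* flags, and toy_effectively_leftmost() are
hypothetical names invented for illustration, and a real fix would have to
read pages through the buffer manager and use the actual nbtree opaque data.
The only property carried over from nbtree is that block 0 stands for "no
sibling", as with P_NONE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for illustration only; NOT the real nbtree definitions. */
#define TOY_P_NONE      0u          /* "no left sibling", like P_NONE */
#define TOY_HALF_DEAD   (1u << 0)   /* page was marked half-dead */
#define TOY_DELETED     (1u << 1)   /* page was fully deleted */

typedef struct ToyPage
{
    uint32_t    prev;   /* left sibling block, or TOY_P_NONE */
    uint32_t    flags;  /* TOY_HALF_DEAD and/or TOY_DELETED, or 0 if live */
} ToyPage;

/*
 * Follow left links starting from blkno.  Return true if the chain reaches
 * TOY_P_NONE while passing only half-dead or deleted pages, i.e. blkno is
 * "as good as leftmost".  Return false on a live left sibling, an
 * out-of-range link, or a cycle.
 */
static bool
toy_effectively_leftmost(const ToyPage *pages, uint32_t npages, uint32_t blkno)
{
    uint32_t    cur;
    uint32_t    steps = 0;

    assert(blkno < npages);

    cur = pages[blkno].prev;
    while (cur != TOY_P_NONE)
    {
        /* Guard against dangling links and loops in a damaged chain. */
        if (cur >= npages || ++steps > npages)
            return false;

        /* A live page to the left means blkno really is not leftmost. */
        if ((pages[cur].flags & (TOY_HALF_DEAD | TOY_DELETED)) == 0)
            return false;

        cur = pages[cur].prev;
    }

    return true;
}
```

In the three-page example above, page 3 still has a left link to page 2; the
sketch accepts page 3 as leftmost once page 2 carries the half-dead flag, and
rejects it while page 2 is still live.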

Alternatively, one could view this as a need for the user to VACUUM between
recovery and amcheck.  The documentation could direct users to "VACUUM
(DISABLE_PAGE_SKIPPING off, INDEX_CLEANUP on, TRUNCATE off)" if not done since
last recovery.  Does anyone prefer that or some other alternative?

For some other amcheck expectations, the comments suggest reliance on the
bt_index_parent_check() ShareLock.  I haven't tried to make test cases for
them, but perhaps recovery can trick them the same way.  Examples:

  errmsg("downlink or sibling link points to deleted block in index \"%s\"",
  errmsg("block %u is not leftmost in index \"%s\"",
  errmsg("block %u is not true root in index \"%s\"",

Thanks,
nm
