Re: BUG #17245: Index corruption involving deduplicated entries

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: BUG #17245: Index corruption involving deduplicated entries
Дата
Msg-id 20211104220722.iyvpoyacqzvm2z5v@alap3.anarazel.de
обсуждение исходный текст
Ответ на Re: BUG #17245: Index corruption involving deduplicated entries  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: BUG #17245: Index corruption involving deduplicated entries  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-bugs
Hi,

On 2021-11-02 19:03:05 -0700, Peter Geoghegan wrote:
> Attached patch adds ERRORs in the event of detecting
> index-tuple-TID-points-to-LP_UNUSED conditions, as well as other
> similar conditions -- not just assertions, as before.
> 
> I do think that this would be a good idea on HEAD. Still have
> reservations about doing that for 14, but you're welcome to try and
> change my mind.

I'm not quite sure either - I'm worried about followon corruption based on the
bug. That'll be much more likely be detected with something like this in
place. But it's also easy to get some subtlety wrong...



> +static inline void
> +index_delete_check_htid(TM_IndexDeleteOp *delstate,
> +                        Page page, OffsetNumber maxoff,
> +                        ItemPointer htid, TM_IndexStatus *istatus)
> +{
> +    OffsetNumber indexpagehoffnum = ItemPointerGetOffsetNumber(htid);
> +    ItemId        iid;
> +    HeapTupleHeader htup;
> +
> +    Assert(OffsetNumberIsValid(istatus->idxoffnum));
> +
> +    if (indexpagehoffnum > maxoff)
> +        ereport(ERROR,
> +                (errcode(ERRCODE_INDEX_CORRUPTED),
> +                 errmsg_internal("heap tid from index tuple (%u,%u) points past end of heap page line pointer array
atoffset %u of block %u in index \"%s\"",
 
> +                                 ItemPointerGetBlockNumber(htid),
> +                                 indexpagehoffnum,
> +                                 istatus->idxoffnum, delstate->iblknum,
> +                                 RelationGetRelationName(delstate->irel))));
> +    iid = PageGetItemId(page, indexpagehoffnum);
> +    if (!ItemIdIsUsed(iid))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_INDEX_CORRUPTED),
> +                 errmsg_internal("heap tid from index tuple (%u,%u) points to unused heap page item at offset %u of
block%u in index \"%s\"",
 
> +                                 ItemPointerGetBlockNumber(htid),
> +                                 indexpagehoffnum,
> +                                 istatus->idxoffnum, delstate->iblknum,
> +                                 RelationGetRelationName(delstate->irel))));
> +
> +    if (ItemIdHasStorage(iid))
> +    {
> +        Assert(ItemIdIsNormal(iid));
> +        htup = (HeapTupleHeader) PageGetItem(page, iid);
> +
> +        if (HeapTupleHeaderIsHeapOnly(htup))
> +            ereport(ERROR,
> +                    (errcode(ERRCODE_INDEX_CORRUPTED),
> +                     errmsg_internal("heap tid from index tuple (%u,%u) points to heap-only tuple at offset %u of
block%u in index \"%s\"",
 
> +                                     ItemPointerGetBlockNumber(htid),
> +                                     indexpagehoffnum,
> +                                     istatus->idxoffnum, delstate->iblknum,
> +                                     RelationGetRelationName(delstate->irel))));
> +    }

I'd make the error paths unlikely().




> diff --git a/src/backend/access/heap/pruneheap.c b/src/backend/access/heap/pruneheap.c
> index db6912e9f..43549be04 100644
> --- a/src/backend/access/heap/pruneheap.c
> +++ b/src/backend/access/heap/pruneheap.c
> @@ -844,39 +844,115 @@ heap_page_prune_execute(Buffer buffer,
>  {
>      Page        page = (Page) BufferGetPage(buffer);
>      OffsetNumber *offnum;
> -    int            i;
> +    HeapTupleHeader htup PG_USED_FOR_ASSERTS_ONLY;
>  
>      /* Shouldn't be called unless there's something to do */
>      Assert(nredirected > 0 || ndead > 0 || nunused > 0);
>  
>      /* Update all redirected line pointers */
>      offnum = redirected;
> -    for (i = 0; i < nredirected; i++)
> +    for (int i = 0; i < nredirected; i++)
>      {
>          OffsetNumber fromoff = *offnum++;
>          OffsetNumber tooff = *offnum++;
>          ItemId        fromlp = PageGetItemId(page, fromoff);
> +        ItemId        tolp PG_USED_FOR_ASSERTS_ONLY;
> +
> +#ifdef USE_ASSERT_CHECKING

These I'd definitely only do in HEAD.


> +        /*
> +         * The existing items that we set as redirects must be the first tuple
> +         * of a HOT chain that has not be pruned before now (can't be a
> +         * heap-only tuple) when target item has tuple storage.  When the item
> +         * has no storage, then we must just be maintaining the LP_REDIRECT of
> +         * a HOT chain that has been pruned at least once before now.
> +         */
> +        if (!ItemIdIsRedirected(fromlp))
> +        {
> +            Assert(ItemIdHasStorage(fromlp) && ItemIdIsNormal(fromlp));
> +
> +            htup = (HeapTupleHeader) PageGetItem(page, fromlp);
> +            Assert(!HeapTupleHeaderIsHeapOnly(htup));
> +        }

I was a bit confused because I initially read this this as the first tuple
redirect *to* can't be a heap-only tuple. Perhaps this could be rephrased a
bit ("redirecting item"?).


Greetings,

Andres Freund



В списке pgsql-bugs по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: BUG #17268: Possible corruption in toast index after reindex index concurrently
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: BUG #17245: Index corruption involving deduplicated entries