Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM

From Tom Lane
Subject Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Msg-id 11584.1509402845@sss.pgh.pa.us
In response to Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List pgsql-hackers
I wrote:
> Hmm.  The index offnum being complained of is one past the end of the
> lp array.  I think I see what about that commit changed the behavior:
> the old code for PageIndexTupleDeleteNoCompact never changed the length
> of the lp array, except in the corner case where the page is becoming
> completely empty.  The new code will shorten the lp array (decrease
> phdr->pd_lower) if it's told to remove the last item.  So if you make
> two successive calls specifying the same offnum, and it's the last one
> on the page, the second one will fail with the symptoms we see here.
> However, so far as I can see, a sequence like that would have failed
> before too, just with a different error message, because once the
> first call had marked the item unused, the second call would not see
> it as a candidate to match.  So I'm not quite sure how that's related
> ... but it seems like it must be.

I'm still confused about why it didn't fail before, but after adding
some additional code to log each call of PageIndexTupleDeleteNoCompact,
I think I've got a smoking gun:

2017-10-30 18:18:44.321 EDT [10932] LOG:  deleting tuple 292 (of 292) in rel brin_test_c_idx page 2
2017-10-30 18:18:44.321 EDT [10932] STATEMENT:  vacuum brin_test
2017-10-30 18:18:44.393 EDT [10932] LOG:  deleting tuple 292 (of 292) in rel brin_test_d_idx page 2
2017-10-30 18:18:44.393 EDT [10932] STATEMENT:  vacuum brin_test
2017-10-30 18:18:53.428 EDT [10932] LOG:  deleting tuple 186 (of 186) in rel brin_test_e_idx page 3
2017-10-30 18:18:53.428 EDT [10932] STATEMENT:  vacuum brin_test
2017-10-30 18:19:13.794 EDT [10938] LOG:  deleting tuple 186 (of 186) in rel brin_test_e_idx page 4
2017-10-30 18:19:13.794 EDT [10938] STATEMENT:  insert into brin_test select
             mod(i,100000)/25,
             mod(i,100000)/25,
             mod(i,100000)/25,
             mod(i,100000)/25,
             md5((mod(i,100000)/25)::text)::uuid
    from generate_series(1,100000) s(i)
2017-10-30 18:19:13.795 EDT [10932] LOG:  deleting tuple 186 (of 185) in rel brin_test_e_idx page 4
2017-10-30 18:19:13.795 EDT [10932] STATEMENT:  vacuum brin_test
2017-10-30 18:19:13.795 EDT [10932] PANIC:  invalid index offnum: 186
2017-10-30 18:19:13.795 EDT [10932] STATEMENT:  vacuum brin_test

So what happened here is that the inserting process decided to
summarize concurrently with the VACUUM process, and the inserting
process deleted (or maybe just updated/moved?) the placeholder tuple
that VACUUM was expecting to delete, and then we get the PANIC because
the tuple we're expecting to delete is already gone.
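
For illustration only, the interleaving in the log can be reproduced with a toy model of the line pointer array. This is a sketch under stated assumptions, not PostgreSQL code: ToyPage, toy_page_init and toy_delete_no_compact are hypothetical names, and the only behavior modeled is the one described above, namely that deleting the last item shortens the array while deleting any other item just marks it unused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_MAX_ITEMS 512

/* Hypothetical stand-in for a page's line pointer array (not PageHeader). */
typedef struct ToyPage
{
    int     nline;                      /* current length of the lp array */
    bool    used[TOY_MAX_ITEMS + 1];    /* 1-based: used[i] is true if item i holds a tuple */
} ToyPage;

static void
toy_page_init(ToyPage *page, int nitems)
{
    assert(nitems >= 0 && nitems <= TOY_MAX_ITEMS);
    page->nline = nitems;
    for (int i = 0; i <= TOY_MAX_ITEMS; i++)
        page->used[i] = (i >= 1 && i <= nitems);
}

/*
 * Toy version of a no-compact delete.  Returns false in the case where the
 * real function would report "invalid index offnum".
 */
static bool
toy_delete_no_compact(ToyPage *page, int offnum)
{
    if (offnum <= 0 || offnum > page->nline)
    {
        fprintf(stderr, "invalid index offnum: %d (of %d)\n",
                offnum, page->nline);
        return false;
    }

    page->used[offnum] = false;

    /* Removing the last item shortens the array by one entry. */
    if (offnum == page->nline)
        page->nline--;

    return true;
}
```

With a page initialized to 186 items, the first toy_delete_no_compact(&page, 186) succeeds and leaves nline at 185, and a second call with the same offset fails. That corresponds to the "186 (of 186)" entry from the inserting backend followed by the "186 (of 185)" entry from VACUUM.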

So: I put the blame on the fact that summarize_range() thinks that
the tuple offset it has for the placeholder tuple is guaranteed to
hold good, even across possibly-long intervals where it's holding
no lock on the containing buffer.
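
One general way to guard against a stale offset is to re-check, after re-acquiring the lock, that the slot still holds the tuple the caller remembers. The sketch below shows only that idea on a toy structure. It is not the BRIN fix and makes no claim about how summarize_range() should be changed; ToyIndexPage, toy_add, toy_offset_still_valid and toy_delete_if_unchanged are hypothetical names, and tagging each slot with a heap block number is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_ITEMS 512

/* Hypothetical slot: "used" flag plus the heap block the tuple summarizes. */
typedef struct ToySlot
{
    bool        used;
    uint32_t    heap_blk;
} ToySlot;

typedef struct ToyIndexPage
{
    int         nline;                      /* current length of the slot array */
    ToySlot     slots[TOY_MAX_ITEMS + 1];   /* 1-based */
} ToyIndexPage;

/* Append a tuple for heap_blk at the end of the array; returns its offset. */
static int
toy_add(ToyIndexPage *page, uint32_t heap_blk)
{
    assert(page->nline < TOY_MAX_ITEMS);
    page->nline++;
    page->slots[page->nline].used = true;
    page->slots[page->nline].heap_blk = heap_blk;
    return page->nline;
}

/*
 * Check a remembered offset.  In a real system this must run while the
 * caller holds the lock on the page; the toy model has no locking.
 */
static bool
toy_offset_still_valid(const ToyIndexPage *page, int offnum, uint32_t heap_blk)
{
    if (offnum <= 0 || offnum > page->nline)
        return false;           /* array was shortened underneath us */
    if (!page->slots[offnum].used)
        return false;           /* item was deleted underneath us */
    return page->slots[offnum].heap_blk == heap_blk;   /* slot may have been reused */
}

/* Delete only if the slot is unchanged; otherwise tell the caller to look it up again. */
static bool
toy_delete_if_unchanged(ToyIndexPage *page, int offnum, uint32_t heap_blk)
{
    if (!toy_offset_still_valid(page, offnum, heap_blk))
        return false;

    page->slots[offnum].used = false;
    if (offnum == page->nline)
        page->nline--;
    return true;
}
```

A caller that gets false back has to find the tuple again rather than trust its cached offset. Doing that correctly in the BRIN code is the part described as possibly nontrivial.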

Fixing this without creating new problems is beyond my familiarity
with the BRIN code.  It looks like it might be nontrivial :-(

            regards, tom lane

diff --git a/src/backend/access/brin/brin_pageops.c b/src/backend/access/brin/brin_pageops.c
index 80f803e..04ad804 100644
*** a/src/backend/access/brin/brin_pageops.c
--- b/src/backend/access/brin/brin_pageops.c
*************** brin_doupdate(Relation idxrel, BlockNumb
*** 243,249 ****
          if (extended)
              brin_page_init(BufferGetPage(newbuf), BRIN_PAGETYPE_REGULAR);

!         PageIndexTupleDeleteNoCompact(oldpage, oldoff);
          newoff = PageAddItem(newpage, (Item) newtup, newsz,
                               InvalidOffsetNumber, false, false);
          if (newoff == InvalidOffsetNumber)
--- 243,249 ----
          if (extended)
              brin_page_init(BufferGetPage(newbuf), BRIN_PAGETYPE_REGULAR);

!         PageIndexTupleDeleteNoCompact(idxrel, oldbuf, oldpage, oldoff);
          newoff = PageAddItem(newpage, (Item) newtup, newsz,
                               InvalidOffsetNumber, false, false);
          if (newoff == InvalidOffsetNumber)
diff --git a/src/backend/access/brin/brin_revmap.c b/src/backend/access/brin/brin_revmap.c
index 22f2076..4d5dad3 100644
*** a/src/backend/access/brin/brin_revmap.c
--- b/src/backend/access/brin/brin_revmap.c
*************** brinRevmapDesummarizeRange(Relation idxr
*** 409,415 ****
      ItemPointerSetInvalid(&invalidIptr);
      brinSetHeapBlockItemptr(revmapBuf, revmap->rm_pagesPerRange, heapBlk,
                              invalidIptr);
!     PageIndexTupleDeleteNoCompact(regPg, regOffset);
      /* XXX record free space in FSM? */

      MarkBufferDirty(regBuf);
--- 409,415 ----
      ItemPointerSetInvalid(&invalidIptr);
      brinSetHeapBlockItemptr(revmapBuf, revmap->rm_pagesPerRange, heapBlk,
                              invalidIptr);
!     PageIndexTupleDeleteNoCompact(idxrel, regBuf, regPg, regOffset);
      /* XXX record free space in FSM? */

      MarkBufferDirty(regBuf);
diff --git a/src/backend/access/brin/brin_xlog.c b/src/backend/access/brin/brin_xlog.c
index 60daa54..c8cc9f8 100644
*** a/src/backend/access/brin/brin_xlog.c
--- b/src/backend/access/brin/brin_xlog.c
*************** brin_xlog_update(XLogReaderState *record
*** 150,156 ****

          offnum = xlrec->oldOffnum;

!         PageIndexTupleDeleteNoCompact(page, offnum);

          PageSetLSN(page, lsn);
          MarkBufferDirty(buffer);
--- 150,156 ----

          offnum = xlrec->oldOffnum;

!         PageIndexTupleDeleteNoCompact(NULL, buffer, page, offnum);

          PageSetLSN(page, lsn);
          MarkBufferDirty(buffer);
*************** brin_xlog_desummarize_page(XLogReaderSta
*** 285,291 ****
      {
          Page        regPg = BufferGetPage(buffer);

!         PageIndexTupleDeleteNoCompact(regPg, xlrec->regOffset);

          PageSetLSN(regPg, lsn);
          MarkBufferDirty(buffer);
--- 285,291 ----
      {
          Page        regPg = BufferGetPage(buffer);

!         PageIndexTupleDeleteNoCompact(NULL, buffer, regPg, xlrec->regOffset);

          PageSetLSN(regPg, lsn);
          MarkBufferDirty(buffer);
diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 41642eb..20c0ada 100644
*** a/src/backend/storage/page/bufpage.c
--- b/src/backend/storage/page/bufpage.c
***************
*** 14,25 ****
--- 14,29 ----
   */
  #include "postgres.h"

+ #include <unistd.h>
+
  #include "access/htup_details.h"
  #include "access/itup.h"
  #include "access/xlog.h"
+ #include "storage/bufmgr.h"
  #include "storage/checksum.h"
  #include "utils/memdebug.h"
  #include "utils/memutils.h"
+ #include "utils/rel.h"


  /* GUC variable */
*************** PageIndexMultiDelete(Page page, OffsetNu
*** 955,961 ****
   * remain unchanged, and are willing to allow unused line pointers instead.
   */
  void
! PageIndexTupleDeleteNoCompact(Page page, OffsetNumber offnum)
  {
      PageHeader    phdr = (PageHeader) page;
      char       *addr;
--- 959,966 ----
   * remain unchanged, and are willing to allow unused line pointers instead.
   */
  void
! PageIndexTupleDeleteNoCompact(Relation rel, Buffer buf,
!                               Page page, OffsetNumber offnum)
  {
      PageHeader    phdr = (PageHeader) page;
      char       *addr;
*************** PageIndexTupleDeleteNoCompact(Page page,
*** 978,983 ****
--- 983,996 ----
                          phdr->pd_lower, phdr->pd_upper, phdr->pd_special)));

      nline = PageGetMaxOffsetNumber(page);
+
+     Assert(page == BufferGetPage(buf));
+
+     elog(LOG, "deleting tuple %d (of %d) in rel %s page %u",
+          offnum, nline,
+          rel ? RelationGetRelationName(rel) : "(unknown)",
+          BufferGetBlockNumber(buf));
+
      if ((int) offnum <= 0 || (int) offnum > nline)
          elog(ERROR, "invalid index offnum: %u", offnum);

diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
index 50c72a3..656be22 100644
*** a/src/include/storage/bufpage.h
--- b/src/include/storage/bufpage.h
***************
*** 16,23 ****
--- 16,25 ----

  #include "access/xlogdefs.h"
  #include "storage/block.h"
+ #include "storage/buf.h"
  #include "storage/item.h"
  #include "storage/off.h"
+ #include "utils/relcache.h"

  /*
   * A postgres disk page is an abstraction layered on top of a postgres
*************** extern Size PageGetExactFreeSpace(Page p
*** 429,435 ****
  extern Size PageGetHeapFreeSpace(Page page);
  extern void PageIndexTupleDelete(Page page, OffsetNumber offset);
  extern void PageIndexMultiDelete(Page page, OffsetNumber *itemnos, int nitems);
! extern void PageIndexTupleDeleteNoCompact(Page page, OffsetNumber offset);
  extern bool PageIndexTupleOverwrite(Page page, OffsetNumber offnum,
                          Item newtup, Size newsize);
  extern char *PageSetChecksumCopy(Page page, BlockNumber blkno);
--- 431,438 ----
  extern Size PageGetHeapFreeSpace(Page page);
  extern void PageIndexTupleDelete(Page page, OffsetNumber offset);
  extern void PageIndexMultiDelete(Page page, OffsetNumber *itemnos, int nitems);
! extern void PageIndexTupleDeleteNoCompact(Relation rel, Buffer buf,
!                                           Page page, OffsetNumber offset);
  extern bool PageIndexTupleOverwrite(Page page, OffsetNumber offnum,
                          Item newtup, Size newsize);
  extern char *PageSetChecksumCopy(Page page, BlockNumber blkno);

