Обсуждение: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

Поиск
Список
Период
Сортировка

13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Justin Pryzby
Дата:
An internal server aborted last night while running a maintenance script.  I
reproduced this easily running the crashing command in a loop, and verified
this is a live issue on REL_13_STABLE (dc6f2fb43).
REINDEX INDEX pg_class_tblspc_relfilenode_index

It looks like this crashed once before, and I didn't notice until now:
Fri Jun 26 22:30:29 CDT 2020: pg_shdescription: pg_toast.pg_toast_2396_index(reindex toast)...
psql: error: could not connect to server: server closed the connection unexpectedly

#3  0x0000000000afde98 in comparetup_index_btree (a=0x201fa58, b=0x201fa10, state=0x20147b0) at tuplesort.c:4251
4251            Assert(false);
(gdb) l
4246                    if (pos1 != pos2)
4247                            return (pos1 < pos2) ? -1 : 1;
4248            }
4249
4250            /* ItemPointer values should never be equal */
4251            Assert(false);
4252
4253            return 0;
4254    }
4255

#3  0x0000000000afde98 in comparetup_index_btree (a=0x201fa58, b=0x201fa10, state=0x20147b0) at tuplesort.c:4251
        sortKey = 0x2014d60
        tuple1 = 0x20189d8
        tuple2 = 0x2018898
        keysz = 2
        tupDes = 0x7f48996b3790
        equal_hasnull = false
        nkey = 3
        compare = 0
        datum1 = 67999603
        datum2 = 67999603
        isnull1 = false
        isnull2 = false
        __func__ = "comparetup_index_btree"

(gdb) p *tuple1
$2 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 43}, t_info = 16}
(gdb) p *tuple2
$3 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 43}, t_info = 16}

(gdb) p *sortKey
$5 = {ssup_cxt = 0x40, ssup_collation = 0, ssup_reverse = false, ssup_nulls_first = false, ssup_attno = 0, ssup_extra =
0x0,comparator = 0x7f7f7f7f7f7f7f7f, abbreviate = 127, abbrev_converter = 0x7f7f7f7f7f7f7f7f, 
 
  abbrev_abort = 0x7f7f7f7f7f7f7f7f, abbrev_full_comparator = 0x7f7f7f7f7f7f7f7f}

(gdb) p *tupDes
$6 = {natts = 2, tdtypeid = 0, tdtypmod = -1, tdrefcount = 1, constr = 0x0, attrs = 0x7f48996b37a8}

Core was generated by `postgres: postgres sentinel [local] REINDEX            '.

(gdb) bt
#0  0x00007f489853d1f7 in raise () from /lib64/libc.so.6
#1  0x00007f489853e8e8 in abort () from /lib64/libc.so.6
#2  0x0000000000aafff7 in ExceptionalCondition (conditionName=0xccd0dc "false", errorType=0xccc327 "FailedAssertion",
fileName=0xccc2fd"tuplesort.c", lineNumber=4251) at assert.c:67
 
#3  0x0000000000afde98 in comparetup_index_btree (a=0x201fa58, b=0x201fa10, state=0x20147b0) at tuplesort.c:4251
#4  0x0000000000af1d5e in qsort_tuple (a=0x201fa10, n=18, cmp_tuple=0xafcf21 <comparetup_index_btree>, state=0x20147b0)
atqsort_tuple.c:140
 
#5  0x0000000000af2104 in qsort_tuple (a=0x201f710, n=50, cmp_tuple=0xafcf21 <comparetup_index_btree>, state=0x20147b0)
atqsort_tuple.c:191
 
#6  0x0000000000af2104 in qsort_tuple (a=0x201cc38, n=544, cmp_tuple=0xafcf21 <comparetup_index_btree>,
state=0x20147b0)at qsort_tuple.c:191
 
#7  0x0000000000af8056 in tuplesort_sort_memtuples (state=0x20147b0) at tuplesort.c:3490
#8  0x0000000000af51a9 in tuplesort_performsort (state=0x20147b0) at tuplesort.c:1985
#9  0x0000000000529418 in _bt_leafbuild (btspool=0x1f784e0, btspool2=0x0) at nbtsort.c:553
#10 0x0000000000528f9c in btbuild (heap=0x1fb5758, index=0x7f48996b3460, indexInfo=0x1f77a48) at nbtsort.c:333
#11 0x00000000005adcb3 in index_build (heapRelation=0x1fb5758, indexRelation=0x7f48996b3460, indexInfo=0x1f77a48,
isreindex=true,parallel=true) at index.c:2903
 
#12 0x00000000005aec6b in reindex_index (indexId=3455, skip_constraint_checks=false, persistence=112 'p', options=2) at
index.c:3539
#13 0x0000000000692583 in ReindexIndex (indexRelation=0x1f54840, options=0, concurrent=false) at indexcmds.c:2466
#14 0x0000000000932e36 in standard_ProcessUtility (pstmt=0x1f54960, queryString=0x1f53d90 "REINDEX INDEX
pg_class_tblspc_relfilenode_index",context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1f54c40,
 
    qc=0x7ffde023bf80) at utility.c:929
#15 0x000000000093241f in ProcessUtility (pstmt=0x1f54960, queryString=0x1f53d90 "REINDEX INDEX
pg_class_tblspc_relfilenode_index",context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1f54c40,
qc=0x7ffde023bf80)
    at utility.c:524
#16 0x0000000000931288 in PortalRunUtility (portal=0x1fb5ac0, pstmt=0x1f54960, isTopLevel=true, setHoldSnapshot=false,
dest=0x1f54c40,qc=0x7ffde023bf80) at pquery.c:1157
 
#17 0x00000000009314a7 in PortalRunMulti (portal=0x1fb5ac0, isTopLevel=true, setHoldSnapshot=false, dest=0x1f54c40,
altdest=0x1f54c40,qc=0x7ffde023bf80) at pquery.c:1303
 
#18 0x00000000009309bc in PortalRun (portal=0x1fb5ac0, count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x1f54c40,altdest=0x1f54c40, qc=0x7ffde023bf80) at pquery.c:779
 
#19 0x000000000092ab5b in exec_simple_query (query_string=0x1f53d90 "REINDEX INDEX pg_class_tblspc_relfilenode_index")
atpostgres.c:1239
 
#20 0x000000000092eb82 in PostgresMain (argc=1, argv=0x1f7db80, dbname=0x1f509d8 "sentinel", username=0x1f7daa0
"pryzbyj")at postgres.c:4315
 
#21 0x000000000087f098 in BackendRun (port=0x1f75a80) at postmaster.c:4523
#22 0x000000000087e888 in BackendStartup (port=0x1f75a80) at postmaster.c:4215
#23 0x000000000087ae95 in ServerLoop () at postmaster.c:1727
#24 0x000000000087a76c in PostmasterMain (argc=5, argv=0x1f4e8e0) at postmaster.c:1400
#25 0x00000000007823f3 in main (argc=5, argv=0x1f4e8e0) at main.c:210

This appears to be an issue with d2d8a229b (Incremental Sort), so I will add
at: https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
James Coleman
Дата:
On Tue, Jul 28, 2020 at 11:10 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
>
> An internal server aborted last night while running a maintenance script.  I
> reproduced this easily running the crashing command in a loop, and verified
> this is a live issue on REL_13_STABLE (dc6f2fb43).
> REINDEX INDEX pg_class_tblspc_relfilenode_index
>
> It looks like this crashed once before, and I didn't notice until now:
> Fri Jun 26 22:30:29 CDT 2020: pg_shdescription: pg_toast.pg_toast_2396_index(reindex toast)...
> psql: error: could not connect to server: server closed the connection unexpectedly
>
> #3  0x0000000000afde98 in comparetup_index_btree (a=0x201fa58, b=0x201fa10, state=0x20147b0) at tuplesort.c:4251
> 4251            Assert(false);
> (gdb) l
> 4246                    if (pos1 != pos2)
> 4247                            return (pos1 < pos2) ? -1 : 1;
> 4248            }
> 4249
> 4250            /* ItemPointer values should never be equal */
> 4251            Assert(false);
> 4252
> 4253            return 0;
> 4254    }
> 4255
>
> #3  0x0000000000afde98 in comparetup_index_btree (a=0x201fa58, b=0x201fa10, state=0x20147b0) at tuplesort.c:4251
>         sortKey = 0x2014d60
>         tuple1 = 0x20189d8
>         tuple2 = 0x2018898
>         keysz = 2
>         tupDes = 0x7f48996b3790
>         equal_hasnull = false
>         nkey = 3
>         compare = 0
>         datum1 = 67999603
>         datum2 = 67999603
>         isnull1 = false
>         isnull2 = false
>         __func__ = "comparetup_index_btree"
>
> (gdb) p *tuple1
> $2 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 43}, t_info = 16}
> (gdb) p *tuple2
> $3 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 0}, ip_posid = 43}, t_info = 16}
>
> (gdb) p *sortKey
> $5 = {ssup_cxt = 0x40, ssup_collation = 0, ssup_reverse = false, ssup_nulls_first = false, ssup_attno = 0, ssup_extra
=0x0, comparator = 0x7f7f7f7f7f7f7f7f, abbreviate = 127, abbrev_converter = 0x7f7f7f7f7f7f7f7f,
 
>   abbrev_abort = 0x7f7f7f7f7f7f7f7f, abbrev_full_comparator = 0x7f7f7f7f7f7f7f7f}
>
> (gdb) p *tupDes
> $6 = {natts = 2, tdtypeid = 0, tdtypmod = -1, tdrefcount = 1, constr = 0x0, attrs = 0x7f48996b37a8}
>
> Core was generated by `postgres: postgres sentinel [local] REINDEX            '.
>
> (gdb) bt
> #0  0x00007f489853d1f7 in raise () from /lib64/libc.so.6
> #1  0x00007f489853e8e8 in abort () from /lib64/libc.so.6
> #2  0x0000000000aafff7 in ExceptionalCondition (conditionName=0xccd0dc "false", errorType=0xccc327 "FailedAssertion",
fileName=0xccc2fd"tuplesort.c", lineNumber=4251) at assert.c:67
 
> #3  0x0000000000afde98 in comparetup_index_btree (a=0x201fa58, b=0x201fa10, state=0x20147b0) at tuplesort.c:4251
> #4  0x0000000000af1d5e in qsort_tuple (a=0x201fa10, n=18, cmp_tuple=0xafcf21 <comparetup_index_btree>,
state=0x20147b0)at qsort_tuple.c:140
 
> #5  0x0000000000af2104 in qsort_tuple (a=0x201f710, n=50, cmp_tuple=0xafcf21 <comparetup_index_btree>,
state=0x20147b0)at qsort_tuple.c:191
 
> #6  0x0000000000af2104 in qsort_tuple (a=0x201cc38, n=544, cmp_tuple=0xafcf21 <comparetup_index_btree>,
state=0x20147b0)at qsort_tuple.c:191
 
> #7  0x0000000000af8056 in tuplesort_sort_memtuples (state=0x20147b0) at tuplesort.c:3490
> #8  0x0000000000af51a9 in tuplesort_performsort (state=0x20147b0) at tuplesort.c:1985
> #9  0x0000000000529418 in _bt_leafbuild (btspool=0x1f784e0, btspool2=0x0) at nbtsort.c:553
> #10 0x0000000000528f9c in btbuild (heap=0x1fb5758, index=0x7f48996b3460, indexInfo=0x1f77a48) at nbtsort.c:333
> #11 0x00000000005adcb3 in index_build (heapRelation=0x1fb5758, indexRelation=0x7f48996b3460, indexInfo=0x1f77a48,
isreindex=true,parallel=true) at index.c:2903
 
> #12 0x00000000005aec6b in reindex_index (indexId=3455, skip_constraint_checks=false, persistence=112 'p', options=2)
atindex.c:3539
 
> #13 0x0000000000692583 in ReindexIndex (indexRelation=0x1f54840, options=0, concurrent=false) at indexcmds.c:2466
> #14 0x0000000000932e36 in standard_ProcessUtility (pstmt=0x1f54960, queryString=0x1f53d90 "REINDEX INDEX
pg_class_tblspc_relfilenode_index",context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1f54c40,
 
>     qc=0x7ffde023bf80) at utility.c:929
> #15 0x000000000093241f in ProcessUtility (pstmt=0x1f54960, queryString=0x1f53d90 "REINDEX INDEX
pg_class_tblspc_relfilenode_index",context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, dest=0x1f54c40,
qc=0x7ffde023bf80)
>     at utility.c:524
> #16 0x0000000000931288 in PortalRunUtility (portal=0x1fb5ac0, pstmt=0x1f54960, isTopLevel=true,
setHoldSnapshot=false,dest=0x1f54c40, qc=0x7ffde023bf80) at pquery.c:1157
 
> #17 0x00000000009314a7 in PortalRunMulti (portal=0x1fb5ac0, isTopLevel=true, setHoldSnapshot=false, dest=0x1f54c40,
altdest=0x1f54c40,qc=0x7ffde023bf80) at pquery.c:1303
 
> #18 0x00000000009309bc in PortalRun (portal=0x1fb5ac0, count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x1f54c40,altdest=0x1f54c40, qc=0x7ffde023bf80) at pquery.c:779
 
> #19 0x000000000092ab5b in exec_simple_query (query_string=0x1f53d90 "REINDEX INDEX
pg_class_tblspc_relfilenode_index")at postgres.c:1239
 
> #20 0x000000000092eb82 in PostgresMain (argc=1, argv=0x1f7db80, dbname=0x1f509d8 "sentinel", username=0x1f7daa0
"pryzbyj")at postgres.c:4315
 
> #21 0x000000000087f098 in BackendRun (port=0x1f75a80) at postmaster.c:4523
> #22 0x000000000087e888 in BackendStartup (port=0x1f75a80) at postmaster.c:4215
> #23 0x000000000087ae95 in ServerLoop () at postmaster.c:1727
> #24 0x000000000087a76c in PostmasterMain (argc=5, argv=0x1f4e8e0) at postmaster.c:1400
> #25 0x00000000007823f3 in main (argc=5, argv=0x1f4e8e0) at main.c:210
>
> This appears to be an issue with d2d8a229b (Incremental Sort), so I will add
> at: https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items

Is that assumption largely based on the incremental sort patch
refactoring tuplesort.c a bit? I haven't looked at it much at all, but
I'm wondering if the issue could also be related to the btree
duplicates changes in 13 given that we're looking at
comparetup_index_btree and the datums are equal.

James



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Justin Pryzby
Дата:
On Tue, Jul 28, 2020 at 11:40:14AM -0400, James Coleman wrote:
> > This appears to be an issue with d2d8a229b (Incremental Sort), so I will add
> > at: https://wiki.postgresql.org/wiki/PostgreSQL_13_Open_Items
> 
> Is that assumption largely based on the incremental sort patch
> refactoring tuplesort.c a bit? I haven't looked at it much at all, but
> I'm wondering if the issue could also be related to the btree
> duplicates changes in 13 given that we're looking at
> comparetup_index_btree and the datums are equal.

Good point.  I'd looked at something like this to come to my tentative
conclusion.

git log --stat origin/REL_12_STABLE..origin/REL_13_STABLE -- src/backend/utils/sort/*tuple*c

-- 
Justin



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 8:40 AM James Coleman <jtc331@gmail.com> wrote:
> Is that assumption largely based on the incremental sort patch
> refactoring tuplesort.c a bit? I haven't looked at it much at all, but
> I'm wondering if the issue could also be related to the btree
> duplicates changes in 13 given that we're looking at
> comparetup_index_btree and the datums are equal.

It couldn't possibly be the deduplication patch. That didn't change
anything in tuplesort.c.

This is very likely to be related to incremental sort because it's a
use-after-free issue, which is the kind of thing that could only
really happen inside tuplesort.c. This is clear because some of the
variables have the tell-tale 0x7f7f7f pattern that we written by
CLOBBER_FREED_MEMORY builds when memory is freed.


-- 
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 8:47 AM Peter Geoghegan <pg@bowt.ie> wrote:
> This is very likely to be related to incremental sort because it's a
> use-after-free issue, which is the kind of thing that could only
> really happen inside tuplesort.c. This is clear because some of the
> variables have the tell-tale 0x7f7f7f pattern that we written by
> CLOBBER_FREED_MEMORY builds when memory is freed.

Actually, I changed my mind. The pointer variable sortKey within
comparetup_index_btree() has just been incremented in a way that makes
it point past the end of allocated memory, without being dereferenced.
That part is fine.

I still don't think that it's deduplication, though, because at the
point of the crash we haven't even reached _bt_load() yet. That is, we
haven't reached nbtsort.c code that is specific to Postgres 13 yet
(and besides, catalog indexes don't use deduplication in practice).

I wrote the assertion that fails here with the bug that I fixed in
commit 4974d7f87e62a58e80c6524e49677cb25cc10e12 in mind specifically.
That was a bug that involved a scan that returned duplicate tuples due
to a problem in heapam_index_build_range_scan() or all of the
infrastructure that it depends on (directly and indirectly). I wonder
if it's something like that -- this is also a system catalog index.

-- 
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 10:06 AM Peter Geoghegan <pg@bowt.ie> wrote:
> I wrote the assertion that fails here with the bug that I fixed in
> commit 4974d7f87e62a58e80c6524e49677cb25cc10e12 in mind specifically.
> That was a bug that involved a scan that returned duplicate tuples due
> to a problem in heapam_index_build_range_scan() or all of the
> infrastructure that it depends on (directly and indirectly). I wonder
> if it's something like that -- this is also a system catalog index.

It's starting to look more like that. I can reproduce the bug by
running the REINDEX in a tight loop while "make installcheck" runs. It
looks as if the two tuples passed to comparetup_index_btree() are
separate tuples that each point to the same heap TID.

I have an rr recording of this. It shouldn't take too long to figure
out what's going on...

-- 
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 10:37 AM Peter Geoghegan <pg@bowt.ie> wrote:
> It's starting to look more like that. I can reproduce the bug by
> running the REINDEX in a tight loop while "make installcheck" runs. It
> looks as if the two tuples passed to comparetup_index_btree() are
> separate tuples that each point to the same heap TID.

Evidently this is an old bug. The assertion that fails was added to
Postgres 12, but that isn't significant. The invariant we see violated
here has nothing to do with any of my B-Tree work -- it would have
been reasonable to add the same assertion to Postgres 9.5.

If I add the same assertion to 9.5 now, I find that the same steps
reproduce the problem -- "REINDEX INDEX
pg_class_tblspc_relfilenode_index" run in a tight loop connected to
the regression database, concurrent with a "make installcheck".

I still don't know what's going on here, but I find it suspicious that
some relevant tuples go through the HEAPTUPLE_INSERT_IN_PROGRESS +
!TransactionIdIsCurrentTransactionId() path of
heapam_index_build_range_scan(). After all, if this wasn't a system
catalog index then we'd expect to see "concurrent insert in progress
within table \"%s\"" WARNING output.

-- 
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 12:00 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I still don't know what's going on here, but I find it suspicious that
> some relevant tuples go through the HEAPTUPLE_INSERT_IN_PROGRESS +
> !TransactionIdIsCurrentTransactionId() path of
> heapam_index_build_range_scan(). After all, if this wasn't a system
> catalog index then we'd expect to see "concurrent insert in progress
> within table \"%s\"" WARNING output.

I also find it suspicious that I can no longer produce the assertion
failure once I force all non-unique system catalog indexes (such as
Justin's repro index, pg_class_tblspc_relfilenode_index) to go through
the HEAPTUPLE_INSERT_IN_PROGRESS +
!TransactionIdIsCurrentTransactionId() handling for unique indexes
shown here:

                        /*
                         * If we are performing uniqueness checks, indexing
                         * such a tuple could lead to a bogus uniqueness
                         * failure.  In that case we wait for the inserting
                         * transaction to finish and check again.
                         */
                        if (checking_uniqueness)
                        {
                            /*
                             * Must drop the lock on the buffer before we wait
                             */
                            LockBuffer(scan->rs_cbuf, BUFFER_LOCK_UNLOCK);
                            XactLockTableWait(xwait, heapRelation,
                                              &heapTuple->t_self,
                                              XLTW_InsertIndexUnique);
                            CHECK_FOR_INTERRUPTS();
                            goto recheck;
                        }

Commenting out "if (checking_uniqueness)" here at least *masks* the
bug. Seemingly by averting problems that the checking_uniqueness code
wasn't actually designed to solve. I imagine that this factor makes
the bug less serious in practice -- most system catalogs are unique
indexes.

Actually...was the code *originally* designed to avoid this issue?
Might that fact have been lost since HOT was first introduced? Commit
1ddc2703a93 changed some of the code in question to avoid deadlocks on
system catalogs with new-style VACUUM FULL. I wonder if it was a good
idea to not wait when we weren't checking_uniqueness following that
2010 commit, though -- we used to wait like this regardless of our
checking_uniqueness status.

(I understand that the real problem here may be the way that we can
release locks early for system catalogs, but let's start with
immediate causes.)

-- 
Peter Geoghegan



Peter Geoghegan <pg@bowt.ie> writes:
> I also find it suspicious that I can no longer produce the assertion
> failure once I force all non-unique system catalog indexes (such as
> Justin's repro index, pg_class_tblspc_relfilenode_index) to go through
> the HEAPTUPLE_INSERT_IN_PROGRESS +
> !TransactionIdIsCurrentTransactionId() handling for unique indexes
> shown here:

Hmm...

> Actually...was the code *originally* designed to avoid this issue?

No, I don't think so.  It was designed for the case of unique key X
being inserted immediately after a deletion of the same key.  The
deleted tuple is presumably not yet vacuumed-away, so the new tuple
should have a different TID.  In no case should we have multiple index
tuples pointing at the same TID; that would imply that somebody failed
to vacuuum away an old index entry before freeing up the heap TID.

Or, perhaps, REINDEX is somehow scanning the same TID twice, and
generating indeed-duplicate index entries?

            regards, tom lane



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 1:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> No, I don't think so.  It was designed for the case of unique key X
> being inserted immediately after a deletion of the same key.  The
> deleted tuple is presumably not yet vacuumed-away, so the new tuple
> should have a different TID.  In no case should we have multiple index
> tuples pointing at the same TID; that would imply that somebody failed
> to vacuuum away an old index entry before freeing up the heap TID.

It looks like one HOT chain. I think cases where the
visibility/HeapTupleSatisfiesVacuum() stuff somehow gets confused
could result in the same heap TID (which is actually the HOT chain's
root TID) getting indexed twice.

> Or, perhaps, REINDEX is somehow scanning the same TID twice, and
> generating indeed-duplicate index entries?

It's 100% clear that that's what happens from my rr recording (kind
of). A conditional breakpoint in _bt_build_callback() clearly shows
that it gets called twice for the same TID value (twice in immediate
succession). The first time it gets called in the
!HeapTupleIsHeapOnlyTuple() path, the second time in the
HeapTupleIsHeapOnlyTuple() path (i.e. the path that uses the
root_offsets array).

I notice that the root tuple of the hot chain is marked HEAP_COMBOCID
(and xmin == xmax for the HOT chain tuple). The xmin for the successor
(which matches xmin and xmax for root tuple) exactly matches the
REINDEX/crashing session's OldestXmin.

-- 
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 1:26 PM Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Jul 28, 2020 at 1:04 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > No, I don't think so.  It was designed for the case of unique key X
> > being inserted immediately after a deletion of the same key.  The
> > deleted tuple is presumably not yet vacuumed-away, so the new tuple
> > should have a different TID.  In no case should we have multiple index
> > tuples pointing at the same TID; that would imply that somebody failed
> > to vacuuum away an old index entry before freeing up the heap TID.
>
> It looks like one HOT chain.

The fact remains that this function (originally known as
IndexBuildHeapScan(), now heapam_index_build_range_scan()) did not
care about whether or not the index is unique for about 3 years
(excluding the tupleIsAlive stuff, which was always there, even before
HOT). The original HOT commit (commit 282d2a03dd3) said nothing about
unique indexes in the relevant path (the HEAPTUPLE_INSERT_IN_PROGRESS
+ !TransactionIdIsCurrentTransactionId() "concurrent system catalog
insert" path). The need to wait here really did seem to be all about
not getting duplicate TIDs (i.e. respecting the basic HOT invariant)
back in 2007.

-- 
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 2:46 PM Peter Geoghegan <pg@bowt.ie> wrote:
> The fact remains that this function (originally known as
> IndexBuildHeapScan(), now heapam_index_build_range_scan()) did not
> care about whether or not the index is unique for about 3 years
> (excluding the tupleIsAlive stuff, which was always there, even before
> HOT). The original HOT commit (commit 282d2a03dd3) said nothing about
> unique indexes in the relevant path (the HEAPTUPLE_INSERT_IN_PROGRESS
> + !TransactionIdIsCurrentTransactionId() "concurrent system catalog
> insert" path). The need to wait here really did seem to be all about
> not getting duplicate TIDs (i.e. respecting the basic HOT invariant)
> back in 2007.

I mentioned that the unique index aspect was added by commit 1ddc2703
in 2010 (the new-style VACUUM FULL deadlock commit that added the "if
(checking_uniqueness)" condition). Turns out that that had bugs that
were fixed in 2011's commit 520bcd9c9bb (at least I think so based on
a reading of the latter commit's commit message) -- though those were
in the DELETE_IN_PROGRESS case.

Perhaps 2011's commit 520bcd9c9bb missed similar
HEAPTUPLE_INSERT_IN_PROGRESS issues that manifest themselves within
Justin's test case now?

-- 
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Jul 28, 2020 at 3:09 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Perhaps 2011's commit 520bcd9c9bb missed similar
> HEAPTUPLE_INSERT_IN_PROGRESS issues that manifest themselves within
> Justin's test case now?

Any further thoughts on this, Tom?

This is clearly a live bug that we should fix before too long. I could
write the patch myself, but I would like to get your response to my
analysis before starting down that road.

--
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Tom Lane
Дата:
Peter Geoghegan <pg@bowt.ie> writes:
> Any further thoughts on this, Tom?

Sorry for slow response ... my ISP had an equipment failure that took
out my email service for most of a day.

> This is clearly a live bug that we should fix before too long. I could
> write the patch myself, but I would like to get your response to my
> analysis before starting down that road.

Yeah.  Looking at the code now, I note these relevant comments in
heapam_index_build_range_scan:

         * Also, although our opinions about tuple liveness could change while
         * we scan the page (due to concurrent transaction commits/aborts),
         * the chain root locations won't, so this info doesn't need to be
         * rebuilt after waiting for another transaction.
         *
         * Note the implied assumption that there is no more than one live
         * tuple per HOT-chain --- else we could create more than one index
         * entry pointing to the same root tuple.

The core of the issue seems to be that in the presence of concurrent
updates, we might not have a stable opinion of which entry of the HOT
chain is live, leading to trying to index multiple ones of them (using
the same root TID), which leads to the assertion failure.

Also relevant is 1ddc2703a93's commit-log comment that

    First, teach IndexBuildHeapScan to not wait for INSERT_IN_PROGRESS or
    DELETE_IN_PROGRESS tuples to commit unless the index build is checking
    uniqueness/exclusion constraints.  If it isn't, there's no harm in just
    indexing the in-doubt tuple.

I'm not sure if I was considering the HOT-chain case when I wrote that,
but "no harm" seems clearly wrong in that situation: indexing more than
one in-doubt chain member leads to having multiple index entries pointing
at the same HOT chain.  That could be really bad if they have distinct
index values (though we do not expect such a case to arise in a system
catalog, since broken HOT chains should never occur there).

>> Perhaps 2011's commit 520bcd9c9bb missed similar
>> HEAPTUPLE_INSERT_IN_PROGRESS issues that manifest themselves within
>> Justin's test case now?

In the light of this, it bothers me that the DELETE_IN_PROGRESS case
has an exception for HOT chains:

                        if (checking_uniqueness ||
                            HeapTupleIsHotUpdated(heapTuple))
                            // wait

while the INSERT_IN_PROGRESS case has none.  Simple symmetry
would suggest that the INSERT_IN_PROGRESS case should be

                        if (checking_uniqueness ||
                            HeapTupleIsHeapOnly(heapTuple))
                            // wait

but I'm not 100% convinced that that's right.

            regards, tom lane



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Peter Geoghegan
Дата:
On Tue, Aug 4, 2020 at 1:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The core of the issue seems to be that in the presence of concurrent
> updates, we might not have a stable opinion of which entry of the HOT
> chain is live, leading to trying to index multiple ones of them (using
> the same root TID), which leads to the assertion failure.

I agree with that assessment. FWIW, I believe that contrib/amcheck
will detect this issue on Postgres 12+. If it happened all that often
then we probably would have heard about it by now.

BTW, I backpatched the assertion that fails. All branches have it now.
It might not help, but it certainly can't hurt.

> I'm not sure if I was considering the HOT-chain case when I wrote that,
> but "no harm" seems clearly wrong in that situation: indexing more than
> one in-doubt chain member leads to having multiple index entries pointing
> at the same HOT chain.  That could be really bad if they have distinct
> index values (though we do not expect such a case to arise in a system
> catalog, since broken HOT chains should never occur there).

I think that it might accidentally be okay for those reasons, though I
have a hard time imagining that that's what you meant back then. I
doubt that the exact consequences of the problem will affect what the
fix looks like now, so this may be somewhat of an academic question.

> In the light of this, it bothers me that the DELETE_IN_PROGRESS case
> has an exception for HOT chains:
>
>                         if (checking_uniqueness ||
>                             HeapTupleIsHotUpdated(heapTuple))
>                             // wait
>
> while the INSERT_IN_PROGRESS case has none.  Simple symmetry
> would suggest that the INSERT_IN_PROGRESS case should be
>
>                         if (checking_uniqueness ||
>                             HeapTupleIsHeapOnly(heapTuple))
>                             // wait

I had exactly the same intuition.

> but I'm not 100% convinced that that's right.

Why doubt that explanation?

As I've said, it's clear that the original HOT commit imagined that
this wait business was all about avoiding confusion about which heap
tuple to index for the HOT chain -- nothing more or less than that.
The simplest explanation seems to be that 1ddc2703a93 missed that
subtlety. When some (though not all) of the problems came to light a
few years later, 520bcd9c9bb didn't go far enough. We know that
1ddc2703a93 got the DELETE_IN_PROGRESS stuff wrong -- why doubt that
it also got the INSERT_IN_PROGRESS stuff wrong?

--
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Tom Lane
Дата:
Peter Geoghegan <pg@bowt.ie> writes:
> On Tue, Aug 4, 2020 at 1:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> while the INSERT_IN_PROGRESS case has none.  Simple symmetry
>> would suggest that the INSERT_IN_PROGRESS case should be
>> 
>>     if (checking_uniqueness ||
>>         HeapTupleIsHeapOnly(heapTuple))
>>         // wait

> I had exactly the same intuition.

>> but I'm not 100% convinced that that's right.

> Why doubt that explanation?

First, it's not clear that this is an exact inverse, because
HeapTupleIsHotUpdated does more than check the HOT_UPDATED flag.
Second, I think there remains some doubt as to whether the
DELETE_IN_PROGRESS case is right either.  If we were forcing
a wait for *every* in-doubt HOT-chain element, not only non-last
ones (or non-first ones for the INSERT case, if we use the above
fix) then it'd be quite clear that we're safe.  But if we want
to keep the optimization then I think maybe closer analysis is
warranted.

            regards, tom lane



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Justin Pryzby
Дата:
I have nothing new to add, but wanted to point out this is still an issue.

This is on the v13 Opened Items list - for lack of anywhere else to put them, I
also added two other, unresolved issues.

https://wiki.postgresql.org/index.php?title=PostgreSQL_13_Open_Items&type=revision&diff=35624&oldid=35352

On Tue, Aug 04, 2020 at 06:00:34PM -0400, Tom Lane wrote:
> Peter Geoghegan <pg@bowt.ie> writes:
> > On Tue, Aug 4, 2020 at 1:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> while the INSERT_IN_PROGRESS case has none.  Simple symmetry
> >> would suggest that the INSERT_IN_PROGRESS case should be
> >> 
> >>     if (checking_uniqueness ||
> >>         HeapTupleIsHeapOnly(heapTuple))
> >>         // wait
> 
> > I had exactly the same intuition.
> 
> >> but I'm not 100% convinced that that's right.
> 
> > Why doubt that explanation?
> 
> First, it's not clear that this is an exact inverse, because
> HeapTupleIsHotUpdated does more than check the HOT_UPDATED flag.
> Second, I think there remains some doubt as to whether the
> DELETE_IN_PROGRESS case is right either.  If we were forcing
> a wait for *every* in-doubt HOT-chain element, not only non-last
> ones (or non-first ones for the INSERT case, if we use the above
> fix) then it'd be quite clear that we're safe.  But if we want
> to keep the optimization then I think maybe closer analysis is
> warranted.



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Tom Lane
Дата:
Justin Pryzby <pryzby@telsasoft.com> writes:
> I have nothing new to add, but wanted to point out this is still an issue.
> This is on the v13 Opened Items list - for lack of anywhere else to put them, I
> also added two other, unresolved issues.

It's probably time to make a v14 open items page, and move anything
you want to treat as a live issue to there.

            regards, tom lane



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
Robins Tharakan
Дата:
> Justin Pryzby <pryzby@telsasoft.com> writes:
> > I have nothing new to add, but wanted to point out this is still an issue.
> > This is on the v13 Opened Items list - for lack of anywhere else to put them, I
> > also added two other, unresolved issues.

Two minor things to add:
1. This issue is still reproducible on 15Beta2 (c1d033fcb5) - Backtrace here [2]
2. There was a mention that amcheck could throw up errors, but despite quickly
stopping the workload, I didn't find anything interesting [1].


(gdb) frame 3
#3  0x000055bf9fe496c8 in comparetup_index_btree (a=0x7f0a1a50ba80,
b=0x7f0a1a50b9d8, state=0x55bfa19ce960) at tuplesort.c:4454
4454            Assert(false);

(gdb) info locals
sortKey = 0x55bfa19cef10
tuple1 = 0x7f0a1a563ee0
tuple2 = 0x7f0a1a5642a0
keysz = 2
tupDes = 0x7f0a1a9e66a8
equal_hasnull = false
nkey = 3
compare = 0
datum1 = 2085305
datum2 = 2085305
isnull1 = false
isnull2 = false
__func__ = "comparetup_index_btree"

(gdb) p *tuple1
$5 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 205}, ip_posid = 3}, t_info = 16}

(gdb) p *tuple2
$9 = {t_tid = {ip_blkid = {bi_hi = 0, bi_lo = 205}, ip_posid = 3}, t_info = 16}

(gdb) p *sortKey
$7 = {ssup_cxt = 0x40, ssup_collation = 0, ssup_reverse = false,
ssup_nulls_first = false, ssup_attno = 0, ssup_extra = 0x0, comparator
= 0x7f7f7f7f7f7f7f7f, abbreviate = 127,
  abbrev_converter = 0x7f7f7f7f7f7f7f7f, abbrev_abort =
0x7f7f7f7f7f7f7f7f, abbrev_full_comparator = 0x7f7f7f7f7f7f7f7f}

(gdb) p *tupDes
$8 = {natts = 2, tdtypeid = 2249, tdtypmod = -1, tdrefcount = 1,
constr = 0x0, attrs = 0x7f0a1a9e66c0}

Reference:
1) postgres=# select
bt_index_parent_check('pg_class_tblspc_relfilenode_index', true,
true);
 bt_index_parent_check
-----------------------

(1 row)

postgres=# select bt_index_check('pg_class_tblspc_relfilenode_index', true);
 bt_index_check
----------------

(1 row)


2) Backtrace -
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007f0a25692859 in __GI_abort () at abort.c:79
#2  0x000055bf9fdf1718 in ExceptionalCondition
(conditionName=0x55bfa0036a0b "false", errorType=0x55bfa0035d5e
"FailedAssertion", fileName=0x55bfa0035dbd "tuplesort.c",
lineNumber=4454)
    at assert.c:69
#3  0x000055bf9fe496c8 in comparetup_index_btree (a=0x7f0a1a50ba80,
b=0x7f0a1a50b9d8, state=0x55bfa19ce960) at tuplesort.c:4454
#4  0x000055bf9fe40901 in qsort_tuple (data=0x7f0a1a50b8e8, n=13,
compare=0x55bf9fe484ab <comparetup_index_btree>, arg=0x55bfa19ce960)
at ../../../../src/include/lib/sort_template.h:341
#5  0x000055bf9fe40b2f in qsort_tuple (data=0x7f0a1a50b3a8, n=61,
compare=0x55bf9fe484ab <comparetup_index_btree>, arg=0x55bfa19ce960)
at ../../../../src/include/lib/sort_template.h:378
#6  0x000055bf9fe40b9f in qsort_tuple (data=0x7f0a1a509e78, n=343,
compare=0x55bf9fe484ab <comparetup_index_btree>, arg=0x55bfa19ce960)
at ../../../../src/include/lib/sort_template.h:392
#7  0x000055bf9fe40b2f in qsort_tuple (data=0x7f0a1a509e78, n=833,
compare=0x55bf9fe484ab <comparetup_index_btree>, arg=0x55bfa19ce960)
at ../../../../src/include/lib/sort_template.h:378
#8  0x000055bf9fe40b9f in qsort_tuple (data=0x7f0a1a4d9050, n=2118,
compare=0x55bf9fe484ab <comparetup_index_btree>, arg=0x55bfa19ce960)
at ../../../../src/include/lib/sort_template.h:392
#9  0x000055bf9fe46df8 in tuplesort_sort_memtuples
(state=0x55bfa19ce960) at tuplesort.c:3698
#10 0x000055bf9fe44043 in tuplesort_performsort (state=0x55bfa19ce960)
at tuplesort.c:2217
#11 0x000055bf9f783a85 in _bt_leafbuild (btspool=0x55bfa1913318,
btspool2=0x0) at nbtsort.c:559
#12 0x000055bf9f7835a6 in btbuild (heap=0x7f0a1a9df940,
index=0x7f0a1a9e2898, indexInfo=0x55bfa19bc740) at nbtsort.c:336
#13 0x000055bf9f81c8cc in index_build (heapRelation=0x7f0a1a9df940,
indexRelation=0x7f0a1a9e2898, indexInfo=0x55bfa19bc740,
isreindex=true, parallel=true) at index.c:3018
#14 0x000055bf9f81dbe6 in reindex_index (indexId=3455,
skip_constraint_checks=false, persistence=112 'p',
params=0x7ffcfa60a524) at index.c:3718
#15 0x000055bf9f925148 in ReindexIndex (indexRelation=0x55bfa18f09a0,
params=0x7ffcfa60a598, isTopLevel=true) at indexcmds.c:2727
#16 0x000055bf9f924f67 in ExecReindex (pstate=0x55bfa1913070,
stmt=0x55bfa18f09f8, isTopLevel=true) at indexcmds.c:2651
#17 0x000055bf9fc3397f in standard_ProcessUtility
(pstmt=0x55bfa18f0d48, queryString=0x55bfa18eff30 "REINDEX INDEX
pg_class_tblspc_relfilenode_index;", readOnlyTree=false,
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x55bfa18f0e38, qc=0x7ffcfa60ad20) at utility.c:960
#18 0x00007f0a251d6887 in pgss_ProcessUtility (pstmt=0x55bfa18f0d48,
queryString=0x55bfa18eff30 "REINDEX INDEX
pg_class_tblspc_relfilenode_index;", readOnlyTree=false,
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x55bfa18f0e38, qc=0x7ffcfa60ad20) at pg_stat_statements.c:1143
#19 0x000055bf9fc32d34 in ProcessUtility (pstmt=0x55bfa18f0d48,
queryString=0x55bfa18eff30 "REINDEX INDEX
pg_class_tblspc_relfilenode_index;", readOnlyTree=false,
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0,
dest=0x55bfa18f0e38, qc=0x7ffcfa60ad20) at utility.c:526
#20 0x000055bf9fc3180e in PortalRunUtility (portal=0x55bfa197d020,
pstmt=0x55bfa18f0d48, isTopLevel=true, setHoldSnapshot=false,
dest=0x55bfa18f0e38, qc=0x7ffcfa60ad20) at pquery.c:1158
#21 0x000055bf9fc31a84 in PortalRunMulti (portal=0x55bfa197d020,
isTopLevel=true, setHoldSnapshot=false, dest=0x55bfa18f0e38,
altdest=0x55bfa18f0e38, qc=0x7ffcfa60ad20) at pquery.c:1315
#22 0x000055bf9fc30ef1 in PortalRun (portal=0x55bfa197d020,
count=9223372036854775807, isTopLevel=true, run_once=true,
dest=0x55bfa18f0e38, altdest=0x55bfa18f0e38, qc=0x7ffcfa60ad20)
    at pquery.c:791
#23 0x000055bf9fc2a14f in exec_simple_query
(query_string=0x55bfa18eff30 "REINDEX INDEX
pg_class_tblspc_relfilenode_index;") at postgres.c:1250
#24 0x000055bf9fc2ecdf in PostgresMain (dbname=0x55bfa1923be0
"postgres", username=0x55bfa18eb8f8 "ubuntu") at postgres.c:4544
#25 0x000055bf9fb52e93 in BackendRun (port=0x55bfa19218a0) at postmaster.c:4504
#26 0x000055bf9fb52778 in BackendStartup (port=0x55bfa19218a0) at
postmaster.c:4232
#27 0x000055bf9fb4ea5e in ServerLoop () at postmaster.c:1806
#28 0x000055bf9fb4e1f7 in PostmasterMain (argc=3, argv=0x55bfa18e9830)
at postmaster.c:1478
#29 0x000055bf9fa3f864 in main (argc=3, argv=0x55bfa18e9830) at main.c:202

-
Robins Tharakan
Amazon Web Services



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
"Andrey M. Borodin"
Дата:

> On 29 Jun 2022, at 17:43, Robins Tharakan <tharakan@gmail.com> wrote:


Sorry to bump ancient thread, I have some observations that might or might not be relevant.
Recently we noticed a corruption on one of clusters. The corruption at hand is not in system catalog, but in user
indexes.
The cluster was correctly configured: checksums, fsync, FPI etc.
The cluster never was restored from a backup. It’s a single-node cluster, so it was not ever promoted, pg_rewind-ed
etc.VM had never been rebooted. 

But, the cluster had been experiencing 10 OOMs a day. There were no torn pages, no checsum erros at log at all. Yet,
B-treeindexes became corrupted. 


Sorry for this wall of text, I’m posing everything as-is in case if there is some useful information.

$ /etc/cron.yandex/pg_corruption_check.py --index
2024-03-01 11:54:05,075 ERROR : Corrupted index: 96009 table1_table1message_table1_team_identity_06a95642 XX002 ERROR:
postinglist contains misplaced TID in index "table1_table1message_table1_team_identity_06a95642" DETAIL: Index
tid=(267,34)posting list offset=137 page lsn=31B/62159608. 
2024-03-01 11:54:05,100 ERROR : Corrupted index: 96008 table1_table1message_organization_id_66c18ed2 XX002 ERROR:
postinglist contains misplaced TID in index "table1_table1message_organization_id_66c18ed2" DETAIL: Index tid=(267,34)
postinglist offset=137 page lsn=31B/62158BC8. 
2024-03-01 11:54:05,355 ERROR : Corrupted index: 95804 table2_aler_channel_81aeec_idx XX002 ERROR: posting list
containsmisplaced TID in index "table2_aler_channel_81aeec_idx" DETAIL: Index tid=(336,7) posting list offset=182 page
lsn=314/9B794248.
2024-03-01 11:54:05,716 ERROR : Corrupted index: 95816 table2_table3_channel_id_91a1912f XX002 ERROR: posting list
containsmisplaced TID in index "table2_table3_channel_id_91a1912f" DETAIL: Index tid=(384,2) posting list offset=72
pagelsn=317/3F14F390. 
2024-03-01 11:54:06,068 ERROR : Corrupted index: 95815 table2_table3_channel_filter_id_6706c8b6 XX002 ERROR: posting
listcontains misplaced TID in index "table2_table3_channel_filter_id_6706c8b6" DETAIL: Index tid=(380,2) posting list
offset=72page lsn=317/3F0D8E30. 
2024-03-01 11:54:06,302 ERROR : Corrupted index: 95824 table2_table3_root_alert_group_id_f327f122 XX002 ERROR: item
orderinvariant violated for index "table2_table3_root_alert_group_id_f327f122" DETAIL: Lower index tid=(368,204)
(pointsto heap tid=(48901,2)) higher index tid=(368,205) (points to heap tid=(48901,2)) page lsn=319/3C234588. 
2024-03-01 11:54:06,538 ERROR : Corrupted index: 95810 table2_table3_acknowledged_by_user_id_dd6723dc XX002 ERROR:
postinglist contains misplaced TID in index "table2_table3_acknowledged_by_user_id_dd6723dc" DETAIL: Index tid=(380,69)
postinglist offset=35 page lsn=317/C14E2D50. 
2024-03-01 11:54:06,775 ERROR : Corrupted index: 95825 table2_table3_silenced_by_user_id_40a833a1 XX002 ERROR: posting
listcontains misplaced TID in index "table2_table3_silenced_by_user_id_40a833a1" DETAIL: Index tid=(371,11) posting
listoffset=144 page lsn=318/61171918. 
2024-03-01 11:54:07,009 ERROR : Corrupted index: 95829 table2_table3_wiped_by_id_4326ff61 XX002 ERROR: item order
invariantviolated for index "table2_table3_wiped_by_id_4326ff61" DETAIL: Lower index tid=(373,97) (points to heap
tid=(48901,2))higher index tid=(373,98) (points to heap tid=(48901,2)) page lsn=318/61172788. 
2024-03-01 11:54:07,245 ERROR : Corrupted index: 95823 table2_table3_resolved_by_user_id_463cdf3d XX002 ERROR: posting
listcontains misplaced TID in index "table2_table3_resolved_by_user_id_463cdf3d" DETAIL: Index tid=(375,89) posting
listoffset=144 page lsn=319/3C1DCFC8. 
2024-03-01 11:54:07,479 ERROR : Corrupted index: 95819 table2_table3_maintenance_uuid_9a7b8529_like XX002 ERROR: item
orderinvariant violated for index "table2_table3_maintenance_uuid_9a7b8529_like" DETAIL: Lower index tid=(372,4)
(pointsto heap tid=(48901,2)) higher index tid=(372,5) (points to heap tid=(48901,2)) page lsn=317/C1A210A8. 
2024-03-01 11:54:07,717 ERROR : Corrupted index: 95827 table2_table3_table1_message_id_58a31784_like XX002 ERROR:
postinglist contains misplaced TID in index "table2_table3_table1_message_id_58a31784_like" DETAIL: Index tid=(373,89)
postinglist offset=144 page lsn=319/3C3EE660. 
2024-03-01 11:54:08,162 ERROR : Corrupted index: 96066 webhooks_webhookresponse_webhook_id_db49ebcd XX002 ERROR: item
orderinvariant violated for index "webhooks_webhookresponse_webhook_id_db49ebcd" DETAIL: Lower index tid=(522,24)
(pointsto heap tid=(73981,1)) higher index tid=(522,25) (points to heap tid=(73981,1)) page lsn=31B/E522B640. 
2024-03-01 11:54:08,646 ERROR : Corrupted index: 95822 table2_table3_resolved_by_alert_id_bbdf0a83 XX002 ERROR: posting
listcontains misplaced TID in index "table2_table3_resolved_by_alert_id_bbdf0a83" DETAIL: Index tid=(618,2) posting
listoffset=150 page lsn=317/C1DE74B8. 
2024-03-01 11:54:08,873 ERROR : Corrupted index: 95427 table2_table3_table1_message_id_key XX002 ERROR: item order
invariantviolated for index "table2_table3_table1_message_id_key" DETAIL: Lower index tid=(369,134) (points to heap
tid=(48901,2))higher index tid=(369,135) (points to heap tid=(48901,2)) page lsn=319/3B629E58. 
2024-03-01 11:54:09,108 ERROR : Corrupted index: 95417 table2_table3_maintenance_uuid_key XX002 ERROR: posting list
containsmisplaced TID in index "table2_table3_maintenance_uuid_key" DETAIL: Index tid=(371,42) posting list offset=47
pagelsn=318/6116FC50. 
2024-03-01 11:54:10,180 ERROR : Corrupted index: 95826 table2_table3_table1_log_message_id_587aaa8d_like XX002 ERROR:
postinglist contains misplaced TID in index "table2_table3_table1_log_message_id_587aaa8d_like" DETAIL: Index
tid=(849,19)posting list offset=79 page lsn=319/3C389B60. 
2024-03-01 11:54:10,689 ERROR : Corrupted index: 95820 table2_table3_mattermost_log_message_id_69bc2ae4_like XX002
ERROR:item order invariant violated for index "table2_table3_mattermost_log_message_id_69bc2ae4_like" DETAIL: Lower
indextid=(559,4) (points to heap tid=(48901,2)) higher index tid=(559,5) (points to heap tid=(48901,2)) page
lsn=317/C1A7BA50.
2024-03-01 11:54:11,760 ERROR : Corrupted index: 95425 table2_table3_table1_log_message_id_key XX002 ERROR: item order
invariantviolated for index "table2_table3_table1_log_message_id_key" DETAIL: Lower index tid=(849,22) (points to heap
tid=(48901,2))higher index tid=(849,23) (points to heap tid=(48901,2)) page lsn=317/3E7EC1F0. 
2024-03-01 11:54:12,282 ERROR : Corrupted index: 95419 table2_table3_mattermost_log_message_id_key XX002 ERROR: posting
listcontains misplaced TID in index "table2_table3_mattermost_log_message_id_key" DETAIL: Index tid=(566,84) posting
listoffset=65 page lsn=319/3B1901F8. 
2024-03-01 11:54:17,990 ERROR : Corrupted index: 95423 table2_table3_public_primary_key_key XX002 ERROR: cross page
itemorder invariant violated for index "table2_table3_public_primary_key_key" DETAIL: Last item on page tid=(727,146)
pagelsn=31B/E104D660. 


Most of these messages look similar, except last one: “cross page item order invariant violated for index”. Indeed,
indexscans were hanging in a cycle. 
I could not locate problem in WAL yet, because a lot of other stuff is going on. But I have no other ideas, but suspect
thatposting list redo is corrupting index in case of a crash. 

Thanks!


Best regards, Andrey Borodin.


On Thu, Mar 21, 2024 at 2:16 AM Andrey M. Borodin <x4mmm@yandex-team.ru> wrote:
> Most of these messages look similar, except last one: “cross page item order invariant violated for index”. Indeed,
indexscans were hanging in a cycle. 
> I could not locate problem in WAL yet, because a lot of other stuff is going on. But I have no other ideas, but
suspectthat posting list redo is corrupting index in case of a crash. 

Some of these errors seem unrelated to posting lists. For example, this one:

2024-03-01 11:54:08,162 ERROR : Corrupted index: 96066
webhooks_webhookresponse_webhook_id_db49ebcd XX002 ERROR: item order
invariant violated for index
"webhooks_webhookresponse_webhook_id_db49ebcd" DETAIL: Lower index
tid=(522,24) (points to heap tid=(73981,1)) higher index tid=(522,25)
(points to heap tid=(73981,1)) page lsn=31B/E522B640.

Notice that there are duplicate heap TIDs here, but no posting list.
This is almost certainly a symptom of heap related corruption -- often
a problem with recovery. Do the posting lists that are corrupt
(reported on elsewhere) also have duplicate TIDs?

Such problems tend to first get noticed when inserts fail with
"posting list split failed" errors -- but that's just a symptom. It
just so happens that the hardening added to places like
_bt_swap_posting() and _bt_binsrch_insert() is much more likely to
visibly break than anything else, at least in practice.

--
Peter Geoghegan



Re: 13dev failed assert: comparetup_index_btree(): ItemPointer values should never be equal

От
"Andrey M. Borodin"
Дата:

> On 21 Mar 2024, at 18:54, Peter Geoghegan <pg@bowt.ie> wrote:
>  Do the posting lists that are corrupt
> (reported on elsewhere) also have duplicate TIDs?

I do not have access now, but AFAIR yes.
Thanks for pointers!

BTW there were also some errors in logs like
ERROR:  index "tablename" contains unexpected zero page at block 1265985 HINT:
and even
MultiXactId 34043703 has not been created yet -- apparent wraparound
"right sibling's left-link doesn't match: right sibling 4 of target 2 with leafblkno 2 and scanblkno 2 spuriously links
tonon-target 1 on level 0 of index "indexname" 

Multixact problem was fixed by vacuum freeze, other indexes were repacked.


> On 21 Mar 2024, at 20:21, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>
> Would you happen to have a PostgreSQL version number (or commit hash)
> to help debugging? Has it always had that PG version, or has the
> version been upgraded?

Vanilla 14.11 (14.10 when created).

> Considering the age of this thread, and thus potential for v14 pre-.4:
> Did this cluster use REINDEX (concurrently) for the relevant indexes?


As now I see, chances are my case is not related to the original thread.

Thanks!


Best regards, Andrey Borodin.