Re: "PANIC: could not open critical system index 2662" - twice

On Sun, May 07, 2023 at 10:30:52PM +1200, Thomas Munro wrote:
> Bug-in-PostgreSQL explanations could include that we forgot it was
> dirty, or some backend wrote it out to the wrong file; but if we were
> forgetting something like permanent or dirty, would there be a more
> systematic failure?  Oh, it could require special rare timing if it is
> similar to 8a8661828's confusion about permanence level or otherwise
> somehow not setting BM_PERMANENT, but in the target blocks, so I think
> that'd require a checkpoint AND a crash.  It doesn't reproduce for me,
> but perhaps more unlucky ingredients are needed.
>
> Bug-in-OS/FS explanations could include that a whole lot of writes
> were mysteriously lost in some time window, so all those files still
> contain the zeroes we write first in smgrextend().  I guess this
> previously rare (previously limited to hash indexes?) use of sparse
> file hole-punching could be a factor in an it's-all-ZFS's-fault
> explanation:

Yes, you would need a bit of all that.

I can reproduce the same backtrace here.  That's just my usual laptop
with ext4, so this would be a Postgres bug.  First, here are the four
things running in parallel so that I can get a failure in loading a
critical index when connecting (a rough driver putting them together is
sketched after the list):
1) Create and drop a database with the WAL_LOG strategy and the
regression database as template:
while true; do
  createdb --template=regression --strategy=wal_log testdb;
  dropdb testdb;
done
2) Feed more data into pg_class in the middle, while testing connections
to the database just created:
while true; do
  psql -c 'create table popo as select 1 as a;' regression > /dev/null 2>&1;
  psql testdb -c "select 1" > /dev/null 2>&1;
  psql -c 'drop table popo' regression > /dev/null 2>&1;
  psql testdb -c "select 1" > /dev/null 2>&1;
done
3) Force some checkpoints:
while true; do psql -c 'checkpoint' > /dev/null 2>&1; sleep 4; done
4) Force a few crashes and recoveries:
while true ; do pg_ctl stop -m immediate ; pg_ctl start ; sleep 4 ; done
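
For what it's worth, here is a rough driver putting the four loops above
together.  This is only a sketch: it assumes PGDATA is set, that pg_ctl
and the client binaries are in PATH, and that the regression database
already exists (for instance from a previous installcheck run); none of
that is spelled out in the steps themselves.

#!/bin/bash
# Sketch: run the four reproduction loops in parallel.  Assumes PGDATA is
# set, pg_ctl/psql/createdb/dropdb are in PATH, and a "regression"
# database already exists.

# 1) Create/drop a database with the WAL_LOG strategy, using regression
# as the template.
(while true; do
  createdb --template=regression --strategy=wal_log testdb
  dropdb testdb
done) &

# 2) Churn pg_class while connecting to the freshly created database.
(while true; do
  psql -c 'create table popo as select 1 as a;' regression > /dev/null 2>&1
  psql testdb -c "select 1" > /dev/null 2>&1
  psql -c 'drop table popo' regression > /dev/null 2>&1
  psql testdb -c "select 1" > /dev/null 2>&1
done) &

# 3) Force regular checkpoints.
(while true; do psql -c 'checkpoint' > /dev/null 2>&1; sleep 4; done) &

# 4) Crash and restart the server periodically.
(while true; do pg_ctl stop -m immediate; pg_ctl start; sleep 4; done) &

wait

With all of that running, connections to testdb eventually hit the PANIC
shown below.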

And here is one backtrace:
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at
./nptl/pthread_kill.c:44
#1  0x00007f8a395cad2f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00007f8a3957bef2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007f8a39566472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x000055f0b36b95ac in errfinish (filename=0x55f0b38d4f68 "relcache.c", lineno=4335, funcname=0x55f0b38d6500
<__func__.12> "load_critical_index") at elog.c:604
#5  0x000055f0b36a9c64 in load_critical_index (indexoid=2662, heapoid=1259) at relcache.c:4335
#6  0x000055f0b36a9712 in RelationCacheInitializePhase3 () at relcache.c:4110
(gdb) up 5
#5  0x000055f0b36a9c64 in load_critical_index (indexoid=2662, heapoid=1259) at relcache.c:4335
4335            elog(PANIC, "could not open critical system index %u", indexoid);

You can also get failures with btree deduplication because of the CTAS
running above, by the way, but that's just the tip of the iceberg once
the state is corrupted:
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at
./nptl/pthread_kill.c:44
#1  0x00007fcae38abd2f in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2  0x00007fcae385cef2 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007fcae3847472 in __GI_abort () at ./stdlib/abort.c:79
#4  0x00005556a809f584 in ExceptionalCondition (conditionName=0x5556a812b639 "false", fileName=0x5556a812abc8
"heapam.c", lineNumber=7882) at assert.c:66
#5  0x00005556a79e13db in index_delete_sort_cmp (deltid1=0x5556a9682158, deltid2=0x7ffdb62c7088) at heapam.c:7882
#6  0x00005556a79e14f6 in index_delete_sort (delstate=0x7ffdb62c8350) at heapam.c:7923
#7  0x00005556a79e0cd0 in heap_index_delete_tuples (rel=0x7fcae0f1a9a8, delstate=0x7ffdb62c8350) at heapam.c:7580
#8  0x00005556a7a07be4 in table_index_delete_tuples (rel=0x7fcae0f1a9a8, delstate=0x7ffdb62c8350) at
../../../../src/include/access/tableam.h:1337
#9  0x00005556a7a0a7a3 in _bt_delitems_delete_check (rel=0x7fcae0f231c8, buf=179, heapRel=0x7fcae0f1a9a8,
delstate=0x7ffdb62c8350) at nbtpage.c:1541
#10 0x00005556a7a07545 in _bt_simpledel_pass (rel=0x7fcae0f231c8, buffer=179, heapRel=0x7fcae0f1a9a8,
deletable=0x7ffdb62c8430, ndeletable=160, newitem=0x5556a9689260,

Anyway, you would be able to see that with b3e184a (pretty much the same
as 15.2), without 8a86618 included.  Once 8a86618 is included, all these
steps are stable in the backend, at least here.  Or do we have some
low-hanging fruit with the WAL_LOG strategy?  That's always possible, of
course, but this looks like the same issue to me, just showing up with a
different symptom.
--
Michael
