Re: BUG #17695: Failed Assert in logical replication snapbuild.

From: Alexander Lakhin
Subject: Re: BUG #17695: Failed Assert in logical replication snapbuild.
Msg-id: 7e4d4a80-3e3c-231f-f886-6cada2aa582b@gmail.com
In reply to: Re: BUG #17695: Failed Assert in logical replication snapbuild.  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses: Re: BUG #17695: Failed Assert in logical replication snapbuild.  (Masahiko Sawada <sawada.mshk@gmail.com>)
List: pgsql-bugs

Hello Sawada-san,

17.05.2023 08:34, Masahiko Sawada wrote:
>
> When it comes to the original issue, I already shared the reproducible
> steps[4] and I've confirmed again with the steps that the issue still
> happens on 14 or later and the patch . However I don't find a way to
> reproduce it without sleep/gdb attach.

I can easily (without gdb and sleep()) reproduce the issue on master with
the following script:
numclients=10
rm -rf contrib/test_decoding_*
for ((c=1;c<=numclients;c++)); do
   cp -r contrib/test_decoding contrib/test_decoding_$c
done

for ((c=1;c<=numclients;c++)); do
   EXTRA_REGRESS_OPTS="--dbname=regress_$c" make -s installcheck-force -C contrib/test_decoding_$c USE_MODULE_DB=1 >"installcheck-$c.log" 2>&1 &
done
wait

It leads to:
TRAP: failed Assert("builder->next_phase_at == InvalidTransactionId"), File: "snapbuild.c", Line: 1628, PID: 907918
...
2023-05-18 16:23:33.290 MSK [907502] LOG:  server process (PID 907918) was terminated by signal 6: Aborted
2023-05-18 16:23:33.290 MSK [907502] DETAIL:  Failed process was running: SELECT count(*) FROM pg_logical_slot_get_changes('regression_slot_stats1', NULL, NULL, 'skip-empty-xacts', '1');
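
Since the failure is timing-dependent, a run may need to be repeated, and it helps to check the server log mechanically. Below is a minimal sketch of such a check. The helper name count_assert_traps and the log file name are hypothetical, not part of the script above; pass whatever file your server actually logs to.

```shell
# Hypothetical helper: print how many assertion failures a server log contains.
# It only matches the "TRAP: failed Assert" prefix shown in the log excerpt above.
count_assert_traps() {
    # grep -c prints the number of matching lines (0 when there are none)
    grep -c 'TRAP: failed Assert' "$1"
}
```

For example, `count_assert_traps server.log` (file name assumed) prints 0 for a clean run and a non-zero count when the Assert was hit.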

...
Core was generated by `postgres: postgres regress_10 [local] SELECT                  '.
Program terminated with signal SIGABRT, Aborted.

warning: Section `.reg-xstate/907918' in core file too small.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140405033059264) at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140405033059264) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140405033059264) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140405033059264, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007fb29a0cc476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007fb29a0b27f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x0000557371bd57bb in ExceptionalCondition (
     conditionName=conditionName@entry=0x557371d56860 "builder->next_phase_at == InvalidTransactionId",
     fileName=fileName@entry=0x557371d572e7 "snapbuild.c", lineNumber=lineNumber@entry=1628) at assert.c:66
#6  0x0000557371a28a29 in SnapBuildSerialize (builder=builder@entry=0x557372879158, lsn=lsn@entry=312723008)
     at snapbuild.c:1628
#7  0x0000557371a2a657 in SnapBuildProcessRunningXacts (builder=builder@entry=0x557372879158, lsn=312723008,
     running=running@entry=0x557373095190) at snapbuild.c:1230
...

If it would be helpful, I can reduce it to concrete SQL queries.

Best regards,
Alexander


