Re: BUG #16696: Backend crash in llvmjit

From Dmitry Marakasov
Subject Re: BUG #16696: Backend crash in llvmjit
Msg-id 20201104212015.GA30304@hades.panopticon
In response to Re: BUG #16696: Backend crash in llvmjit  ("Andres Freund" <andres@anarazel.de>)
Responses Re: BUG #16696: Backend crash in llvmjit  (Dmitry Marakasov <amdmi3@amdmi3.ru>)
Re: BUG #16696: Backend crash in llvmjit  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
* Andres Freund (andres@anarazel.de) wrote:

> > > Environment details:
> > > - FreeBSD 12.1 amd64
> > > - PostgreSQL 13.0 (built from FreeBSD ports)
> > > - llvm-10.0.1 (build from FreeBSD ports)
> > 
> > My bad, it's actually llvm-9.0.1. Multiple llvm versions are installed on
> > the system, and PostgreSQL uses llvm9:
> > 
> > ldd /usr/local/lib/postgresql/llvmjit.so | grep LLVM
> >     libLLVM-9.so => /usr/local/llvm90/lib/libLLVM-9.so (0x800e00000)
> 
> Could you try generating a backtrace after turning jit_debugging_support on? That might give a bit more information.
> 
> I'll check once I'm home whether I can reproduce in my environment.

I did some digging. First of all, I've discovered that the problem
goes away if LLVM bitcode optimization is disabled (by commenting out
the llvm_optimize_module() call).

I've dumped the bitcode and compiled it back to assembly to compare it
with the gdb disassembly of the failing function. It didn't match
perfectly, but this place looked similar:

# %bb.84:                               # %op.32.inputcall
    movq    %rax, 5267(%r13)
    movb    %bl, 5275(%r13)
    movb    $0, 5263(%r13)
    movzbl  (%rax), %esi
    movl    __mb_sb_limit(%rip), %edi
    movq    _ThreadRuneLocale@GOTTPOFF(%rip), %rcx
    movq    %fs:0, %rdx
    movq    (%rdx,%rcx), %rcx
    cmpl    %esi, %edi
    movq    %rax, -96(%rbp)         # 8-byte Spill
    movl    %edi, -72(%rbp)         # 4-byte Spill
    movq    %rcx, -64(%rbp)         # 8-byte Spill
    jle     .LBB1_85

Here's my hypothesis:

The problem happens when the boolin() function is inlined by LLVM.
That function calls isspace() internally, which on FreeBSD is
locale-specific and caches some locale parameters in a thread-local
variable declared as

extern _Thread_local const _RuneLocale *_ThreadRuneLocale;

Execution crashes when it tries to access that thread-local variable,
probably because something related to TLS is not set up properly in/for
LLVM.

I've confirmed this hypothesis by disabling the isspace() calls in
boolin(), which also made the crash go away.

-- 
Dmitry Marakasov   .   55B5 0596 FF1E 8D84 5F56  9510 D35A 80DD F9D2 F77D
amdmi3@amdmi3.ru  ..:              https://github.com/AMDmi3



