Discussion: [patch] Support LLVM 7

[patch] Support LLVM 7

From
Christoph Berg
Date:
LLVM 7 landed in Debian unstable, this patch teaches ./configure to use
it. (General patch, not specific to Debian.)

Christoph

Attachments

Re: [patch] Support LLVM 7

From
Andres Freund
Date:
Hi,

On 2018-09-12 14:45:17 +0200, Christoph Berg wrote:
> LLVM 7 landed in Debian unstable, this patch teaches ./configure to use
> it. (General patch, not specific to Debian.)

Thanks.  Yes, I think we should do that, especially because my patches
to add proper debugging and profiling support only landed in LLVM
7. Therefore I'm planning to add this to both v11 and master.   Unless
somebody protests?

Greetings,

Andres Freund


Re: [patch] Support LLVM 7

From
Christoph Berg
Date:
Re: Andres Freund 2018-09-12 <20180912210338.h3vsss5lkuu26ua2@alap3.anarazel.de>
> Hi,
> 
> On 2018-09-12 14:45:17 +0200, Christoph Berg wrote:
> > LLVM 7 landed in Debian unstable, this patch teaches ./configure to use
> > it. (General patch, not specific to Debian.)
> 
> Thanks.  Yes, I think we should do that, especially because my patches
> to add proper debugging and profiling support only landed in LLVM
> 7. Therefore I'm planning to add this to both v11 and master.   Unless
> somebody protests?

I plan to switch postgresql-11.deb to LLVM 7 over the next days
because of the support for non-x86 architectures, so this should
definitely land in 11.

Christoph


Re: [patch] Support LLVM 7

From
Andres Freund
Date:
On 2018-09-12 23:07:34 +0200, Christoph Berg wrote:
> Re: Andres Freund 2018-09-12 <20180912210338.h3vsss5lkuu26ua2@alap3.anarazel.de>
> > Hi,
> > 
> > On 2018-09-12 14:45:17 +0200, Christoph Berg wrote:
> > > LLVM 7 landed in Debian unstable, this patch teaches ./configure to use
> > > it. (General patch, not specific to Debian.)
> > 
> > Thanks.  Yes, I think we should do that, especially because my patches
> > to add proper debugging and profiling support only landed in LLVM
> > 7. Therefore I'm planning to add this to both v11 and master.   Unless
> > somebody protests?
> 
> I plan to switch postgresql-11.deb to LLVM 7 over the next days
> because of the support for non-x86 architectures, so this should
> definitely land in 11.

Pushed, thanks for the patch!

Greetings,

Andres Freund


Re: [patch] Support LLVM 7

From
Christoph Berg
Date:
Re: To Andres Freund 2018-09-12 <20180912210734.GB5666@msg.df7cb.de>
> I plan to switch postgresql-11.deb to LLVM 7 over the next days
> because of the support for non-x86 architectures

I did an upload of postgresql-11 beta3 with llvm 7 enabled on the
architectures where it is available (or supposed to become available),
that is, on !alpha !hppa !hurd-i386 !ia64 !kfreebsd-amd64 !kfreebsd-i386 !m68k !sh4.
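The negated list above reads as "every architecture except these". A minimal Python sketch of that reading follows; the function name `llvm_enabled` is hypothetical, and it only handles the fully negated form used in this message, not the general Debian architecture-restriction syntax.

```python
def llvm_enabled(arch: str, restriction: str) -> bool:
    """Return True if `arch` is not excluded by a fully negated list.

    Every entry is expected to look like '!name'; mixed lists are
    rejected because this sketch does not model them.
    """
    entries = restriction.split()
    if not all(entry.startswith("!") for entry in entries):
        raise ValueError("only fully negated lists are supported in this sketch")
    excluded = {entry[1:] for entry in entries}
    return arch not in excluded


# The list quoted in the message above.
RESTRICTION = "!alpha !hppa !hurd-i386 !ia64 !kfreebsd-amd64 !kfreebsd-i386 !m68k !sh4"

print(llvm_enabled("sparc64", RESTRICTION))
print(llvm_enabled("m68k", RESTRICTION))
```

Under this reading sparc64, powerpc and x32 are all included in the upload, which matches the build failures reported for them in this thread.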

There are two failures:
https://buildd.debian.org/status/logs.php?pkg=postgresql-11&ver=11~beta3-2

sparc64 fails with a lot of these in the log:

FATAL:  fatal llvm error: Invalid data was encountered while parsing the file

powerpc (the old 32-bit variant) has a lot of "server closed the
connection unexpectedly" in the regression logs, and one SIGILL:

2018-09-15 10:49:25.052 UTC [26458] LOG:  server process (PID 26527) was terminated by signal 4: Illegal instruction
2018-09-15 10:49:25.052 UTC [26458] DETAIL:  Failed process was running: SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
       FROM BOOLTBL1, BOOLTBL2
       WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
2018-09-15 10:49:25.052 UTC [26458] LOG:  terminating any other active server processes

Both smell more like LLVM bugs rather than PostgreSQL, so I guess we
can dub that a success. I'll disable JIT on these architectures for
the next upload.

Christoph


Re: [patch] Support LLVM 7

From
Andres Freund
Date:
Hi,

On 2018-09-16 09:48:34 +0200, Christoph Berg wrote:
> Re: To Andres Freund 2018-09-12 <20180912210734.GB5666@msg.df7cb.de>
> > I plan to switch postgresql-11.deb to LLVM 7 over the next days
> > because of the support for non-x86 architectures
> 
> I did an upload of postgresql-11 beta3 with llvm 7 enabled on the
> architectures where it is available (or supposed to become available),
> that is, on !alpha !hppa !hurd-i386 !ia64 !kfreebsd-amd64 !kfreebsd-i386 !m68k !sh4.

Cool.  No idea why kfreebsd is on that list, but that's not really a
postgres relevant concern...


> There are two failures:
> https://buildd.debian.org/status/logs.php?pkg=postgresql-11&ver=11~beta3-2
> 
> sparc64 fails with a lot of these in the log:
> 
> FATAL:  fatal llvm error: Invalid data was encountered while parsing the file

Hm, that sounds like a proper serious LLVM bug. If it can't read bitcode
files it wrote, something is seriously wrong.


> powerpc (the old 32-bit variant) has a lot of "server closed the
> connection unexpectedly" in the regression logs, and one SIGILL:
> 
> 2018-09-15 10:49:25.052 UTC [26458] LOG:  server process (PID 26527) was terminated by signal 4: Illegal instruction
> 2018-09-15 10:49:25.052 UTC [26458] DETAIL:  Failed process was running: SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
>        FROM BOOLTBL1, BOOLTBL2
>        WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
> 2018-09-15 10:49:25.052 UTC [26458] LOG:  terminating any other active server processes

Hm. Is there any chance to get a backtrace for this one?  This could,
although I think less likely so, also be a postgres issue
(e.g. generating code for the wrong microarch).


> Both smell more like LLVM bugs rather than PostgreSQL, so I guess we
> can dub that a success. I'll disable JIT on these architectures for
> the next upload.

Ok.

Greetings,

Andres Freund


Re: [patch] Support LLVM 7

From
Christoph Berg
Date:
Re: Andres Freund 2018-09-20 <20180919222600.myk5nec6unhrj45k@alap3.anarazel.de>
> > I did an upload of postgresql-11 beta3 with llvm 7 enabled on the
> > architectures where it is available (or supposed to become available),
> > that is, on !alpha !hppa !hurd-i386 !ia64 !kfreebsd-amd64 !kfreebsd-i386 !m68k !sh4.
> 
> Cool.  No idea why kfreebsd is on that list, but that's not really a
> postgres relevant concern...

Because https://buildd.debian.org/status/package.php?p=llvm-toolchain-7
 -> cmake missing on kfreebsd-*
 -> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=905138
 -> maintainer says this should be fixed by cmake upstream first

> > powerpc (the old 32-bit variant) has a lot of "server closed the
> > connection unexpectedly" in the regression logs, and one SIGILL:
> > 
> > 2018-09-15 10:49:25.052 UTC [26458] LOG:  server process (PID 26527) was terminated by signal 4: Illegal instruction
> > 2018-09-15 10:49:25.052 UTC [26458] DETAIL:  Failed process was running: SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
> >        FROM BOOLTBL1, BOOLTBL2
> >        WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
> > 2018-09-15 10:49:25.052 UTC [26458] LOG:  terminating any other active server processes
> 
> Hm. Is there any chance to get a backtrace for this one?  This could,
> although I think less likely so, also be a postgres issue
> (e.g. generating code for the wrong microarch).

I'll see if I can find a porterbox to get a backtrace.


In the meantime, there's a third architecture where llvm itself
compiled, but explodes with PG11 - x32:

  SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
     FROM BOOLTBL1, BOOLTBL2
     WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
! FATAL:  fatal llvm error: Cannot select: 0x57a7ae60: ch,glue = X86ISD::CALL 0x57a7add0, 0x57a7af38, Register:i32 $edi, RegisterMask:Untyped, 0x57a7add0:1
!   0x57a7af38: i32 = X86ISD::Wrapper TargetGlobalAddress:i32<void (%struct.TupleTableSlot*)* @deform_0_1> 0
!     0x57a7aef0: i32 = TargetGlobalAddress<void (%struct.TupleTableSlot*)* @deform_0_1> 0
!   0x57a7ad88: i32 = Register $edi
!   0x57a7ae18: Untyped = RegisterMask
!   0x57a7add0: ch,glue = CopyToReg 0x57a7ad40, Register:i32 $edi, 0x57a7acb0
!     0x57a7ad88: i32 = Register $edi
!     0x57a7acb0: i32,ch = CopyFromReg 0x57a367ac, Register:i32 %27
!       0x57a7ac68: i32 = Register %27
! In function: evalexpr_0_0
! server closed the connection unexpectedly
!     This probably means the server terminated abnormally
!     before or while processing the request.
! connection to server was lost

https://buildd.debian.org/status/fetch.php?pkg=postgresql-11&arch=x32&ver=11~beta3-2&stamp=1537286634&raw=0

Christoph


Re: [patch] Support LLVM 7

From
Christoph Berg
Date:
Re: To Andres Freund 2018-09-20 <20180920081044.GA16897@msg.df7cb.de>
> > > 2018-09-15 10:49:25.052 UTC [26458] DETAIL:  Failed process was running: SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
> > >        FROM BOOLTBL1, BOOLTBL2
> > >        WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
> > > 2018-09-15 10:49:25.052 UTC [26458] LOG:  terminating any other active server processes
> >
> > Hm. Is there any chance to get a backtrace for this one?  This could,
> > although I think less likely so, also be a postgres issue
> > (e.g. generating code for the wrong microarch).
>
> I'll see if I can find a porterbox to get a backtrace.

32-bit powerpc, 11~beta3-2:

postgres=# set jit = off;
SET
postgres=# SELECT
postgres-#             ARRAY(SELECT f.i FROM (
postgres(#                 (SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)
postgres(#                 UNION ALL
postgres(#                 (SELECT d + g.i FROM generate_series(0, 30, 5) d ORDER BY 1)
postgres(#             ) f(i)
postgres(#             ORDER BY f.i LIMIT 10)
postgres-#         FROM generate_series(1, 3) g(i);
            array
------------------------------
 {1,5,6,8,11,11,14,16,17,20}
 {2,6,7,9,12,12,15,17,18,21}
 {3,7,8,10,13,13,16,18,19,22}
(3 rows)

postgres=# set jit = on;
SET
postgres=# SELECT
            ARRAY(SELECT f.i FROM (
                (SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)
                UNION ALL
                (SELECT d + g.i FROM generate_series(0, 30, 5) d ORDER BY 1)
            ) f(i)
            ORDER BY f.i LIMIT 10)
        FROM generate_series(1, 3) g(i);


Program received signal SIGSEGV, Segmentation fault.
0xf4a20c18 in ?? ()
(gdb) bt f
#0  0xf4a20c18 in ?? ()
No symbol table info available.
#1  0xf4a20bc8 in ?? ()
No symbol table info available.
#2  0xf4a41b90 in ExecRunCompiledExpr (state=0x1a7515c, econtext=0x1a73dd0, isNull=0xffe17c2b)
    at ./build/../src/backend/jit/llvm/llvmjit_expr.c:2591
        cstate = <optimized out>
        func = 0xf4a20b5c
#3  0x00c2d39c in ExecEvalExprSwitchContext (isNull=0xffe17c2b, econtext=<optimized out>, state=0x1a7515c)
    at ./build/../src/include/executor/executor.h:303
        retDatum = <optimized out>
        oldContext = 0x19fe830
        retDatum = <optimized out>
        oldContext = <optimized out>
#4  ExecProject (projInfo=0x1a75158) at ./build/../src/include/executor/executor.h:337
        econtext = <optimized out>
        state = 0x1a7515c
        slot = 0x1a750c0
        isnull = 252
        econtext = <optimized out>
        state = <optimized out>
        slot = <optimized out>
        isnull = <optimized out>
#5  ExecScan (node=<optimized out>, accessMtd=accessMtd@entry=0xc3ce50 <FunctionNext>,
    recheckMtd=recheckMtd@entry=0xc3cdf0 <FunctionRecheck>) at ./build/../src/backend/executor/execScan.c:201
        slot = <optimized out>
        econtext = <optimized out>
        qual = 0x0
        projInfo = 0x1a75158
#6  0x00c3ce3c in ExecFunctionScan (pstate=<optimized out>) at ./build/../src/backend/executor/nodeFunctionscan.c:270
        node = <optimized out>
#7  0x00c2b280 in ExecProcNodeFirst (node=0x1a73d48) at ./build/../src/backend/executor/execProcnode.c:445
No locals.
#8  0x00c23058 in ExecProcNode (node=0x1a73d48) at ./build/../src/include/executor/executor.h:237
No locals.
#9  ExecutePlan (execute_once=<optimized out>, dest=0x1a1c218, direction=<optimized out>, numberTuples=<optimized out>,
    sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x1a73d48,
    estate=0x19fe8c0)
    at ./build/../src/backend/executor/execMain.c:1721
        slot = <optimized out>
        current_tuple_count = 0
        slot = <optimized out>
        current_tuple_count = <optimized out>
#10 standard_ExecutorRun (queryDesc=0x1962250, direction=<optimized out>, count=<optimized out>,
    execute_once=<optimized out>)
    at ./build/../src/backend/executor/execMain.c:362
        estate = 0x19fe8c0
        operation = CMD_SELECT
        dest = 0x1a1c218
        sendTuples = <optimized out>
        oldcontext = 0x19621c0
        __func__ = "standard_ExecutorRun"
#11 0x00c23284 in ExecutorRun (queryDesc=queryDesc@entry=0x1962250, direction=direction@entry=ForwardScanDirection,
    count=<optimized out>, execute_once=<optimized out>) at ./build/../src/backend/executor/execMain.c:305
No locals.
#12 0x00dcd3a0 in PortalRunSelect (portal=portal@entry=0x198d290, forward=forward@entry=true, count=0,
    count@entry=2147483647, dest=dest@entry=0x1a1c218) at ./build/../src/backend/tcop/pquery.c:932
        queryDesc = 0x1962250
        direction = <optimized out>
        nprocessed = <optimized out>
        __func__ = "PortalRunSelect"
#13 0x00dcee7c in PortalRun (portal=portal@entry=0x198d290, count=count@entry=2147483647,
    isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x1a1c218,
    altdest=altdest@entry=0x1a1c218,
    completionTag=completionTag@entry=0xffe1800c ".\363W\223\307g0@") at ./build/../src/backend/tcop/pquery.c:773
        save_exception_stack = 0xffe18160
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {-32516271, -1998838, 20072133, 0, 0, -1998838,
              26945520, 27378200, 19171432, 19171820, 2147483647, -1998836, 26483320,
              -1998836, 26792592, 2, 19171840, 26476288, 26792592, 19144432, 19171844,
              671228962, 0 <repeats 36 times>, -1, 27369928, 0, 0, -1, 0 <repeats 49 times>},
            __mask_was_saved = 0, __saved_mask = {__val = {15993932, 19161904, 671228488,
                4292968320, 26456480, 19162012, 26800800, 4292968336, 15993932, 4292968460,
                671228488, 26476288, 26476144, 19164812, 26800800, 4292968368, 16146884,
                19164812, 26483320, 4292968384, 16145016, 2, 0, 17767752, 26483304, 19087032,
                2, 4292968400, 10863184, 19144432, 2, 4292968432}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x1968a40
        saveTopTransactionContext = 0x19b27f0
        saveActivePortal = 0x0
        saveResourceOwner = 0x1968a40
        savePortalContext = 0x0
        saveMemoryContext = 0x19b27f0
        __func__ = "PortalRun"
#14 0x00dca4ec in exec_simple_query (
    query_string=0x193ff00 "SELECT\n", ' ' <repeats 12 times>, "ARRAY(SELECT f.i FROM (\n", ' ' <repeats 16 times>,
    "(SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)\n", ' ' <repeats 16 times>, "UNION ALL\n",
    ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(0"...)
    at ./build/../src/backend/tcop/postgres.c:1122
        parsetree = 0x1941a50
        portal = 0x198d290
        snapshot_set = <optimized out>
        commandTag = <optimized out>
        completionTag = ".\363W\223\307g0@\000\307sl\000\000\000\002\000\000\000\000\000\000\000\002\000\000\000\001\001$\211\374\001\226n8\377\341\201\060\001\223\377\000\001$fh\000\000\001>\377\341\200`\000\364\264$\001$\211", <incomplete sequence \374>
        querytree_list = <optimized out>
        plantree_list = <optimized out>
        receiver = 0x1a1c218
        format = 61
        dest = DestRemote
        oldcontext = 0x19b27f0
        parsetree_list = 0x1941a78
        parsetree_item = 0x1941a68
        save_log_statement_stats = <optimized out>
        was_logged = false
        use_implicit_block = <optimized out>
        msec_str = ".\363W\223\307g0@\000\307sl\000\000\000\002\000\000\000\000\000\000\000\002\000\000\000\001\001$\211", <incomplete sequence \374>
        __func__ = "exec_simple_query"
#15 0x00dcbfcc in PostgresMain (argc=<optimized out>, argv=argv@entry=0x1966e38, dbname=<optimized out>,
    username=<optimized out>)
    at ./build/../src/backend/tcop/postgres.c:4153
        query_string = 0x193ff00 "SELECT\n", ' ' <repeats 12 times>, "ARRAY(SELECT f.i FROM (\n",
            ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)\n",
            ' ' <repeats 16 times>, "UNION ALL\n", ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(0"...
        firstchar = 81
        input_message = {
          data = 0x193ff00 "SELECT\n", ' ' <repeats 12 times>, "ARRAY(SELECT f.i FROM (\n",
            ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)\n",
            ' ' <repeats 16 times>, "UNION ALL\n", ' ' <repeats 16 times>,
            "(SELECT d + g.i FROM generate_series(0"..., len = 318, maxlen = 1024, cursor = 318}
        local_sigjmp_buf = {{__jmpbuf = {-32560447, 14233748, 20054449, 26609600, 19171644,
              19545576, -1997596, 19545448, -1997608, 1537449090, 5, 1537449152, 0, 19171028,
              1, 19171836, 26635832, 26635632, 19171840, 19143524, 19171836, 671097412,
              0 <repeats 40 times>, -1, 0 <repeats 49 times>}, __mask_was_saved = 1,
            __saved_mask = {__val = {0, 0, 4, 4292971016, 15, 0, 4292971016, 4292969328, 0, 0,
                4294967295, 0, 0, 19171648, 4294967295, 19171644, 224, 19143524, 26626880,
                4292969360, 13022924, 0, 4294967295, 0, 80, 4150968980, 26626880, 0,
                13023288, 0, 0, 4292969424}}}}
        send_ready_for_query = false
        disable_idle_in_transaction_timeout = false
        __func__ = "PostgresMain"
#16 0x00d35ebc in BackendRun (port=0x1964b40) at ./build/../src/backend/postmaster/postmaster.c:4361
        ac = 1
        secs = 590764407
        usecs = 199338
        i = 1
        av = 0x1966e38
        maxac = <optimized out>
        av = <optimized out>
        maxac = <optimized out>
        ac = <optimized out>
        secs = <optimized out>
        usecs = <optimized out>
        i = <optimized out>
        __func__ = "BackendRun"
#17 BackendStartup (port=0x1964b40) at ./build/../src/backend/postmaster/postmaster.c:4033
        bn = 0x19607c0
        pid = <optimized out>
        bn = <optimized out>
        pid = <optimized out>
        __func__ = "BackendStartup"
        save_errno = <optimized out>
#18 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1706
        port = 0x1964b40
        i = <optimized out>
        rmask = {fds_bits = {16, 0 <repeats 31 times>}}
        selres = <optimized out>
        now = <optimized out>
        readmask = {fds_bits = {24, 0 <repeats 31 times>}}
        nSockets = 5
        last_lockfile_recheck_time = 1537449152
        last_touch_time = 1537449090
        __func__ = "ServerLoop"
#19 0x00d36bdc in PostmasterMain (argc=<optimized out>, argv=<optimized out>)
    at ./build/../src/backend/postmaster/postmaster.c:1379
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = <optimized out>
        i = <optimized out>
        output_config_variable = <optimized out>
        __func__ = "PostmasterMain"
#20 0x00a4e050 in main (argc=5, argv=0x193b0e0) at ./build/../src/backend/main/main.c:228
No locals.


Christoph


Re: [patch] Support LLVM 7

From
Andres Freund
Date:
On 2018-09-20 15:18:14 +0200, Christoph Berg wrote:
> Re: To Andres Freund 2018-09-20 <20180920081044.GA16897@msg.df7cb.de>
> > > > 2018-09-15 10:49:25.052 UTC [26458] DETAIL:  Failed process was running: SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
> > > >        FROM BOOLTBL1, BOOLTBL2
> > > >        WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
> > > > 2018-09-15 10:49:25.052 UTC [26458] LOG:  terminating any other active server processes
> > > 
> > > Hm. Is there any chance to get a backtrace for this one?  This could,
> > > although I think less likely so, also be a postgres issue
> > > (e.g. generating code for the wrong microarch).
> > 
> > I'll see if I can find a porterbox to get a backtrace.

Hm, this is pretty helpful.  Sorry to ask, but could you a) turn on
jit_debugging_support (connection start) b) jit_dump_bitcode.

Then reproduce again.  After that, it'd be helpful to get:
1) /proc/cpuinfo
2) the "newest" *.bc file from the data directory
3) a backtrace
4) gdb disassemble at the point of the error.
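Item 2 in the list above asks for the newest *.bc file in the data directory. A small Python sketch of one way to pick it, assuming "newest" means the most recent modification time; the helper name `newest_bitcode` is made up for illustration.

```python
from pathlib import Path
from typing import Optional


def newest_bitcode(data_dir: str) -> Optional[Path]:
    """Return the most recently modified *.bc file directly inside
    data_dir, or None if the directory holds no such file."""
    candidates = list(Path(data_dir).glob("*.bc"))
    if not candidates:
        return None
    # Assumption: "newest" is judged by modification time.
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

Calling `newest_bitcode` with the server's data directory right after reproducing the crash should return the file written for the failing query, provided no other backend dumped bitcode in between.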

Greetings,

Andres Freund


Re: [patch] Support LLVM 7

From
Andres Freund
Date:
Hi,

On 2018-09-20 10:10:44 +0200, Christoph Berg wrote:
> In the meantime, there's a third architecture where llvm itself
> compiled, but explodes with PG11 - x32:
> 
>   SELECT '' AS tf_12, BOOLTBL1.*, BOOLTBL2.*
>      FROM BOOLTBL1, BOOLTBL2
>      WHERE BOOLTBL2.f1 <> BOOLTBL1.f1;
> ! FATAL:  fatal llvm error: Cannot select: 0x57a7ae60: ch,glue = X86ISD::CALL 0x57a7add0, 0x57a7af38, Register:i32 $edi, RegisterMask:Untyped, 0x57a7add0:1
> !   0x57a7af38: i32 = X86ISD::Wrapper TargetGlobalAddress:i32<void (%struct.TupleTableSlot*)* @deform_0_1> 0
> !     0x57a7aef0: i32 = TargetGlobalAddress<void (%struct.TupleTableSlot*)* @deform_0_1> 0
> !   0x57a7ad88: i32 = Register $edi
> !   0x57a7ae18: Untyped = RegisterMask
> !   0x57a7add0: ch,glue = CopyToReg 0x57a7ad40, Register:i32 $edi, 0x57a7acb0
> !     0x57a7ad88: i32 = Register $edi
> !     0x57a7acb0: i32,ch = CopyFromReg 0x57a367ac, Register:i32 %27
> !       0x57a7ac68: i32 = Register %27
> ! In function: evalexpr_0_0
> ! server closed the connection unexpectedly
> !     This probably means the server terminated abnormally
> !     before or while processing the request.
> ! connection to server was lost
> 
> https://buildd.debian.org/status/fetch.php?pkg=postgresql-11&arch=x32&ver=11~beta3-2&stamp=1537286634&raw=0

That's pretty clearly an LLVM bug. Could you enable jit_dump_bitcode and
send the bitcode files (I assume there should be something like
<pid>.<generation>.bc and the same with <prefix>.optimized.bc) from the
data directory?
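The naming scheme mentioned above is itself stated as a guess ("something like"). Taking it at face value, a hedged Python sketch for sorting dumped files into unoptimized and optimized variants could look like this; the pattern and the `BitcodeFile` type are assumptions based only on that description, not on the PostgreSQL source.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: <pid>.<generation>.bc and <pid>.<generation>.optimized.bc
_BC_NAME = re.compile(
    r"^(?P<pid>\d+)\.(?P<generation>\d+)(?P<optimized>\.optimized)?\.bc$"
)


class BitcodeFile(NamedTuple):
    pid: int
    generation: int
    optimized: bool


def parse_bitcode_name(name: str) -> Optional[BitcodeFile]:
    """Split a dump file name into its parts, or return None when the
    name does not follow the assumed pattern."""
    match = _BC_NAME.match(name)
    if match is None:
        return None
    return BitcodeFile(
        pid=int(match["pid"]),
        generation=int(match["generation"]),
        optimized=match["optimized"] is not None,
    )


print(parse_bitcode_name("14414.3.bc"))
print(parse_bitcode_name("14414.3.optimized.bc"))
```

Filtering on the pid reported by `pg_backend_pid()` then narrows the dump to the session that crashed.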

Not that I think x32 is a particularly popular database platform, but
LLVM clearly needs to be fixed independent of PG...

Greetings,

Andres Freund


Re: [patch] Support LLVM 7

From
Christoph Berg
Date:
Re: Andres Freund 2018-09-20 <20180920173009.ywi5grbotl7um65p@alap3.anarazel.de>
> Hm, this is pretty helpful.  Sorry to ask, but could you a) turn on
> jit_debugging_support (connection start) b) jit_dump_bitcode.
> 
> Then reproduce again.  After that, it'd be helpful to get:
> 1) /proc/cpuinfo
> 2) the "newest" *.bc file from the data directory
> 3) a backtrace
> 4) gdb disassemble at the point of the error.

postgres=# show jit_debugging_support ;
 jit_debugging_support
-----------------------
 on
(1 row)

postgres=# set jit = on;
SET
postgres=# set jit_dump_bitcode = on;
SET
postgres=# select pg_backend_pid();
 pg_backend_pid
----------------
          14414
(1 row)

postgres=# SELECT
            ARRAY(SELECT f.i FROM (
                (SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)
                UNION ALL
                (SELECT d + g.i FROM generate_series(0, 30, 5) d ORDER BY 1)
            ) f(i)
            ORDER BY f.i LIMIT 10)
        FROM generate_series(1, 3) g(i);

(gdb) f 0
#0  0xf4a20c14 in evalexpr_0_15 ()
(gdb) disassemble
Dump of assembler code for function evalexpr_0_15:
   0xf4a20b58 <+0>:     mflr    r0
   0xf4a20b5c <+4>:     stw     r0,4(r1)
   0xf4a20b60 <+8>:     stwu    r1,-48(r1)
   0xf4a20b64 <+12>:    mr      r6,r4
   0xf4a20b68 <+16>:    mr      r7,r3
   0xf4a20b6c <+20>:    addi    r8,r3,8
   0xf4a20b70 <+24>:    addi    r9,r3,5
   0xf4a20b74 <+28>:    lwz     r4,4(r4)
   0xf4a20b78 <+32>:    lwz     r3,12(r3)
   0xf4a20b7c <+36>:    lwz     r10,28(r3)
   0xf4a20b80 <+40>:    lwz     r3,32(r3)
   0xf4a20b84 <+44>:    stw     r4,44(r1)
   0xf4a20b88 <+48>:    stw     r9,40(r1)
   0xf4a20b8c <+52>:    stw     r5,36(r1)
   0xf4a20b90 <+56>:    stw     r6,32(r1)
   0xf4a20b94 <+60>:    stw     r7,28(r1)
   0xf4a20b98 <+64>:    stw     r8,24(r1)
   0xf4a20b9c <+68>:    stw     r10,20(r1)
   0xf4a20ba0 <+72>:    stw     r3,16(r1)
   0xf4a20ba4 <+76>:    b       0xf4a20ba8 <evalexpr_0_15+80>
   0xf4a20ba8 <+80>:    lwz     r3,44(r1)
   0xf4a20bac <+84>:    lwz     r4,24(r3)
   0xf4a20bb0 <+88>:    cmplwi  r4,0
   0xf4a20bb4 <+92>:    bne     0xf4a20bc8 <evalexpr_0_15+112>
   0xf4a20bb8 <+96>:    b       0xf4a20bbc <evalexpr_0_15+100>
   0xf4a20bbc <+100>:   lwz     r3,44(r1)
   0xf4a20bc0 <+104>:   bl      0xf4a20c50 <deform_0_16>
   0xf4a20bc4 <+108>:   b       0xf4a20bc8 <evalexpr_0_15+112>
   0xf4a20bc8 <+112>:   lis     r3,423
   0xf4a20bcc <+116>:   ori     r4,r3,16744
   0xf4a20bd0 <+120>:   lwz     r3,28(r1)
   0xf4a20bd4 <+124>:   lwz     r5,32(r1)
   0xf4a20bd8 <+128>:   stfd    f5,1(r16)
   0xf4a20bdc <+132>:   b       0xf4a20be0 <evalexpr_0_15+136>
   0xf4a20be0 <+136>:   lwz     r3,24(r1)
   0xf4a20be4 <+140>:   lwz     r3,0(r3)
   0xf4a20be8 <+144>:   lwz     r4,40(r1)
   0xf4a20bec <+148>:   lbz     r5,0(r4)
   0xf4a20bf0 <+152>:   lwz     r6,20(r1)
   0xf4a20bf4 <+156>:   lwz     r7,16(r1)
   0xf4a20bf8 <+160>:   stb     r5,0(r7)
   0xf4a20bfc <+164>:   cmplwi  r5,0
   0xf4a20c00 <+168>:   stw     r3,12(r1)
   0xf4a20c04 <+172>:   stw     r6,8(r1)
   0xf4a20c08 <+176>:   bne     0xf4a20c24 <evalexpr_0_15+204>
   0xf4a20c0c <+180>:   b       0xf4a20c10 <evalexpr_0_15+184>
   0xf4a20c10 <+184>:   lwz     r3,12(r1)
=> 0xf4a20c14 <+188>:   .long 0xae800001
   0xf4a20c18 <+192>:   lwz     r4,8(r1)
   0xf4a20c1c <+196>:   stw     r3,0(r4)
   0xf4a20c20 <+200>:   b       0xf4a20c24 <evalexpr_0_15+204>
   0xf4a20c24 <+204>:   lwz     r3,24(r1)
   0xf4a20c28 <+208>:   lwz     r3,0(r3)
   0xf4a20c2c <+212>:   lwz     r4,40(r1)
   0xf4a20c30 <+216>:   lbz     r5,0(r4)
   0xf4a20c34 <+220>:   clrlwi  r5,r5,31
   0xf4a20c38 <+224>:   lwz     r6,36(r1)
   0xf4a20c3c <+228>:   stb     r5,0(r6)
   0xf4a20c40 <+232>:   lwz     r0,52(r1)
   0xf4a20c44 <+236>:   addi    r1,r1,48
   0xf4a20c48 <+240>:   mtlr    r0
   0xf4a20c4c <+244>:   blr
End of assembler dump.
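gdb prints the faulting word at +188 as raw data (`.long 0xae800001`) because it cannot disassemble it. As an aid for reading it, here is a small Python sketch that splits a 32-bit PowerPC word into the usual D-form fields (6-bit primary opcode, two 5-bit register fields, signed 16-bit displacement). Treating this word as D-form is an assumption, and `decode_d_form` is a hypothetical helper.

```python
def decode_d_form(word: int) -> dict:
    """Split a 32-bit PowerPC instruction word into D-form fields."""
    displacement = word & 0xFFFF
    if displacement & 0x8000:
        # Sign-extend the 16-bit displacement.
        displacement -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,  # primary opcode, top 6 bits
        "rt": (word >> 21) & 0x1F,      # target/source register field
        "ra": (word >> 16) & 0x1F,      # base register field
        "d": displacement,
    }


print(decode_d_form(0xAE800001))
```

For this word the fields come out as opcode 43, rt 20, ra 0, d 1. As far as I can tell, the Power ISA assigns primary opcode 43 to `lhau`, an update-form load for which a base register field of 0 is an invalid form. That would fit both gdb's refusal to decode it and the earlier "Illegal instruction" report, but it is only a reading of the bits, not something confirmed in this thread.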

Program received signal SIGSEGV, Segmentation fault.
0xf4a20c14 in evalexpr_0_15 ()
(gdb) bt f
#0  0xf4a20c14 in evalexpr_0_15 ()
No symbol table info available.
#1  0xf4a41b90 in ExecRunCompiledExpr (state=0x1a740bc, econtext=0x1a72e60, isNull=0xffe17c2b)
    at ./build/../src/backend/jit/llvm/llvmjit_expr.c:2591
        cstate = <optimized out>
        func = 0xf4a20b58 <evalexpr_0_15>
#2  0x00c2d39c in ExecEvalExprSwitchContext (isNull=0xffe17c2b, econtext=<optimized out>,
    state=0x1a740bc) at ./build/../src/include/executor/executor.h:303
        retDatum = <optimized out>
        oldContext = 0x1a06cc0
        retDatum = <optimized out>
        oldContext = <optimized out>
#3  ExecProject (projInfo=0x1a740b8) at ./build/../src/include/executor/executor.h:337
        econtext = <optimized out>
        state = 0x1a740bc
        slot = 0x1a74020
        isnull = 252
        econtext = <optimized out>
        state = <optimized out>
        slot = <optimized out>
        isnull = <optimized out>
#4  ExecScan (node=<optimized out>, accessMtd=accessMtd@entry=0xc3ce50 <FunctionNext>,
    recheckMtd=recheckMtd@entry=0xc3cdf0 <FunctionRecheck>)
    at ./build/../src/backend/executor/execScan.c:201
        slot = <optimized out>
        econtext = <optimized out>
        qual = 0x0
        projInfo = 0x1a740b8
#5  0x00c3ce3c in ExecFunctionScan (pstate=<optimized out>)
    at ./build/../src/backend/executor/nodeFunctionscan.c:270
        node = <optimized out>
#6  0x00c2b280 in ExecProcNodeFirst (node=0x1a72dd8)
    at ./build/../src/backend/executor/execProcnode.c:445
No locals.
#7  0x00c23058 in ExecProcNode (node=0x1a72dd8)
    at ./build/../src/include/executor/executor.h:237
No locals.
#8  ExecutePlan (execute_once=<optimized out>, dest=0x1a6a878, direction=<optimized out>,
    numberTuples=<optimized out>, sendTuples=<optimized out>, operation=CMD_SELECT,
    use_parallel_mode=<optimized out>, planstate=0x1a72dd8, estate=0x1a06d50)
    at ./build/../src/backend/executor/execMain.c:1721
        slot = <optimized out>
        current_tuple_count = 0
        slot = <optimized out>
        current_tuple_count = <optimized out>
#9  standard_ExecutorRun (queryDesc=0x1960d50, direction=<optimized out>,
    count=<optimized out>, execute_once=<optimized out>)
    at ./build/../src/backend/executor/execMain.c:362
        estate = 0x1a06d50
        operation = CMD_SELECT
        dest = 0x1a6a878
        sendTuples = <optimized out>
        oldcontext = 0x1960cc0
        __func__ = "standard_ExecutorRun"
#10 0x00c23284 in ExecutorRun (queryDesc=queryDesc@entry=0x1960d50,
    direction=direction@entry=ForwardScanDirection, count=<optimized out>,
    execute_once=<optimized out>) at ./build/../src/backend/executor/execMain.c:305
No locals.
#11 0x00dcd3a0 in PortalRunSelect (portal=portal@entry=0x198d890, forward=forward@entry=true,
    count=0, count@entry=2147483647, dest=dest@entry=0x1a6a878)
    at ./build/../src/backend/tcop/pquery.c:932
        queryDesc = 0x1960d50
        direction = <optimized out>
        nprocessed = <optimized out>
        __func__ = "PortalRunSelect"
#12 0x00dcee7c in PortalRun (portal=portal@entry=0x198d890, count=count@entry=2147483647,
    isTopLevel=isTopLevel@entry=true, run_once=run_once@entry=true, dest=dest@entry=0x1a6a878,
    altdest=altdest@entry=0x1a6a878,
    completionTag=completionTag@entry=0xffe1800c ".\363W\223\307g0@")
    at ./build/../src/backend/tcop/pquery.c:773
        save_exception_stack = 0xffe18160
        save_context_stack = 0x0
        local_sigjmp_buf = {{__jmpbuf = {-32516271, -1998838, 20072133, 0, 0, -1998838,
              27249760, 27699320, 19171432, 19171820, 2147483647, -1998836, 26483320,
              -1998836, 26794128, 2, 19171840, 26476288, 26794128, 19144432, 19171844,
              671228962, 0 <repeats 36 times>, -1, 27691048, 0, 0, -1, 0 <repeats 49 times>},
            __mask_was_saved = 0, __saved_mask = {__val = {15993932, 19161904, 671228488,
                4292968320, 26456480, 19162012, 26802336, 4292968336, 15993932, 4292968460,
                671228488, 26476288, 26476144, 19164812, 26802336, 4292968368, 16146884,
                19164812, 26483320, 4292968384, 16145016, 2, 0, 17767752, 26483304, 19087032,
                2, 4292968400, 10863184, 19144432, 2, 4292968432}}}}
        result = <optimized out>
        nprocessed = <optimized out>
        saveTopTransactionResourceOwner = 0x1968c88
        saveTopTransactionContext = 0x19fcc60
        saveActivePortal = 0x0
        saveResourceOwner = 0x1968c88
        savePortalContext = 0x0
        saveMemoryContext = 0x19fcc60
        __func__ = "PortalRun"
#13 0x00dca4ec in exec_simple_query (
    query_string=0x193ff00 "SELECT\n", ' ' <repeats 12 times>, "ARRAY(SELECT f.i FROM (\n", ' ' <repeats 16 times>,
    "(SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)\n", ' ' <repeats 16 times>, "UNION ALL\n",
    ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(0"...)
    at ./build/../src/backend/tcop/postgres.c:1122
        parsetree = 0x1941a50
        portal = 0x198d890
        snapshot_set = <optimized out>
        commandTag = <optimized out>
        completionTag = ".\363W\223\307g0@\000\307sl\000\000\000\002\000\000\000\001\000\000\000\001\000\000\000\001\001$\211\374\001\226tP\377\341\201\060\001\223\377\000\001$fh\000\000\001>\377\341\200`\000\364\264$\001$\211", <incomplete sequence \374>
        querytree_list = <optimized out>
        plantree_list = <optimized out>
        receiver = 0x1a6a878
        format = 61
        dest = DestRemote
        oldcontext = 0x19fcc60
        parsetree_list = 0x1941a78
        parsetree_item = 0x1941a68
        save_log_statement_stats = <optimized out>
        was_logged = false
        use_implicit_block = <optimized out>
        msec_str = ".\363W\223\307g0@\000\307sl\000\000\000\002\000\000\000\001\000\000\000\001\000\000\000\001\001$\211", <incomplete sequence \374>
        __func__ = "exec_simple_query"
#14 0x00dcbfcc in PostgresMain (argc=<optimized out>, argv=argv@entry=0x1967450,
    dbname=<optimized out>, username=<optimized out>)
    at ./build/../src/backend/tcop/postgres.c:4153
        query_string = 0x193ff00 "SELECT\n", ' ' <repeats 12 times>, "ARRAY(SELECT f.i FROM (\n", ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)\n", ' ' <repeats 16 times>, "UNION ALL\n", ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(0"...
        firstchar = 81
        input_message = {
          data = 0x193ff00 "SELECT\n", ' ' <repeats 12 times>, "ARRAY(SELECT f.i FROM (\n", ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(4, 30, 3) d ORDER BY 1)\n", ' ' <repeats 16 times>, "UNION ALL\n", ' ' <repeats 16 times>, "(SELECT d + g.i FROM generate_series(0"..., len = 318, maxlen = 1024, cursor = 318}
        local_sigjmp_buf = {{__jmpbuf = {-32560447, 14233748, 20054449, 26612304, 19171644,
              19545576, -1997596, 19545448, -1997608, 1537473521, 5, 1537473887, 0, 19171028,
              1, 19171836, 26637392, 26637168, 19171840, 19143524, 19171836, 671097412,
              0 <repeats 40 times>, -1, 0 <repeats 49 times>}, __mask_was_saved = 1,
            __saved_mask = {__val = {0, 0, 4, 4292971016, 15, 0, 4292971016, 4292969328, 0, 0,
                4294967295, 0, 0, 19171648, 4294967295, 19171644, 224, 19143524, 26628416,
                4292969360, 13022924, 0, 4294967295, 0, 116, 4150968980, 26628416, 0,
                13023288, 0, 0, 4292969424}}}}
        send_ready_for_query = false
        disable_idle_in_transaction_timeout = false
        __func__ = "PostgresMain"
#15 0x00d35ebc in BackendRun (port=0x1965140)
    at ./build/../src/backend/postmaster/postmaster.c:4361
        ac = 1
        secs = 590789147
        usecs = 972771
        i = 1
        av = 0x1967450
        maxac = <optimized out>
        av = <optimized out>
        maxac = <optimized out>
        ac = <optimized out>
        secs = <optimized out>
        usecs = <optimized out>
        i = <optimized out>
        __func__ = "BackendRun"
#16 BackendStartup (port=0x1965140) at ./build/../src/backend/postmaster/postmaster.c:4033
        bn = 0x1961250
        pid = <optimized out>
        bn = <optimized out>
        pid = <optimized out>
        __func__ = "BackendStartup"
        save_errno = <optimized out>
#17 ServerLoop () at ./build/../src/backend/postmaster/postmaster.c:1706
        port = 0x1965140
        i = <optimized out>
        rmask = {fds_bits = {16, 0 <repeats 31 times>}}
        selres = <optimized out>
        now = <optimized out>
        readmask = {fds_bits = {24, 0 <repeats 31 times>}}
        nSockets = 5
        last_lockfile_recheck_time = 1537473887
        last_touch_time = 1537473521
        __func__ = "ServerLoop"
#18 0x00d36bdc in PostmasterMain (argc=<optimized out>, argv=<optimized out>)
    at ./build/../src/backend/postmaster/postmaster.c:1379
        opt = <optimized out>
        status = <optimized out>
        userDoption = <optimized out>
        listen_addr_saved = <optimized out>
        i = <optimized out>
        output_config_variable = <optimized out>
        __func__ = "PostmasterMain"
#19 0x00a4e050 in main (argc=5, argv=0x193b0e0) at ./build/../src/backend/main/main.c:228
No locals.


$ cat /proc/cpuinfo
processor       : 0
cpu             : POWER8 (architected), altivec supported
clock           : 3425.000000MHz
revision        : 2.1 (pvr 004b 0201)
...
processor       : 23
cpu             : POWER8 (architected), altivec supported
clock           : 3425.000000MHz
revision        : 2.1 (pvr 004b 0201)

timebase        : 512000000
platform        : pSeries
model           : IBM,8284-22A
machine         : CHRP IBM,8284-22A
MMU             : Hash


I'll leave the session open for a while if you have more questions.

Christoph

Attachments

Re: [patch] Support LLVM 7

From
Christoph Berg
Date:
Re: Andres Freund 2018-09-20 <20180920173238.f5idtzdlpkjsufv5@alap3.anarazel.de>
> That's pretty clearly an LLVM bug. Could you enable jit_dump_bitcode and
> send the bitcode files (I assume there should be something like
> <pid>.<generation>.bc and the same with <prefix>.optimized.bc) from the
> data directory?
> 
> Not that I think x32 is a particularly popular database platform, but
> LLVM clearly needs to be fixed independent of PG...

$ PGOPTIONS="-c jit_debugging_support=on" psql
psql (11beta4 (Debian 11~beta4-2))
postgres=# set jit=on;
postgres=# set jit_dump_bitcode = on;
postgres=# \i /home/cbe/postgresql/debian/11/src/test/regress/sql/boolean.sql 

FATAL:  fatal llvm error: Cannot select: 0x580bfdf0: ch,glue = X86ISD::CALL 0x580bfd60, 0x580bfec8, Register:i32 $edi, RegisterMask:Untyped, 0x580bfd60:1
  0x580bfec8: i32 = X86ISD::Wrapper TargetGlobalAddress:i32<void (%struct.TupleTableSlot*)* @deform_0_1> 0
    0x580bfe80: i32 = TargetGlobalAddress<void (%struct.TupleTableSlot*)* @deform_0_1> 0
  0x580bfd18: i32 = Register $edi
  0x580bfda8: Untyped = RegisterMask
  0x580bfd60: ch,glue = CopyToReg 0x580bfcd0, Register:i32 $edi, 0x580bfc40
    0x580bfd18: i32 = Register $edi
    0x580bfc40: i32,ch = CopyFromReg 0x5807eeec, Register:i32 %27
      0x580bfbf8: i32 = Register %27
In function: evalexpr_0_0
server closed the connection unexpectedly

gdb reports "exited with code 01" at that point.
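
For gathering the dumps requested above, here is a minimal sketch. It assumes the files follow the <pid>.<generation>.bc and <pid>.<generation>.optimized.bc naming that Andres guesses at, and that they sit directly in the data directory. The function name collect_bitcode is hypothetical and only serves this illustration.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming, taken from the guess in the quoted mail:
# "<pid>.<generation>.bc" and "<pid>.<generation>.optimized.bc".
BITCODE_NAME = re.compile(r"^(\d+)\.(\d+)(\.optimized)?\.bc$")


def collect_bitcode(data_dir):
    """Group bitcode dumps found directly in data_dir by (pid, generation)."""
    dumps = defaultdict(dict)
    for path in sorted(Path(data_dir).iterdir()):
        match = BITCODE_NAME.match(path.name)
        if match is None:
            continue  # not a bitcode dump, e.g. postmaster.pid
        pid, generation, optimized = match.groups()
        kind = "optimized" if optimized else "unoptimized"
        dumps[(int(pid), int(generation))][kind] = path
    return dict(dumps)
```

Calling collect_bitcode on the data directory returns one entry per (pid, generation) pair, which makes it easy to see whether both the unoptimized and the optimized module were written before the backend died.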

Christoph


Re: [patch] Support LLVM 7

From
Christoph Berg
Date:
Re: To Andres Freund 2018-09-20 <20180920210315.GB21756@msg.df7cb.de>
> server closed the connection unexpectedly

Something ate the attachments. Sorry.

FATAL:  fatal llvm error: Cannot select: 0x57e61d40: ch,glue = X86ISD::CALL 0x57e61cb0, 0x57e61e18, Register:i32 $edi, RegisterMask:Untyped, 0x57e61cb0:1
  0x57e61e18: i32 = X86ISD::Wrapper TargetGlobalAddress:i32<void (%struct.TupleTableSlot*)* @deform_0_1> 0
    0x57e61dd0: i32 = TargetGlobalAddress<void (%struct.TupleTableSlot*)* @deform_0_1> 0
  0x57e61c68: i32 = Register $edi
  0x57e61cf8: Untyped = RegisterMask
  0x57e61cb0: ch,glue = CopyToReg 0x57e61c20, Register:i32 $edi, 0x57e61b90
    0x57e61c68: i32 = Register $edi
    0x57e61b90: i32,ch = CopyFromReg 0x57e1fd3c, Register:i32 %27
      0x57e61b48: i32 = Register %27
In function: evalexpr_0_0
server closed the connection unexpectedly

Christoph

Attachments

Re: [patch] Support LLVM 7

From
Andres Freund
Date:
On 2018-09-20 23:08:04 +0200, Christoph Berg wrote:
> Re: To Andres Freund 2018-09-20 <20180920210315.GB21756@msg.df7cb.de>
> > server closed the connection unexpectedly
> 
> Something ate the attachments. Sorry.
> 
> FATAL:  fatal llvm error: Cannot select: 0x57e61d40: ch,glue = X86ISD::CALL 0x57e61cb0, 0x57e61e18, Register:i32 $edi, RegisterMask:Untyped, 0x57e61cb0:1
>   0x57e61e18: i32 = X86ISD::Wrapper TargetGlobalAddress:i32<void (%struct.TupleTableSlot*)* @deform_0_1> 0
>     0x57e61dd0: i32 = TargetGlobalAddress<void (%struct.TupleTableSlot*)* @deform_0_1> 0
>   0x57e61c68: i32 = Register $edi
>   0x57e61cf8: Untyped = RegisterMask
>   0x57e61cb0: ch,glue = CopyToReg 0x57e61c20, Register:i32 $edi, 0x57e61b90
>     0x57e61c68: i32 = Register $edi
>     0x57e61b90: i32,ch = CopyFromReg 0x57e1fd3c, Register:i32 %27
>       0x57e61b48: i32 = Register %27
> In function: evalexpr_0_0
> server closed the connection unexpectedly

This looks like a reported LLVM bug: https://bugs.llvm.org/show_bug.cgi?id=34268

I tried to ping a few people involved in x32 on the LLVM list - not sure
if that has much of a chance. FWIW, it doesn't just appear to be
relevant for JIT, but also outside of it:
https://bugs.llvm.org/show_bug.cgi?id=36743
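
When comparing a failure like the one quoted above against existing bug reports, the two useful fields are the opcode that could not be selected and the function being compiled. A minimal sketch for pulling them out of a server log follows. It assumes the message layout matches the excerpt in this thread, and the helper name summarize_isel_failure is hypothetical.

```python
import re

# Matches e.g. "Cannot select: 0x57e61d40: ch,glue = X86ISD::CALL 0x57e61cb0, ..."
CANNOT_SELECT = re.compile(r"Cannot select: \S+: .*? = (?P<opcode>[\w:]+)")
# Matches the trailing "In function: evalexpr_0_0" line.
IN_FUNCTION = re.compile(r"^In function: (?P<function>\S+)", re.MULTILINE)


def summarize_isel_failure(log_text):
    """Return the opcode and function of a 'Cannot select' error, or None."""
    selected = CANNOT_SELECT.search(log_text)
    if selected is None:
        return None
    function = IN_FUNCTION.search(log_text)
    return {
        "opcode": selected.group("opcode"),
        "function": function.group("function") if function else None,
    }
```

For the log in this thread, the extracted opcode is X86ISD::CALL and the function is evalexpr_0_0.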

Greetings,

Andres Freund