Consistent segfault in complex query

Поиск
Список
Период
Сортировка
От Kyle Samson
Тема Consistent segfault in complex query
Дата
Msg-id A033A40A-B234-4324-BE37-272279F7B627@tripadvisor.com
обсуждение исходный текст
Ответы Re: Consistent segfault in complex query  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Список pgsql-hackers
Hello,

We encountered a query that has been able to frequently segfault one of our postgres instances under certain conditions
whichwe have not fully been able to isolate for reproduction. We were able to get a core dump out of one of the crashes
andhave poked at it, but we believe the answer is beyond our knowledge of postgres internals. This is on a 9.3.19
serverand we saw no mention of a fix in the release notes since this version and we do not know if it affects later
majorreleases as well. 

What we know so far is that somehow the arraycontains function was given a datum of 0 as the second argument that
dereferencedto a null pointer. Our current hypothesis from poking at the core dump is that some memory context is
gettingfreed before it should. This assumption comes from the complexity in the query (CTE containing params being
repeatedlyevaluated by multiple case statements) and the unpredictability of the failure case. 

The issue is easily avoidable, and we have asked the developer to solve their problem differently.  However, the
existenceof a segfault is always concerning and we are reporting this issue in an effort to be conscientiousness
membersof the community. 

Due to the potentially sensitive contents we cannot provide the core directly, but we are happy to run commands against
thecore file to extract debugging information. We have also replaced certain values (database name, table name, column
name)with generic identifiers. 



                                                    version
----------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.19 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-18), 64-bit
(1 row)

Query:

    WITH tmp(foo2, bar2, baz2, minutes2) AS
    (
      SELECT ARRAY[$1 ::text], ARRAY[$2 ::text], ARRAY[$3 ::text], $4 ::double precision
    )

    UPDATE table_name
    SET occurrences = CASE
                      WHEN time_last_noticed >= current_timestamp - (SELECT minutes2 FROM tmp) * interval '1 minutes'
                        THEN occurrences + 1
                      ELSE 1
                      END,
        foo = CASE
              WHEN  time_last_noticed < current_timestamp - (SELECT minutes2 FROM tmp) * interval '1 minutes'
                THEN (SELECT foo2 FROM tmp)
              WHEN foo @> (SELECT foo2 FROM tmp)
                THEN foo
              ELSE array_cat(foo, (SELECT foo2 FROM tmp))
              END,
        bar = CASE
                WHEN  time_last_noticed < current_timestamp - (SELECT minutes2 FROM tmp) * interval '1 minutes'
                  THEN (SELECT bar2 FROM tmp)
                WHEN bar @> (SELECT bar2 FROM tmp)
                  THEN bar
                ELSE array_cat(bar, (SELECT bar2 FROM tmp))
                END,
        baz = CASE
                     WHEN  time_last_noticed < current_timestamp - (SELECT minutes2 FROM tmp) * interval '1 minutes'
                      THEN (SELECT baz2 FROM tmp)
                     WHEN baz @> (SELECT baz2 FROM tmp)
                      THEN baz
                     ELSE array_cat(baz, (SELECT baz2 FROM tmp))
                     END,
        time_last_noticed = current_timestamp,
        total_occurrences = total_occurrences + 1
    WHERE id1 = $5 AND id2 = $6;

Table schema:

                             Table "public.table_name"
          Column       |            Type             | Modifiers | Storage  | Stats target | Description
    -------------------+-----------------------------+-----------+----------+--------------+-------------
     id1               | text                        | not null  | extended |              |
     id2               | text                        | not null  | extended |              |
     occurrences       | integer                     | not null  | plain    |              |
     time_last_noticed | timestamp without time zone | not null  | plain    |              |
     total_occurrences | integer                     | not null  | plain    |              |
     bar               | text[]                      |           | extended |              |
     baz               | text[]                      |           | extended |              |
     foo               | text[]                      |           | extended |              |
    Indexes:
        "table_name_pkey" PRIMARY KEY, btree (id1, id2)

Back trace from the dump:

    (gdb) bt
    #0  pg_detoast_datum (datum=0x0) at fmgr.c:2241
    #1  0x000000000067cd90 in arraycontains (fcinfo=0x2c56e30) at arrayfuncs.c:3841
    #2  0x000000000058ba25 in ExecMakeFunctionResultNoSets (fcache=0x2c56dc0, econtext=0x2c500c0, isNull=0x7ffc571d9a3f
"",isDone=<value optimized out>) at execQual.c:2027 
    #3  0x0000000000587375 in ExecEvalCase (caseExpr=0x2c54a30, econtext=0x2c500c0, isNull=0x2c60967 "",
isDone=0x2c60c4c)at execQual.c:2985 
    #4  0x00000000005878c3 in ExecTargetList (projInfo=<value optimized out>, isDone=0x7ffc571d9b0c) at execQual.c:5322
    #5  ExecProject (projInfo=<value optimized out>, isDone=0x7ffc571d9b0c) at execQual.c:5537
    #6  0x000000000058dd62 in ExecScan (node=0x2c4ffb0, accessMtd=0x599340 <IndexNext>, recheckMtd=0x5992f0
<IndexRecheck>)at execScan.c:207 
    #7  0x0000000000586e28 in ExecProcNode (node=0x2c4ffb0) at execProcnode.c:404
    #8  0x0000000000584472 in EvalPlanQualNext (epqstate=<value optimized out>) at execMain.c:2366
    #9  0x0000000000584b47 in EvalPlanQual (estate=0x2bdd240, epqstate=0x2c0d638, relation=<value optimized out>,
rti=1,lockmode=<value optimized out>, tid=0x7ffc571d9c60, priorXmax=2185728490) at execMain.c:1951 
    #10 0x000000000059d7cb in ExecUpdate (node=0x2c0d598) at nodeModifyTable.c:727
    #11 ExecModifyTable (node=0x2c0d598) at nodeModifyTable.c:997
    #12 0x0000000000586e78 in ExecProcNode (node=0x2c0d598) at execProcnode.c:377
    #13 0x0000000000585a82 in ExecutePlan (queryDesc=0x2bdd000, direction=<value optimized out>, count=0) at
execMain.c:1488
    #14 standard_ExecutorRun (queryDesc=0x2bdd000, direction=<value optimized out>, count=0) at execMain.c:318
    #15 0x00007f2e5f3b358b in explain_ExecutorRun (queryDesc=0x2bdd000, direction=ForwardScanDirection, count=0) at
auto_explain.c:231
    #16 0x00007f2e5f1af495 in pgss_ExecutorRun (queryDesc=0x2bdd000, direction=ForwardScanDirection, count=0) at
pg_stat_statements.c:716
    #17 0x000000000066553f in ProcessQuery (plan=0x2be32b0, sourceText=0x2be7fe0 "WITH tmp(foo2, bar2, baz2, minutes2)
AS( SELECT ARRAY[$1 ::text], ARRAY[$2 ::text], ARRAY[$3 ::text], $4 ::double precision ) UPDATE table_name SET
occurrences= CAS"..., params=0x2bdce30, dest=<value optimized out>, completionTag=0x7ffc571da2f0 "") at pquery.c:185 
    #18 0x000000000066576f in PortalRunMulti (portal=0x2bd2530, isTopLevel=1 '\001', dest=0xb1d380, altdest=0xb1d380,
completionTag=0x7ffc571da2f0"") at pquery.c:1275 
    #19 0x0000000000665e32 in PortalRun (portal=0x2bd2530, count=9223372036854775807, isTopLevel=1 '\001',
dest=0x2ba30b0,altdest=0x2ba30b0, completionTag=0x7ffc571da2f0 "") at pquery.c:812 
    #20 0x000000000066409d in exec_execute_message (argc=<value optimized out>, argv=<value optimized out>,
dbname=0x2965280"database_name", username=<value optimized out>) at postgres.c:1958 
    #21 PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x2965280 "database_name",
username=<valueoptimized out>) at postgres.c:4154 
    #22 0x000000000061df38 in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4194
    #23 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:3857
    #24 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1702
    #25 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1326
    #26 0x00000000005b92f8 in main (argc=5, argv=0x2964470) at main.c:234

fcinfo contents from the arraycontains in frame #1

    (gdb) print *fcinfo
    $1 = {flinfo = 0x2c56de0, context = 0x0, resultinfo = 0x0, fncollation = 100, isnull = 0 '\000', nargs = 2, arg =
{46431788,0 <repeats 99 times>}, argnull = '\000' <repeats 99 times>} 

- Kyle Samson

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Fabien COELHO
Дата:
Сообщение: Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Следующее
От: Tom Lane
Дата:
Сообщение: Re: [Patch] Create a new session in postmaster by calling setsid()