missing ML message (Re: v12.0: segfault in reindex CONCURRENTLY)

Поиск
Список
Период
Сортировка
От Justin Pryzby
Тема missing ML message (Re: v12.0: segfault in reindex CONCURRENTLY)
Дата
Msg-id 20191013152052.GE4475@telsasoft.com
обсуждение исходный текст
Ответы Re: missing ML message (Re: v12.0: segfault in reindex CONCURRENTLY)  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-www
I was going to add a link to this thread in a commit message for our internal
project's repo, but found that (even 2.5 hours later) the HTTP archives don't
show my last message; I'd expect it to be here:
https://www.postgresql.org/message-id/flat/20191012004446.GT10470%40telsasoft.com

Quoted message follows, sent with headers like so:

From pryzby@telsasoft.com Sun Oct 13 08:03:21 2019
                                                              
 
Date: Sun, 13 Oct 2019 08:03:21 -0500
                                                              
 
From: Justin Pryzby <pryzby@telsasoft.com>
                                                              
 
To: Michael Paquier <michael@paquier.xyz>
                                                              
 
Cc: pgsql-hackers@lists.postgresql.org,
                                                              
 
        Peter Eisentraut <peter.eisentraut@2ndquadrant.com>,
                                                              
 
        Andreas Karlsson <andreas@proxel.se>,
                                                              
 
        Alvaro Herrera <alvherre@alvh.no-ip.org>
                                                              
 
Subject: Re: v12.0: segfault in reindex CONCURRENTLY
                                                              
 
Message-ID: <20191013130321.GD4475@telsasoft.com>
                                                              
 
References: <20191012004446.GT10470@telsasoft.com>
                                                              
 
        <20191013090643.GA1434@paquier.xyz>
                                                              
 
MIME-Version: 1.0
                                                              
 
Content-Type: text/plain; charset=us-ascii
                                                              
 
Content-Disposition: inline
                                                              
 
In-Reply-To: <20191013090643.GA1434@paquier.xyz>
                                                              
 

On Sun, Oct 13, 2019 at 08:03:21AM -0500, Justin Pryzby wrote:
> On Sun, Oct 13, 2019 at 06:06:43PM +0900, Michael Paquier wrote:
> > On Fri, Oct 11, 2019 at 07:44:46PM -0500, Justin Pryzby wrote:
> > > Unfortunately, there was no core file, and I'm still trying to reproduce it.
> > 
> > Forgot to set ulimit -c?  Having a backtrace would surely help.
> 
> Fortunately (?) another server hit crashed last night.
> (Doesn't appear to be relevant, but this table has no inheritence/partition-ness).
> 
> Looks like it's a race condition and dereferencing *holder=NULL.  The first
> crash was probably the same bug, due to report query running during "reindex
> CONCURRENTLY", and probably finished at nearly the same time as another locker.
> 
> Relevant code introduced here:
> 
> commit ab0dfc961b6a821f23d9c40c723d11380ce195a6
> Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date:   Tue Apr 2 15:18:08 2019 -0300
> 
>     Report progress of CREATE INDEX operations
> 
> Needs to be conditionalized (as anticipated by the comment)
> 
> +            if (holder)
>                                 pgstat_progress_update_param(PROGRESS_WAITFOR_CURRENT_PID,
>                                                                                          holder->pid);
> 
> 
> Core was generated by `postgres: postgres ts [local] REINDEX                  '.
> Program terminated with signal 11, Segmentation fault.
> 
> #0  WaitForLockersMultiple (locktags=locktags@entry=0x1d30548, lockmode=lockmode@entry=5,
progress=progress@entry=true)at lmgr.c:911
 
> #1  0x00000000005c2ac8 in ReindexRelationConcurrently (relationOid=relationOid@entry=17618, options=options@entry=0)
atindexcmds.c:3090
 
> #2  0x00000000005c328a in ReindexIndex (indexRelation=<optimized out>, options=0, concurrent=<optimized out>) at
indexcmds.c:2352
> #3  0x00000000007657fe in standard_ProcessUtility (pstmt=pstmt@entry=0x1d05468,
queryString=queryString@entry=0x1d046e0"REINDEX INDEX CONCURRENTLY loaded_cdr_files_filename",
 
>     context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0, queryEnv=queryEnv@entry=0x0,
dest=dest@entry=0x1d05548,
>     completionTag=completionTag@entry=0x7ffc05e6c0a0 "") at utility.c:787
> #4  0x00007f21517204ef in pgss_ProcessUtility (pstmt=0x1d05468, queryString=0x1d046e0 "REINDEX INDEX CONCURRENTLY
loaded_cdr_files_filename",context=PROCESS_UTILITY_TOPLEVEL,
 
>     params=0x0, queryEnv=0x0, dest=0x1d05548, completionTag=0x7ffc05e6c0a0 "") at pg_stat_statements.c:1006
> #5  0x0000000000762816 in PortalRunUtility (portal=0x1d7a4e0, pstmt=0x1d05468, isTopLevel=<optimized out>,
setHoldSnapshot=<optimizedout>, dest=0x1d05548,
 
>     completionTag=0x7ffc05e6c0a0 "") at pquery.c:1175
> #6  0x0000000000763267 in PortalRunMulti (portal=portal@entry=0x1d7a4e0, isTopLevel=isTopLevel@entry=true,
setHoldSnapshot=setHoldSnapshot@entry=false,dest=dest@entry=0x1d05548,
 
>     altdest=altdest@entry=0x1d05548, completionTag=completionTag@entry=0x7ffc05e6c0a0 "") at pquery.c:1328
> #7  0x0000000000763e45 in PortalRun (portal=<optimized out>, count=9223372036854775807, isTopLevel=<optimized out>,
run_once=<optimizedout>, dest=0x1d05548, altdest=0x1d05548,
 
>     completionTag=0x7ffc05e6c0a0 "") at pquery.c:796
> #8  0x000000000075ff45 in exec_simple_query (query_string=<optimized out>) at postgres.c:1215
> #9  0x0000000000761212 in PostgresMain (argc=<optimized out>, argv=<optimized out>, dbname=<optimized out>,
username=<optimizedout>) at postgres.c:4236
 
> #10 0x0000000000483d02 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4431
> #11 BackendStartup (port=0x1d2b340) at postmaster.c:4122
> #12 ServerLoop () at postmaster.c:1704
> #13 0x00000000006f0b1f in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x1cff280) at postmaster.c:1377
> #14 0x0000000000484c93 in main (argc=3, argv=0x1cff280) at main.c:228
> 
> bt f
> 
> #0  WaitForLockersMultiple (locktags=locktags@entry=0x1d30548, lockmode=lockmode@entry=5,
progress=progress@entry=true)at lmgr.c:911
 
>         holder = 0x0
>         lockholders = 0x1d9b778
>         holders = <optimized out>
>         lc = 0x1d9bf80
>         total = <optimized out>
>         done = 1
> #1  0x00000000005c2ac8 in ReindexRelationConcurrently (relationOid=relationOid@entry=17618, options=options@entry=0)
atindexcmds.c:3090
 
>         heapRelationIds = 0x1d30360
>         indexIds = 0x1d303b0
>         newIndexIds = <optimized out>
>         relationLocks = <optimized out>
>         lockTags = <optimized out>
>         lc = 0x0
>         lc2 = 0x0
>         private_context = <optimized out>
>         oldcontext = <optimized out>
>         relkind = 105 'i'
>         relationName = 0x0
>         relationNamespace = 0x0
>         ru0 = {tv = {tv_sec = 30592544, tv_usec = 7232025}, ru = {ru_utime = {tv_sec = 281483566645394, tv_usec =
75668733820930},ru_stime = {tv_sec = 0, tv_usec = 30592272}, { 
>               ru_maxrss = 0, __ru_maxrss_word = 0}, {ru_ixrss = 0, __ru_ixrss_word = 0}, {ru_idrss = 105,
__ru_idrss_word= 105}, {ru_isrss = -926385342574214656, 
 
>               __ru_isrss_word = -926385342574214656}, {ru_minflt = 8924839, __ru_minflt_word = 8924839}, {ru_majflt =
0,__ru_majflt_word = 0}, {ru_nswap = 17618, 
 
>               __ru_nswap_word = 17618}, {ru_inblock = 139781327898864, __ru_inblock_word = 139781327898864},
{ru_oublock= 30430312, __ru_oublock_word = 30430312}, {
 
>               ru_msgsnd = 139781327898864, __ru_msgsnd_word = 139781327898864}, {ru_msgrcv = 139781327898864,
__ru_msgrcv_word= 139781327898864}, {ru_nsignals = 139781327898864, 
 
>               __ru_nsignals_word = 139781327898864}, {ru_nvcsw = 139781327898864, __ru_nvcsw_word = 139781327898864},
{ru_nivcsw= 139781327898864, 
 
>               __ru_nivcsw_word = 139781327898864}}}
>         __func__ = "ReindexRelationConcurrently"
> #2  0x00000000005c328a in ReindexIndex (indexRelation=<optimized out>, options=0, concurrent=<optimized out>) at
indexcmds.c:2352
>         state = {concurrent = true, locked_table_oid = 17608}
>         indOid = 17618
>         irel = <optimized out>
>         persistence = 112 'p'
> ...

-- 
Justin Pryzby
System Administrator
Telsasoft
+1-952-707-8581



В списке pgsql-www по дате отправления:

Предыдущее
От: Rodrigo Ramírez Norambuena
Дата:
Сообщение: Re: PGweb: Patches and tests
Следующее
От: Tom Lane
Дата:
Сообщение: Re: missing ML message (Re: v12.0: segfault in reindex CONCURRENTLY)