Обсуждение: BUG #12918: Segfault in BackendIdGetTransactionIds
The following bug has been logged on the website:

Bug reference: 12918
Logged by: Vladimir
Email address: root@simply.name
PostgreSQL version: 9.4.1
Operating system: RHEL 6.6

Description:

Hello. After upgrading from 9.3.6 to 9.4.1 (both installed from packages on yum.postgresql.org) we have started getting segfaults in different backends. Backtraces of all coredumps look similar:

(gdb) bt
#0 0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
#1 0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
#2 0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
#3 0x00000000006f9d5a in pg_stat_get_db_numbackends (fcinfo=<value optimized out>) at pgstatfuncs.c:1080
#4 0x000000000059c345 in ExecMakeFunctionResultNoSets (fcache=0x1f4c270, econtext=0x1f4bbe0, isNull=0x1f5e588 "", isDone=<value optimized out>) at execQual.c:2023
#5 0x00000000005981a3 in ExecTargetList (projInfo=<value optimized out>, isDone=0x0) at execQual.c:5304
#6 ExecProject (projInfo=<value optimized out>, isDone=0x0) at execQual.c:5519
#7 0x00000000005a458d in advance_aggregates (aggstate=0x1f4bdc0, pergroup=0x1f5e380) at nodeAgg.c:556
#8 0x00000000005a4da5 in agg_retrieve_direct (node=<value optimized out>) at nodeAgg.c:1223
#9 ExecAgg (node=<value optimized out>) at nodeAgg.c:1115
#10 0x0000000000597638 in ExecProcNode (node=0x1f4bdc0) at execProcnode.c:476
#11 0x0000000000596252 in ExecutePlan (queryDesc=0x1eae6d0, direction=<value optimized out>, count=0) at execMain.c:1486
#12 standard_ExecutorRun (queryDesc=0x1eae6d0, direction=<value optimized out>, count=0) at execMain.c:319
#13 0x0000000000686797 in PortalRunSelect (portal=0x1ea5660, forward=<value optimized out>, count=0, dest=<value optimized out>) at pquery.c:946
#14 0x00000000006879c1 in PortalRun (portal=0x1ea5660, count=9223372036854775807, isTopLevel=1 '\001', dest=0x1f5a528, altdest=0x1f5a528, completionTag=0x7fff277b3b80 "") at pquery.c:790
#15 0x000000000068404e in exec_simple_query (query_string=0x1e989d0 "SELECT sum(numbackends) FROM pg_stat_database;") at postgres.c:1072
#16 0x00000000006856c8 in PostgresMain (argc=<value optimized out>, argv=<value optimized out>, dbname=0x1e7f398 "postgres", username=<value optimized out>) at postgres.c:4074
#17 0x0000000000632d7d in BackendRun (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:4155
#18 BackendStartup (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:3829
#19 ServerLoop (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1597
#20 PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1244
#21 0x00000000005cadb8 in main (argc=3, argv=0x1e7e5e0) at main.c:228
(gdb)

Unfortunately, I can't give a clear sequence of steps to reproduce the problem; the segfaults happen at quite random times and under random workloads :( So I'm trying to reproduce it on a testing stand where PostgreSQL is built with the --enable-debug flag to give you more information (but still no luck for the last two weeks). The common conditions are:
1. it happens only on master hosts (never on any of the streaming replicas);
2. it happens on simple queries to pg_catalog or system views, as shown in the backtrace above;
3. it happens only with direct connections to PostgreSQL (production queries go through pgbouncer, and no coredumps contain production queries).
And until now it has happened only with python-psycopg2 (we have tried versions 2.5.3-1.rhel6 with postgresql93-libs, and 2.5.4-1.rhel6 and 2.6-1.rhel6 with postgresql94-libs). I've asked about it on the psycopg list [0], but it doesn't seem to be a client problem.

[0] http://www.postgresql.org/message-id/flat/CA+mi_8a246TK6YBLzf_7c5sc+XuiMaGafG0mhrFbp6Nq+SQt3w@mail.gmail.com#CA+mi_8a246TK6YBLzf_7c5sc+XuiMaGafG0mhrFbp6Nq+SQt3w@mail.gmail.com
root@simply.name writes:
> After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
> yum.postgresql.org) we have started getting segfaults of different backends.
> Backtraces of all coredumps look similar:
> (gdb) bt
> #0 0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
> optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
> #1 0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
> #2 0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
It supposes that there are no inactive entries in the sinval array
within the range 0 .. lastBackend. But there can be, in which case
dereferencing stateP->proc crashes. The reason it's hard to reproduce
is the relatively narrow window between where pgstat_read_current_status
saw the backend as active and where we're inspecting its sinval entry.
regards, tom lane
> On 30 March 2015, at 19:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
> It supposes that there are no inactive entries in the sinval array
> within the range 0 .. lastBackend. But there can be, in which case
> dereferencing stateP->proc crashes. The reason it's hard to reproduce
> is the relatively narrow window between where pgstat_read_current_status
> saw the backend as active and where we're inspecting its sinval entry.

I've also tried to revert dd1a3bcc, where this function appeared, but couldn't do it :( If you would be able to make a build without this commit (if that is easier than fixing it the right way), I could install it on several production hosts to test it.

> regards, tom lane

--
May the force be with you…
https://simply.name
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> root@simply.name writes:
> > After upgrading from 9.3.6 to 9.4.1 (both installed from packages on
> > yum.postgresql.org) we have started getting segfaults of different backends.
> > Backtraces of all coredumps look similar:
> > (gdb) bt
> > #0 0x000000000066bf9b in BackendIdGetTransactionIds (backendID=<value
> > optimized out>, xid=0x7f2a1b714798, xmin=0x7f2a1b71479c) at sinvaladt.c:426
> > #1 0x00000000006287f4 in pgstat_read_current_status () at pgstat.c:2871
> > #2 0x0000000000628879 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
>
> Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
> It supposes that there are no inactive entries in the sinval array
> within the range 0 .. lastBackend. But there can be, in which case
> dereferencing stateP->proc crashes. The reason it's hard to reproduce
> is the relatively narrow window between where pgstat_read_current_status
> saw the backend as active and where we're inspecting its sinval entry.
As an immediate short-term workaround, from what I can tell,
disabling calls to pg_stat_activity and pg_stat_database (views), and
pg_stat_get_activity, pg_stat_get_backend_idset, and
pg_stat_get_db_numbackends (functions) should prevent triggering this
bug.
These are likely being run by a monitoring system (e.g. check_postgres
from Nagios).
Thanks!
Stephen
> On 30 March 2015, at 19:44, Stephen Frost <sfrost@snowman.net> wrote:
>
> As an immediate short-term workaround, from what I can tell,
> disabling calls to pg_stat_activity and pg_stat_database (views), and
> pg_stat_get_activity, pg_stat_get_backend_idset, and
> pg_stat_get_db_numbackends (functions) should prevent triggering this
> bug.

I suppose pg_stat_replication should not be queried either. We have already done that on the most critical databases, but it is hard to be blind :(

> These are likely being run by a monitoring system (e.g. check_postgres
> from Nagios).
>
> Thanks!
>
> Stephen

--
May the force be with you…
https://simply.name
* Vladimir Borodin (root@simply.name) wrote:
> > On 30 March 2015, at 19:44, Stephen Frost <sfrost@snowman.net> wrote:
> >
> > As an immediate short-term workaround, from what I can tell,
> > disabling calls to pg_stat_activity and pg_stat_database (views), and
> > pg_stat_get_activity, pg_stat_get_backend_idset, and
> > pg_stat_get_db_numbackends (functions) should prevent triggering this
> > bug.
>
> I suppose pg_stat_replication should not be queried either. We have already
> done that on the most critical databases, but it is hard to be blind :(
Ah, yes, not sure where I dropped that; it was in my initial list but
didn't make it into the final email.
I would expect a fix to be included in the next point release, hopefully
released in the next couple of months.
Thanks!
Stephen
* Vladimir Borodin (root@simply.name) wrote:
> > On 30 March 2015, at 19:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Hmm ... looks to me like BackendIdGetTransactionIds is simply busted.
> > It supposes that there are no inactive entries in the sinval array
> > within the range 0 .. lastBackend. But there can be, in which case
> > dereferencing stateP->proc crashes. The reason it's hard to reproduce
> > is the relatively narrow window between where pgstat_read_current_status
> > saw the backend as active and where we're inspecting its sinval entry.
>
> I've also tried to revert dd1a3bcc, where this function appeared, but
> couldn't do it :( If you would be able to make a build without this
> commit (if that is easier than fixing it the right way), I could install
> it on several production hosts to test it.
Hopefully a fix will be forthcoming shortly. Reverting it won't work
though, no, as it included a catalog bump.
Thanks,
Stephen
Vladimir Borodin <root@simply.name> writes:
> I've also tried to revert dd1a3bcc, where this function appeared, but couldn't do it :( If you would be able to
make a build without this commit (if it is easier than fixing it the right way), I could install it on several production
hosts to test it.
Try this.
regards, tom lane
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 81b85c0..a2fde89 100644
*** a/src/backend/storage/ipc/sinvaladt.c
--- b/src/backend/storage/ipc/sinvaladt.c
*************** BackendIdGetProc(int backendID)
*** 403,411 ****
  void
  BackendIdGetTransactionIds(int backendID, TransactionId *xid, TransactionId *xmin)
  {
-     ProcState  *stateP;
      SISeg      *segP = shmInvalBuffer;
-     PGXACT     *xact;
  
      *xid = InvalidTransactionId;
      *xmin = InvalidTransactionId;
--- 403,409 ----
*************** BackendIdGetTransactionIds(int backendID
*** 415,425 ****
  
      if (backendID > 0 && backendID <= segP->lastBackend)
      {
!         stateP = &segP->procState[backendID - 1];
!         xact = &ProcGlobal->allPgXact[stateP->proc->pgprocno];
! 
!         *xid = xact->xid;
!         *xmin = xact->xmin;
      }
  
      LWLockRelease(SInvalWriteLock);
--- 413,428 ----
  
      if (backendID > 0 && backendID <= segP->lastBackend)
      {
!         ProcState  *stateP = &segP->procState[backendID - 1];
!         PGPROC     *proc = stateP->proc;
! 
!         if (proc != NULL)
!         {
!             PGXACT     *xact = &ProcGlobal->allPgXact[proc->pgprocno];
! 
!             *xid = xact->xid;
!             *xmin = xact->xmin;
!         }
      }
  
      LWLockRelease(SInvalWriteLock);
> On 30 March 2015, at 20:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Vladimir Borodin <root@simply.name> writes:
>> I've also tried to revert dd1a3bcc, where this function appeared, but
>> couldn't do it :( If you would be able to make a build without this
>> commit (if that is easier than fixing it the right way), I could install
>> it on several production hosts to test it.
>
> Try this.

38 minutes from a bug report to a patch with a fix! You are fantastic. Thanks.

It compiles and passes 'make check' and 'make check-world' (I think you have checked it, but just in case...). I've built a package and installed it on one host. If everything is ok, tomorrow I will install it on several hosts and slowly further. The problem reproduces on our number of hosts approximately once a week. If the problem disappears, I will let you know in a couple of weeks.

Thanks again.

> regards, tom lane

--
May the force be with you…
https://simply.name
On Mon, 30 Mar 2015 13:00:01 -0400
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Vladimir Borodin <root@simply.name> writes:
> > I've also tried to revert dd1a3bcc, where this function appeared, but
> > couldn't do it :( If you would be able to make a build without this
> > commit (if that is easier than fixing it the right way), I could install
> > it on several production hosts to test it.
>
> Try this.
Nice to see a patch, in advance of need ;-) Thanks!
We have had a couple of segfaults recently, but once we enabled core files it
stopped happening. Until just now. I can build with the
patch, but if a 9.4.2 is imminent it would be nice to know before
scheduling an extra round of downtimes.
This is apparently from a python trigger calling get_app_name(). I
can provide the rest of the stack if it would be useful.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000066148b in BackendIdGetTransactionIds (backendID=<value optimized out>, xid=0x7f5d56ae1598, xmin=0x7f5d56ae159c) at sinvaladt.c:426
426 sinvaladt.c: No such file or directory.
in sinvaladt.c
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.149.el6_6.5.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0 0x000000000066148b in BackendIdGetTransactionIds (backendID=<value optimized out>, xid=0x7f5d56ae1598, xmin=0x7f5d56ae159c) at sinvaladt.c:426
#1 0x000000000061f064 in pgstat_read_current_status () at pgstat.c:2871
#2 0x000000000061f0e9 in pgstat_fetch_stat_numbackends () at pgstat.c:2342
#3 0x00000000006ef373 in pg_stat_get_activity (fcinfo=0x7fffd2e78f50) at pgstatfuncs.c:591
#4 0x00000000005977ec in ExecMakeTableFunctionResult (funcexpr=0x17fdae0, econtext=0x17fd770, argContext=<value optimized out>, expectedDesc=0x17ffd70, randomAccess=0 '\000') at execQual.c:2193
#5 0x00000000005a91f2 in FunctionNext (node=0x17fd660) at nodeFunctionscan.c:95
#6 0x00000000005982ce in ExecScanFetch (node=0x17fd660, accessMtd=0x5a8f40 <FunctionNext>, recheckMtd=0x5a8870 <FunctionRecheck>) at execScan.c:82
#7 ExecScan (node=0x17fd660, accessMtd=0x5a8f40 <FunctionNext>, recheckMtd=0x5a8870 <FunctionRecheck>) at execScan.c:167
#8 0x00000000005913c8 in ExecProcNode (node=0x17fd660) at execProcnode.c:426
#9 0x000000000058ff32 in ExecutePlan (queryDesc=0x17f81f0, direction=<value optimized out>, count=1) at execMain.c:1486
#10 standard_ExecutorRun (queryDesc=0x17f81f0, direction=<value optimized out>, count=1) at execMain.c:319
#11 0x00007f69a7d3867b in explain_ExecutorRun (queryDesc=0x17f81f0, direction=ForwardScanDirection, count=1) at auto_explain.c:243
#12 0x00007f69a7b33965 in pgss_ExecutorRun (queryDesc=0x17f81f0, direction=ForwardScanDirection, count=1) at pg_stat_statements.c:873
#13 0x000000000059bd6c in postquel_getnext (fcinfo=<value optimized out>) at functions.c:853
#14 fmgr_sql (fcinfo=<value optimized out>) at functions.c:1148
#15 0x0000000000595f85 in ExecMakeFunctionResultNoSets (fcache=0x17ed920, econtext=0x17ed730, isNull=0x17ee2a8 " ", isDone=<value optimized out>) at execQual.c:2023
#16 0x0000000000591e53 in ExecTargetList (projInfo=<value optimized out>, isDone=0x7fffd2e798fc) at execQual.c:5304
#17 ExecProject (projInfo=<value optimized out>, isDone=0x7fffd2e798fc) at execQual.c:5519
#18 0x00000000005a98fb in ExecResult (node=0x17ed620) at nodeResult.c:155
#19 0x0000000000591478 in ExecProcNode (node=0x17ed620) at execProcnode.c:373
#20 0x000000000058ff32 in ExecutePlan (queryDesc=0x166c610, direction=<value optimized out>, count=0) at execMain.c:1486
#21 standard_ExecutorRun (queryDesc=0x166c610, direction=<value optimized out>, count=0) at execMain.c:319
#22 0x00007f69a7d3867b in explain_ExecutorRun (queryDesc=0x166c610, direction=ForwardScanDirection, count=0) at auto_explain.c:243
#23 0x00007f69a7b33965 in pgss_ExecutorRun (queryDesc=0x166c610, direction=ForwardScanDirection, count=0) at pg_stat_statements.c:873
#24 0x00000000005b39d0 in _SPI_pquery (plan=0x7fffd2e79d10, paramLI=0x0, snapshot=<value optimized out>, crosscheck_snapshot=0x0, read_only=0 '\000', fire_triggers=1 '\001', tcount=0) at spi.c:2372
#25 _SPI_execute_plan (plan=0x7fffd2e79d10, paramLI=0x0, snapshot=<value optimized out>, crosscheck_snapshot=0x0, read_only=0 '\000', fire_triggers=1 '\001', tcount=0) at spi.c:2160
#26 0x00000000005b4076 in SPI_execute (src=0x15f6054 "SELECT get_app_name() AS a", read_only=0 '\000', tcount=0) at spi.c:386
#27 0x00007f5d5672f702 in PLy_spi_execute_query (query=0x15f6054 "SELECT get_app_name() AS a", limit=0) at plpy_spi.c:357
-dg
--=20
David Gould 510 282 0869 daveg@sonic.net
If simplicity worked, the world would be overrun with insects.
David Gould <daveg@sonic.net> writes:
> We have had a couple segfaults recently but once we enabled core files it
> stopped happening. Until just now. I can build with the
> patch, but if a 9.4.2 is imminent it would be nice to know before
> scheduling an extra round of downtimes.
No plans for an imminent 9.4.2. There's been some discussion about a set
of releases in May; the only way something happens sooner than that is
if we find a staggeringly-bad bug.
regards, tom lane
> On 30 March 2015, at 20:54, Vladimir Borodin <root@simply.name> wrote:
>
>> On 30 March 2015, at 20:00, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Try this.
>
> 38 minutes from a bug report to a patch with a fix! You are fantastic.
> Thanks.
>
> It compiles and passes 'make check' and 'make check-world' (I think you
> have checked it, but just in case...). I've built a package and installed
> it on one host. If everything is ok, tomorrow I will install it on several
> hosts and slowly further. The problem reproduces on our number of hosts
> approximately once a week. If the problem disappears, I will let you know
> in a couple of weeks.

No segfaults for more than a week since I've upgraded all hosts. Seems that the patch is good. Thank you very much.

--
May the force be with you…
https://simply.name