Discussion: We've broken something in error recovery
In a somewhat misguided attempt to test something else, I did this in
CVS HEAD:
do $$
begin
  for i in 1 .. 10000 loop
    execute 'create table t' || i::text || ' (f1 int primary key)';
  end loop;
end$$;
This ran for a while and then ran out of lock table space, which was
not surprising in hindsight:
ERROR: out of shared memory
HINT: You might need to increase max_locks_per_transaction.
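For a rough sense of why the loop exhausts the lock table, here is a back-of-the-envelope sketch in C. It assumes the sizing rule given in the PostgreSQL documentation, where the shared lock table has room for about max_locks_per_transaction * (max_connections + max_prepared_transactions) objects. The function names are hypothetical, and the estimate of two locked objects per iteration (the table plus its primary-key index) is an assumption, not a measurement from this report.

```c
/* Approximate number of lockable objects the shared lock table is sized
 * for, using the documented formula. Hypothetical helper, not a
 * PostgreSQL function. */
static long lock_table_capacity(long max_locks_per_transaction,
                                long max_connections,
                                long max_prepared_transactions)
{
    return max_locks_per_transaction *
           (max_connections + max_prepared_transactions);
}

/* Assumed lock demand of the DO block: every created table and its
 * primary-key index stay locked until the single enclosing transaction
 * ends, so the demand grows with each iteration. */
static long locks_needed(long tables_created)
{
    return tables_created * 2;
}

/* Nonzero if the assumed demand exceeds the estimated capacity. */
static int exceeds_lock_table(long tables_created,
                              long max_locks_per_transaction,
                              long max_connections,
                              long max_prepared_transactions)
{
    return locks_needed(tables_created) >
           lock_table_capacity(max_locks_per_transaction,
                               max_connections,
                               max_prepared_transactions);
}
```

With illustrative settings of 64, 100 and 0 (assumed values, not taken from the report), the estimated capacity is 6400 objects against an assumed demand of 20000, so the failure partway through the loop is expected.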
But what was surprising was what happened next: the autovac launcher
immediately crashed.
TRAP: FailedAssertion("!(nestLevel > 0 && nestLevel <= GUCNestLevel)", File: "guc.c", Line: 3907)
LOG: autovacuum launcher process (PID 25220) was terminated by signal 6
Stack trace looks like
#4 0x4e85b4 in ExceptionalCondition (
conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
lineNumber=3907) at assert.c:57
#5 0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
#6 0x20618c in AbortTransaction () at xact.c:2194
#7 0x20688c in AbortCurrentTransaction () at xact.c:2568
#8 0x3b0f84 in AutoVacLauncherMain (argc=2063670312, argv=0x7b03b94c) at autovacuum.c:491
#9 0x3b0bd8 in StartAutoVacLauncher () at autovacuum.c:371
Haven't dug any deeper yet --- who's touched this code lately?
regards, tom lane
Tom Lane wrote:
> #4 0x4e85b4 in ExceptionalCondition (
> conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
> errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
> lineNumber=3907) at assert.c:57
> #5 0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
> #6 0x20618c in AbortTransaction () at xact.c:2194
This looks like maybe a corrupted stack - the args to AtEOXact_GUC at
that location in xact.c are hardwired.
cheers
andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> #5 0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
> This looks like maybe a corrupted stack - the args to AtEOXact_GUC at
> that location in xact.c are hardwired.
No, that's just a fairly typical behavior of debugging with -O greater
than zero --- the registers holding those parameter values got recycled
for something else. This is a rather old version of gdb and it doesn't
always print <<value optimized away>> when it should.
regards, tom lane
I wrote:
> #4 0x4e85b4 in ExceptionalCondition (
> conditionName=0x1ac4ac "!(nestLevel > 0 && nestLevel <= GUCNestLevel)",
> errorType=0x1abf44 "FailedAssertion", fileName=0x1abee4 "guc.c",
> lineNumber=3907) at assert.c:57
> #5 0x501f48 in AtEOXact_GUC (isCommit=-86 'ª', nestLevel=84) at guc.c:3907
> #6 0x20618c in AbortTransaction () at xact.c:2194
> #7 0x20688c in AbortCurrentTransaction () at xact.c:2568
> #8 0x3b0f84 in AutoVacLauncherMain (argc=2063670312, argv=0x7b03b94c)
> at autovacuum.c:491
On investigation I think that Assert may just be overenthusiastic.
The problem is that StartTransaction is failing at
VirtualXactLockTableInsert, for lack of any shared memory to acquire
the lock with; and then we try to do AbortTransaction and GUC is
unhappy because it's not been initialized yet. So this isn't a
new bug at all, it's been there a while ...
regards, tom lane
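The failure mode described in the last message can be illustrated with a toy model in C. This is only a sketch under stated assumptions: every name prefixed with toy_ is hypothetical, the counter stands in for GUCNestLevel, and the ordering (lock insertion before GUC setup) follows the description above rather than being copied from xact.c. The abort path checks the same condition that appears in the assertion message, with the nesting level fixed at 1 as a stand-in for the hardwired argument mentioned earlier in the thread.

```c
#include <stdbool.h>

/* Toy stand-in for GUCNestLevel; 0 means GUC has not been set up for
 * a transaction yet. */
static int toy_guc_nest_level = 0;

/* Toy stand-in for the GUC setup done during transaction start. */
static void toy_at_start_guc(void)
{
    toy_guc_nest_level = 1;
}

/* Toy transaction start: the lock insertion step comes first, so a
 * failure there returns before GUC setup ever runs. */
static bool toy_start_transaction(bool lock_insert_fails)
{
    if (lock_insert_fails)
        return false;           /* e.g. "out of shared memory" */
    toy_at_start_guc();
    return true;
}

/* The condition from the assertion message, evaluated instead of
 * asserted so that the outcome can be inspected. */
static bool toy_abort_check(int nestLevel)
{
    return nestLevel > 0 && nestLevel <= toy_guc_nest_level;
}
```

In this model, a failed toy_start_transaction(true) followed by toy_abort_check(1) yields false, which corresponds to the tripped Assert; after a successful start the same check passes.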