Tom Lane wrote:
> Seneca Cunningham <tentra@gmail.com> writes:
>> I don't have a core, but here's the CrashReporter output for both
>> of jackal's failed runs:
>
> Wow, some actual data, rather than just noodling about how to get it ...
> thanks!
>
>> ...
>> 11 postgres 0x0022b2e3 RelationIdGetRelation + 110 (relcache.c:1496)
>> 12 postgres 0x00020868 relation_open + 84 (heapam.c:697)
>> 13 postgres 0x0002aab9 index_open + 32 (indexam.c:140)
>> 14 postgres 0x0002a9d4 systable_beginscan + 289 (genam.c:184)
>> 15 postgres 0x002279e4 RelationInitIndexAccessInfo + 1645 (relcache.c:1200)
>> 16 postgres 0x0022926a RelationBuildDesc + 3527 (relcache.c:866)
>> 17 postgres 0x0022b2e3 RelationIdGetRelation + 110 (relcache.c:1496)
>> 18 postgres 0x00020868 relation_open + 84 (heapam.c:697)
>> 19 postgres 0x0002aab9 index_open + 32 (indexam.c:140)
>> 20 postgres 0x0002a9d4 systable_beginscan + 289 (genam.c:184)
>> 21 postgres 0x002279e4 RelationInitIndexAccessInfo + 1645 (relcache.c:1200)
>> 22 postgres 0x0022926a RelationBuildDesc + 3527 (relcache.c:866)
>> 23 postgres 0x0022b2e3 RelationIdGetRelation + 110 (relcache.c:1496)
>> ...
>
> What you seem to have here is infinite recursion during relcache
> initialization. That's surely not hard to believe, considering I just
> whacked that code around, and indeed changed some of the tests that are
> intended to prevent such recursion. But what I don't understand is why
> it'd be platform-specific, much less not perfectly repeatable on the
> platforms where it does manifest. Anyone have a clue?
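
The repetition in the quoted frames can be checked mechanically. This is a minimal sketch, not part of any PostgreSQL tooling: the helper names frame_functions and shortest_cycle are hypothetical, and the regular expression assumes the CrashReporter line layout shown above (frame number, image, address, function, offset, source location).

```python
import re

# Frames 11-23 exactly as quoted above, with the ">>" quote markers removed.
TRACE = """\
11 postgres 0x0022b2e3 RelationIdGetRelation + 110 (relcache.c:1496)
12 postgres 0x00020868 relation_open + 84 (heapam.c:697)
13 postgres 0x0002aab9 index_open + 32 (indexam.c:140)
14 postgres 0x0002a9d4 systable_beginscan + 289 (genam.c:184)
15 postgres 0x002279e4 RelationInitIndexAccessInfo + 1645 (relcache.c:1200)
16 postgres 0x0022926a RelationBuildDesc + 3527 (relcache.c:866)
17 postgres 0x0022b2e3 RelationIdGetRelation + 110 (relcache.c:1496)
18 postgres 0x00020868 relation_open + 84 (heapam.c:697)
19 postgres 0x0002aab9 index_open + 32 (indexam.c:140)
20 postgres 0x0002a9d4 systable_beginscan + 289 (genam.c:184)
21 postgres 0x002279e4 RelationInitIndexAccessInfo + 1645 (relcache.c:1200)
22 postgres 0x0022926a RelationBuildDesc + 3527 (relcache.c:866)
23 postgres 0x0022b2e3 RelationIdGetRelation + 110 (relcache.c:1496)
"""

# Assumed layout: "<frame> <image> <address> <function> + <offset> (<file>:<line>)"
FRAME_RE = re.compile(r"^\s*\d+\s+\S+\s+0x[0-9a-fA-F]+\s+(\w+)\s+\+")


def frame_functions(trace):
    """Return the function name of every line that looks like a stack frame."""
    names = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def shortest_cycle(names):
    """Return the shortest prefix whose repetition reproduces the whole list.

    Returns None when no prefix of at most half the list length does so,
    i.e. when the frames show no repeating pattern.
    """
    n = len(names)
    for period in range(1, n // 2 + 1):
        if all(names[i] == names[i % period] for i in range(n)):
            return names[:period]
    return None
```

Calling shortest_cycle(frame_functions(TRACE)) on the frames above returns the six-function cycle from RelationIdGetRelation through RelationBuildDesc, which is the recursion pattern Tom describes.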

FWIW, I can now trigger this issue fairly reliably on a fast Opteron
box (running Debian Sarge/AMD64) by running make regress in a loop;
about 20-25% of the runs fail.

The resulting core, however, looks completely stack-corrupted and is not
really usable :-(
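
For anyone who wants to reproduce this, a loop of that kind can be sketched as below. This is only an illustration under stated assumptions: failure_rate is a hypothetical helper, it treats any non-zero exit status as a failed run, and the command to pass in is whatever runs the regression suite in your tree.

```python
import subprocess
import sys

# Demo command that always exits with status 1, used only to show the helper.
ALWAYS_FAIL = [sys.executable, "-c", "raise SystemExit(1)"]


def failure_rate(cmd, runs):
    """Run cmd `runs` times and return the fraction of runs that failed.

    A run counts as failed when the process exits with a non-zero status
    (assumption: the regression target reports failure that way).
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,  # discard output; only the status matters
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures / runs
```

For example, failure_rate(ALWAYS_FAIL, 3) gives 1.0. Passing the regression command as a list of arguments instead, with a few dozen runs, gives a rough estimate of how often the crash shows up on a given box.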
Stefan