Michael Bommarito <michael@bommaritollc.com> writes:
> Hello Michael,
> Here is the offending query and gdb session/stacktrace output. Please
> let me know if we can provide anything else from gdb or logs that can be
> anonymized.
> *Query:*
> 2015-07-11 12:57:41 UTC [12803-7] LOG: server process (PID 20696) was
> terminated by signal 11: Segmentation fault
> 2015-07-11 12:57:41 UTC [12803-8] DETAIL: Failed process was running:
> SELECT COUNT(*) FROM pg_stat_activity WHERE pid <> pg_backend_pid()
I tried tracing through the logic using the line numbers shown in this
stack trace, and soon decided that they didn't seem to match the query
you show above --- but on closer look, that's to be expected, because
this core is evidently from PID 16028:
> [New LWP 16028]
If you can identify the exact query that 16028 was running from your logs,
that would be helpful. Or, if you still have this same core file laying
about, "p debug_query_string" in gdb would probably be a more trustworthy
guide than trying to match up log entries.
>> That would be nice. I have let pgbench -C run for one hour with select
>> * from pg_stat_activity running every second (\watch 1) in parallel
>> but I could not reproduce the issue on HEAD.
Yeah, I tried similar experiments without result. Presumably there is
some other triggering condition here, but it's hard to guess what.
I tried things like doing VACUUM FULL on pg_database and pg_authid
to force replan cycles on the view, but no crash.
regards, tom lane