Thread: Segmentation fault

Segmentation fault

From
Amod Pandey
Date:
The server stopped due to a segmentation fault. It had been running successfully for a year.

PostgreSQL: 9.0.3

from /var/log/messages

Jul 18 19:00:03 ip-10-136-22-193 kernel: [18643442.660032] postgres[6818]: segfault at 170a8c6f ip 000000000044c94d sp 00007fff9fee5b80 error 4 in postgres[400000+495000]

from pg log

LOG:  server process (PID 6818) was terminated by signal 11: Segmentation fault
LOG:  terminating any other active server processes

Please suggest whether there is a way to find the cause of the issue, and how to avoid it in the future.

Regards
Amod

Re: Segmentation fault

From
Craig Ringer
Date:
On 07/19/2012 12:37 AM, Amod Pandey wrote:
> Server stopped due to Segmentation Fault. Server was running successfully for a year.
>
> PostgreSQL: 9.0.3
>
> from /var/log/messages
>
> Jul 18 19:00:03 ip-10-136-22-193 kernel: [18643442.660032] postgres[6818]: segfault at 170a8c6f ip 000000000044c94d sp 00007fff9fee5b80 error 4 in postgres[400000+495000]
>
> from pg log
>
> LOG:  server process (PID 6818) was terminated by signal 11: Segmentation fault
> LOG:  terminating any other active server processes
>
> Please suggest if there is a way to find out the issue.

Did the crash produce a core file?

You haven't mentioned what Linux distro or kernel version you're on, and defaults vary. Look in your PostgreSQL datadir and see if there are any files with "core" in the name.

Unfortunately, most Linux distros default to not producing core files. Without a core file, diagnosing the crash will be nearly impossible, because the segfault message reported by the kernel contains little more than the instruction pointer and the stack pointer. The stack pointer is useless without a core file, and with address space layout randomisation (ASLR) active, the instruction pointer offsets are randomised for each execution, so the ip doesn't tell you much on ASLR systems either.
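To make the fields of that kernel line easier to read, here is a minimal Python sketch that splits it apart. It assumes the x86 Linux message format exactly as quoted in this thread; the function name `parse_segfault` and the dictionary keys are illustrative, not part of any PostgreSQL or kernel tooling. The three flags are decoded from the low bits of the x86 page-fault error code, and the offset is simply the ip minus the mapping base that the kernel prints in brackets.

```python
import re

# Matches the x86 Linux "segfault at ..." kernel message as quoted in this thread.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>\d+)"
    r"(?: in (?P<image>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_segfault(line):
    """Return a dict of fields from a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("error"))
    info = {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_addr": int(m.group("addr"), 16),
        "ip": int(m.group("ip"), 16),
        "sp": int(m.group("sp"), 16),
        "error": err,
        # Low bits of the x86 page-fault error code.
        "page_present": bool(err & 1),  # 0: page not mapped, 1: protection fault
        "write": bool(err & 2),         # 0: read access, 1: write access
        "user_mode": bool(err & 4),     # 1: fault happened in user mode
    }
    if m.group("base") is not None:
        base = int(m.group("base"), 16)
        info["image"] = m.group("image")
        info["base"] = base
        info["size"] = int(m.group("size"), 16)
        # Offset of the faulting instruction inside the mapped image.
        info["ip_offset"] = info["ip"] - base
    return info


if __name__ == "__main__":
    line = (
        "Jul 18 19:00:03 ip-10-136-22-193 kernel: [18643442.660032] "
        "postgres[6818]: segfault at 170a8c6f ip 000000000044c94d "
        "sp 00007fff9fee5b80 error 4 in postgres[400000+495000]"
    )
    info = parse_segfault(line)
    print(hex(info["ip_offset"]), info["user_mode"], info["write"])
```

For the line in this thread, the decoded values are a user-mode read of an unmapped address, at offset 0x4c94d into the postgres image. Whether that offset can be turned into a function name depends on having the exact same binary with symbols available, which is an assumption here; a core file remains the far better source of information.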

If you can show more of the PostgreSQL logs from around the incident, that might help.

--
Craig Ringer

Re: Segmentation fault

From
Craig Ringer
Date:
On 07/19/2012 01:52 PM, Amod Pandey wrote:
> Thank you, Craig, for explaining in such detail. I am adding more information and will see what more I can add.
>
> $ ulimit -a
> core file size          (blocks, -c) 0
>
> So I assume there is no core dump file.

Quite likely. Limits are inherited down process trees, and the limit in your shell is not necessarily the one PostgreSQL was started with, so there's no guarantee that PostgreSQL's ulimit also prevented core file generation. However, I haven't seen any distro explicitly configure a non-zero ulimit for PostgreSQL or other system services, so it's pretty darn likely to be zero.
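Rather than inferring the server's limit from a login shell, the limit of the running process can be read from `/proc/<pid>/limits` on Linux. The following is a minimal sketch, assuming that file is readable by the current user (typically the postgres user or root); the helper names are illustrative. The postmaster's PID is the first line of `postmaster.pid` in the data directory.

```python
from pathlib import Path

CORE_LABEL = "Max core file size"


def core_limit_from_text(limits_text):
    """Return (soft, hard) core size limits from /proc/<pid>/limits content."""
    for line in limits_text.splitlines():
        if line.startswith(CORE_LABEL):
            # Remaining columns are: soft limit, hard limit, units.
            fields = line[len(CORE_LABEL):].split()
            return fields[0], fields[1]
    return None


def core_limit_of_pid(pid):
    """Read the core size limits of a running process (Linux only)."""
    return core_limit_from_text(Path(f"/proc/{pid}/limits").read_text())
```

A soft limit of `0` for the postmaster would mean no core file could have been written for the crashed backend, regardless of what `ulimit -a` shows in an interactive shell.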

Just check for a core file in the PostgreSQL data dir. If there is one, the Pg ulimit obviously wasn't zero. If there isn't, then given that Pg's working directory is always the datadir, chances are the ulimit prevented a core dump.
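The check described above can be sketched in Python as follows. The data directory path is an assumption that has to be supplied by the reader, and the function name is illustrative. Only the top level of the directory is searched, since that is the server's working directory; the actual core file name depends on the kernel's `core_pattern` setting, so matching on "core" in the name is a heuristic.

```python
from pathlib import Path


def find_core_files(datadir):
    """List regular files directly inside datadir whose name contains 'core'."""
    root = Path(datadir)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and "core" in p.name
    )


if __name__ == "__main__":
    import sys

    # Usage: python find_core.py /path/to/your/datadir
    for path in find_core_files(sys.argv[1]):
        print(path, path.stat().st_size, "bytes")
```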


> If I set 'ulimit -c unlimited', will it generate a core dump if there is another occurrence? Do I need to restart postgres for this to take effect?

You would need to put this command in the PostgreSQL startup scripts and *then* restart PostgreSQL. Running it in your own shell does not change the limit of a server that is already running.
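The reason the limit has to be set by whatever launches the server is that a child process takes a copy of its parent's limits at start-up. A small Python demonstration of that inheritance, not a replacement for the real startup script: the function name is illustrative, and the child here is just a second Python interpreter that reports the limit it sees.

```python
import resource
import subprocess
import sys


def child_core_soft_limit(soft):
    """Start a child with the given soft core limit and return what the child sees."""

    def set_limit():
        # Runs in the child just before exec; the parent's limits are untouched.
        _, hard = resource.getrlimit(resource.RLIMIT_CORE)
        resource.setrlimit(resource.RLIMIT_CORE, (soft, hard))

    result = subprocess.run(
        [
            sys.executable,
            "-c",
            "import resource; print(resource.getrlimit(resource.RLIMIT_CORE)[0])",
        ],
        preexec_fn=set_limit,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print("parent soft limit:", resource.getrlimit(resource.RLIMIT_CORE)[0])
    print("child soft limit: ", child_core_soft_limit(0))
```

The child reports the limit chosen at launch, while the parent keeps its own. In the same way, changing the limit in a shell after PostgreSQL has started has no effect on the running server. Note that raising the soft limit only works up to the hard limit inherited by the launcher.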

It can be easier to configure it globally for the server. How to do this depends a bit on your distro and version; Google will help - "enable core dumps <distro>" or "change ulimit <distro>" for example.


> Linux distros
> -------------------
> Linux ip-xx-xx-xx-xx 2.6.35.11-83.9.amzn1.x86_64 #1 SMP Sat Feb 19 23:42:04 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Um, that's not a distro, that's a kernel. Judging by the kernel, I'm assuming it's an Amazon cloud-hosted machine, and since Ubuntu (and IIRC Debian) put their names in the uname version string, it's probably RHEL/CentOS/Fedora.

--
Craig Ringer

Re: Segmentation fault

From
Amod Pandey
Date:
Thank you, Craig, for explaining in such detail. I am adding more information and will see what more I can add.

$ ulimit -a
core file size          (blocks, -c) 0

So I assume there to be no core dump file.

If I set 'ulimit -c unlimited', will it generate a core dump if there is another occurrence? Do I need to restart postgres for this to take effect?

Linux distros
-------------------
Linux ip-xx-xx-xx-xx 2.6.35.11-83.9.amzn1.x86_64 #1 SMP Sat Feb 19 23:42:04 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

I will see if there are queries which I can share.

Regards
Amod

On Thu, Jul 19, 2012 at 9:20 AM, Craig Ringer <ringerc@ringerc.id.au> wrote:
> On 07/19/2012 12:37 AM, Amod Pandey wrote:
> > Server stopped due to Segmentation Fault. Server was running successfully for a year.
> >
> > PostgreSQL: 9.0.3
> >
> > from /var/log/messages
> >
> > Jul 18 19:00:03 ip-10-136-22-193 kernel: [18643442.660032] postgres[6818]: segfault at 170a8c6f ip 000000000044c94d sp 00007fff9fee5b80 error 4 in postgres[400000+495000]
> >
> > from pg log
> >
> > LOG:  server process (PID 6818) was terminated by signal 11: Segmentation fault
> > LOG:  terminating any other active server processes
> >
> > Please suggest if there is a way to find out the issue.
>
> Did the crash produce a core file?
>
> You haven't mentioned what Linux distro or kernel version you're on, and defaults vary. Look in your PostgreSQL datadir and see if there are any files with "core" in the name.
>
> Unfortunately most Linux distros default to not producing core files. Without a core file it'll be nearly impossible because the segfault message reported by the kernel only contains the instruction pointer and stack pointer. The stack pointer is invalid and useless without a core file, and with address space layout randomisation active the instruction pointer offsets are all randomised for each execution, so the ip doesn't tell you much on ASLR systems either.
>
> If you can show more of the PostgreSQL logs from around the incident that would possibly be helpful.
>
> --
> Craig Ringer