Segmentation Fault

From	Benson Jin
Subject	Segmentation Fault
Date	June 11, 2012, 01:00:41
Msg-id	843420470.254405.1339385687611.JavaMail.root@troo.com
Replies	Re: Segmentation Fault Re: Segmentation Fault
List	pgsql-general

Hi All,

We are having a problem with our streaming replication read-only node. It has crashed a few times for a couple of different reasons, mostly "segmentation fault". The latest log entries are listed below:

2012-05-30 23:56:37.385 UTC::: LOG: server process (PID 19476) was terminated by signal 11: Segmentation fault

2012-05-30 23:56:37.385 UTC::: LOG: terminating any other active server processes

2012-05-30 23:56:37.385 UTC:10.43.6.61:webmaster:panorama WARNING: terminating connection because of crash of another server process

2012-05-30 23:56:37.385 UTC:10.43.6.61:webmaster:panorama DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2012-05-30 23:56:37.385 UTC:10.43.6.61:webmaster:panorama HINT: In a moment you should be able to reconnect to the database and repeat your command.

2012-05-30 23:56:37.385 UTC:10.43.6.81:webmaster:panorama WARNING: terminating connection because of crash of another server process

2012-05-30 23:56:37.385 UTC:10.43.6.81:webmaster:panorama DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

2012-05-30 23:56:37.385 UTC:10.43.6.81:webmaster:panorama HINT: In a moment you should be able to reconnect to the database and repeat your command.

2012-05-30 23:56:37.385 UTC:10.43.6.81:webmaster:panorama WARNING: terminating connection because of crash of another server process

2012-05-30 23:56:37.385 UTC:10.43.6.81:webmaster:panorama HINT: In a moment you should be able to reconnect to the database and repeat your command.

2012-05-30 23:56:37.575 UTC:10.43.6.81:webmaster:panorama FATAL: the database system is in recovery mode
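
For anyone triaging logs like the ones above, here is a minimal Python sketch that pulls the backend crash events (timestamp, PID, signal number) out of such a file. The prefix layout `<timestamp> <tz>:<client>:<user>:<database> <SEVERITY>:` is an assumption inferred only from the sample lines, not from the actual `log_line_prefix` setting, and the names `LINE_RE`, `CRASH_RE`, `Crash` and `find_crashes` are hypothetical helpers introduced for illustration.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumed line layout, inferred from the sample log lines only:
#   <timestamp> <tz>:<client>:<user>:<database> <SEVERITY>: <message>
# The client, user and database fields are empty for postmaster messages.
# Note: an IPv6 client address (which contains ':') would not match.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (?P<tz>\w+):"
    r"(?P<client>[^:]*):(?P<user>[^:]*):(?P<db>[^: ]*) ?"
    r"(?P<severity>[A-Z]+): (?P<message>.*)$"
)

# Message the postmaster logs when a backend dies from a signal.
CRASH_RE = re.compile(
    r"server process \(PID (?P<pid>\d+)\) was terminated by signal (?P<signal>\d+)"
)


class Crash(NamedTuple):
    timestamp: str  # timestamp plus time zone, kept as text
    pid: int        # PID of the backend that died
    signal: int     # signal number, e.g. 11 for SIGSEGV


def find_crashes(lines: Iterable[str]) -> List[Crash]:
    """Return one Crash per 'terminated by signal' LOG line."""
    crashes = []
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if parsed is None or parsed["severity"] != "LOG":
            continue  # not a recognizable LOG line
        hit = CRASH_RE.search(parsed["message"])
        if hit:
            crashes.append(
                Crash(
                    f'{parsed["ts"]} {parsed["tz"]}',
                    int(hit["pid"]),
                    int(hit["signal"]),
                )
            )
    return crashes


sample = [
    "2012-05-30 23:56:37.385 UTC::: LOG: server process (PID 19476) "
    "was terminated by signal 11: Segmentation fault",
    "2012-05-30 23:56:37.575 UTC:10.43.6.81:webmaster:panorama FATAL: "
    "the database system is in recovery mode",
]
print(find_crashes(sample))
```

Running this over the logs of each crashed instance gives a list of crash times that can be lined up against the nightly purge and vacuum schedule.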

Our setup:

2x physical servers - Dell PE R815, 64GB ECC RAM, 2 CPUs (12 cores each), storing pgsql data on SAN-backed volumes.

CentOS 5.6

PostgreSQL 9.0.8, compiled *without* int64 datetime.

Both servers are identically configured (or at least as much as we could ensure)

One is the master; the other is a streaming read-only node.

The master runs two instances of PostgreSQL, whereas the slave runs 5 instances. 2 of the 5 are streaming replicas of the master; the other 3 are streaming replicas of other DB nodes. Those 2 instances serve clients as read-only. The master node has never crashed so far. However, the 2 instances on the slave have crashed 3 times by now: once on one read-only instance and twice on the other. The log above was generated by one of those instances.

All three crashes happened while the database was vacuuming: we automatically purge some data every night and run VACUUM ANALYZE right after that. Our CPU load is generally around the 40%-60% mark.

I have run a complete set of hardware diagnostics on the slave, and no faulty hardware was detected. Can someone kindly shed some light on this? I am not sure where to look at this point.

Cheers,

Bo Jin

Operating/IT Manager
Troo Corporation [www.troo.com]
43 Auriga Drive, Suite 102, Ottawa, ON K2E 7Y8
Ph: +1 877.702.8766 x156
Fax: +1 855.726.8766
