Discussion: Getting FATAL: terminating connection due to administrator command

Getting FATAL: terminating connection due to administrator command

From
Peter Hopfgartner
Date:
Hi

Since some days we are getting the above message.

The system is a current CentOS 5.5, x86_64, with PostgreSQL 8.4 as it comes with the packages postgresql84,
postgresql84-libs, etc. PostGIS is enabled, as it comes from http://www.argeo.org/linux/argeo-el.

The error message appears from time to time. The exact same request, coming from a PHP application, sometimes works and
sometimes fails. This happens at different points in our applications, typically, but not only, when large portions of
data are queried, as in geometric queries using PostGIS.

The server is only slightly loaded.

Also in the PostgreSQL logs we get:

FATAL:  terminating connection due to administrator command

repeated multiple times.

The server is from Dell. Dell's hardware monitoring, OpenManage, says that the hardware, in particular memory and disk,
is ok.

We do have a nearly identical server, with the identical version of PostgreSQL/PostGIS, which was last updated one
or two months ago and is intensively used as our testing and development server. It has never given us the same error
message.

Where could I start to troubleshoot this problem?

Peter Hopfgartner


Re: Getting FATAL: terminating connection due to administrator command

From
Karsten Hilbert
Date:
On Wed, Sep 15, 2010 at 02:55:39PM +0200, Peter Hopfgartner wrote:

> Where could I start to troubleshoot this problem?

First with staff, then with unauthorized access, then with
failover software.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346

Re: Getting FATAL: terminating connection due to administrator command

From
Tom Lane
Date:
Peter Hopfgartner <peter.hopfgartner@r3-gis.com> writes:
> Since some days we are getting the above message.
> Also in the PostgreSQL logs we get:
> FATAL:  terminating connection due to administrator command

This is a result of something sending SIGTERM to the backend process.

I have heard reports of "load management" software that SIGTERM's
processes more or less at random whenever it decides the system is
overloaded.  If you have any such junkware installed on your server,
try disabling it.

> The server is from Dell, Dell's hardware monitoring, OpenManage, says that the hardware, in particular memory and
> disk, are ok.

Never dealt with OpenManage before, but you should cast a wary eye
upon any Dell-specific software on the machine.  This behavior is
definitely not normal for Unix systems, so you need to look for
nonstandard software (and what's more, nonstandard software running with
root privileges, else it couldn't SIGTERM postgres processes).

            regards, tom lane
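
Since the message results from something sending SIGTERM to the backends, one way to start narrowing down the sender is to check whether the FATAL lines cluster at regular times, which would point at a scheduled job or a monitoring agent. The following is only a minimal sketch, assuming the server log is written with `log_line_prefix = '%t '` (so each line begins with a timestamp such as `2010-09-15 14:55:39 CEST`); the function name is hypothetical and the pattern needs adjusting for any other prefix.

```python
import re
from collections import Counter

# Assumption: log_line_prefix = '%t ', so every line starts with
# "YYYY-MM-DD HH:MM:SS TZ ". Adjust the pattern for other prefixes.
LINE_RE = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\b.*"
    r"FATAL:\s+terminating connection due to administrator command"
)


def count_admin_terminations(lines):
    """Count the FATAL 'administrator command' lines per minute.

    `lines` is any iterable of log lines (for example an open file).
    Returns a Counter keyed by 'YYYY-MM-DD HH:MM'.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.group("minute")] += 1
    return counts
```

Passing an open log file to `count_admin_terminations` and printing the sorted items gives a per-minute histogram. Bursts that recur at the same minute every hour or day can then be compared against crontab entries and the schedules of any management software running as root.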

Re: Getting FATAL: terminating connection due to administrator command

From
Craig Ringer
Date:
On 15/09/2010 10:07 PM, Tom Lane wrote:

>> The server is from Dell, Dell's hardware monitoring, OpenManage, says that the hardware, in particular memory and
>> disk, are ok.
>
> Never dealt with OpenManage before, but you should cast a wary eye
> upon any Dell-specific software on the machine.

(A bit of a digression, but):

Personally I'd suggest being wary of any software supplied by the entity
that will be responsible for the costs of any warranty work. They won't
be at *all* sad if their software deflects blame and you don't discover
a fault until your server is out of warranty.

I've seen enough HDD vendor utilities report that a disk is just peachy,
thanks, when it's developing and reallocating bad sectors at a rate of
one every few minutes. ("Hey, you didn't need that boot block, I've
allocated you a shiny new one full of zeroes that's just as good.") The
S.M.A.R.T. "health check" tends to say everything's fine, too ... but if
you examine the fine print in the vendor attributes you see very high
reallocated sector counts, ECC error levels, and other signs of a dying
disk. I see this with so-called "enterprise" disks, not just consumer
SATA drives.

HDD vendors are certainly a particularly bad case, but nonetheless -
don't trust vendor diagnostic software in general. If it says the device
is broken I'll believe it because I trust them to make sure it won't
report expensive false positives - but if it says it's OK I'll merely
consider it not proven broken yet. False negatives work in their favour.

Find 3rd party diagnostic tools where possible, and where not possible
don't trust the overall health assessment provided by the vendor tools,
dig into the fine print in the diagnostics and see what the details are
like.

For hard disks, smartctl from smartmontools is a lifesaver. Your issue
doesn't sound HDD related, but it's worth mentioning for the future.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/
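
To illustrate the advice about reading the detailed attributes rather than the overall health verdict, here is a rough sketch that parses the attribute table printed by `smartctl -A` for an ATA disk and reports non-zero raw values for a few sector-related attributes. The function names, the choice of watched attributes, and the "anything above zero" rule are assumptions made for illustration; attribute names vary between drive models, and SAS/SCSI disks report differently.

```python
# Attributes commonly associated with failing sectors on ATA disks.
# Assumption: the drive reports them under these smartctl names.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
)


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` text output."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(parts) >= 10 and parts[0].isdigit():
            try:
                attrs[parts[1]] = int(parts[9])
            except ValueError:
                continue  # raw value is not a plain integer; skip it
    return attrs


def suspicious_attributes(attrs, watched=WATCHED):
    """Return the watched attributes whose raw value is above zero."""
    return {name: attrs[name] for name in watched if attrs.get(name, 0) > 0}
```

The text to parse could be captured by running `smartctl -A` against the device as root and saving its output. A non-empty result from `suspicious_attributes` is a reason to look at the disk more closely, not proof that it is failing.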