Discussion: Getting FATAL: terminating connection due to administrator command
Hi

Since some days we are getting the above message. The system is a current CentOS 5.5, x86_64, PostgreSQL 8.4 as it comes with the packages postgresql84, postgresql84-libs etc. PostGIS is enabled, as it comes from http://www.argeo.org/linux/argeo-el.

The error message appears from time to time. The exact same request, coming from a PHP application, sometimes works, sometimes fails. This happens at different points in our applications, typically, but not only, when large portions of data are queried, as in geometric queries using PostGIS. The server is only slightly loaded.

Also in the PostgreSQL logs we get:

FATAL: terminating connection due to administrator command

repeated multiple times.

The server is from Dell. Dell's hardware monitoring, OpenManage, says that the hardware, in particular memory and disk, are ok.

We do have a nearly identical server, with the identical version of PostgreSQL/PostGIS, but that one was last updated one or two months ago. It is used intensively as our testing and development server and has never given us the same error message.

Where could I start to troubleshoot this problem?

Peter Hopfgartner
On Wed, Sep 15, 2010 at 02:55:39PM +0200, Peter Hopfgartner wrote:

> Where could I start to troubleshoot this problem?

First with staff, then with unauthorized access, then with failover software.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
Peter Hopfgartner <peter.hopfgartner@r3-gis.com> writes:

> Since some days we are getting the above message.
> Also in the PostgreSQL logs we get:
> FATAL: terminating connection due to administrator command

This is a result of something sending SIGTERM to the backend process. I have heard reports of "load management" software that SIGTERM's processes more or less at random whenever it decides the system is overloaded. If you have any such junkware installed on your server, try disabling it.

> The server is from Dell, Dell's hardware monitoring, OpenManage, says that the hardware, in particular memory and disk, are ok.

Never dealt with OpenManage before, but you should cast a wary eye upon any Dell-specific software on the machine. This behavior is definitely not normal for Unix systems, so you need to look for nonstandard software (and what's more, nonstandard software running with root privileges, else it couldn't SIGTERM postgres processes).

regards, tom lane
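One way to narrow down what is sending the SIGTERM is to see when the terminations happen and compare those times with cron jobs or monitoring runs. The following is only a minimal sketch, not something from the thread: it assumes the server log lines start with a timestamp as produced by a log_line_prefix beginning with %t, and the function name and sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: log_line_prefix starts with %t, so lines look like
# "2010-01-01 12:00:01 CEST FATAL:  terminating connection ..."
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}):(\d{2}):\d{2}")
MESSAGE = "terminating connection due to administrator command"


def terminations_per_minute(lines):
    """Count the SIGTERM-related FATAL lines per 'YYYY-MM-DD HH:MM' bucket."""
    counts = Counter()
    for line in lines:
        if MESSAGE not in line:
            continue
        match = TIMESTAMP.match(line)
        if match:  # lines without the assumed timestamp prefix are skipped
            date, hour, minute = match.groups()
            counts[f"{date} {hour}:{minute}"] += 1
    return counts


# Hypothetical sample lines, not taken from the real server log.
sample_log = [
    "2010-01-01 12:00:01 CEST FATAL:  terminating connection due to administrator command",
    "2010-01-01 12:00:02 CEST FATAL:  terminating connection due to administrator command",
    "2010-01-01 12:07:30 CEST LOG:  checkpoint starting: time",
]

for bucket, count in sorted(terminations_per_minute(sample_log).items()):
    print(bucket, count)
```

If the terminations cluster at regular times, whatever runs as root at those times is a natural first suspect. To use it on a real log, pass an open file object instead of sample_log.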
On 15/09/2010 10:07 PM, Tom Lane wrote:

>> The server is from Dell, Dell's hardware monitoring, OpenManage, says that the hardware, in particular memory and disk, are ok.
>
> Never dealt with OpenManage before, but you should cast a wary eye
> upon any Dell-specific software on the machine.

(A bit of a digression, but):

Personally I'd suggest being wary of any software supplied by the entity that will be responsible for the costs of any warranty work. They won't be at *all* sad if their software deflects blame and you don't discover a fault until your server is out of warranty.

I've seen enough HDD vendor utilities report that a disk is just peachy, thanks, when it's developing and reallocating bad sectors at a rate of one every few minutes. ("Hey, you didn't need that boot block, I've allocated you a shiny new one full of zeroes that's just as good.") The S.M.A.R.T. "health check" tends to say everything's fine, too ... but if you examine the fine print in the vendor attributes you see very high reallocated sector counts, ECC error levels, and other signs of a dying disk. I see this with so-called "enterprise" disks, not just consumer SATA drives.

HDD vendors are certainly a particularly bad case, but nonetheless - don't trust vendor diagnostic software in general. If it says the device is broken I'll believe it because I trust them to make sure it won't report expensive false positives - but if it says it's OK I'll merely consider it not proven broken yet. False negatives work in their favour.

Find 3rd party diagnostic tools where possible, and where not possible don't trust the overall health assessment provided by the vendor tools, dig into the fine print in the diagnostics and see what the details are like. For hard disks, smartctl from smartmontools is a lifesaver.

Your issue doesn't sound HDD related, but it's worth mentioning for the future.

--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/
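As an illustration of "digging into the fine print", here is a minimal sketch that reads the attribute table printed by `smartctl -A` for an ATA disk and flags non-zero raw values for a few attributes. The function names, the choice of attributes to watch, and the sample table are assumptions made for this example. Attribute names and the meaning of raw values vary between vendors, so treat the result as a hint for closer inspection, not a verdict.

```python
# Attributes picked for illustration only; which ones matter is an assumption.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from `smartctl -A` style output."""
    attributes = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have ten columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attributes[fields[1]] = int(fields[9])
    return attributes


def suspicious(attributes):
    """Keep only watched attributes whose raw value is above zero."""
    return {name: raw for name, raw in attributes.items()
            if name in WATCHED and raw > 0}


# Hypothetical output, shaped like what `smartctl -A /dev/sda` prints.
sample_output = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   095   095   036    Pre-fail  Always       -       212
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       8760
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
"""

print(suspicious(parse_smart_attributes(sample_output)))
```

On a real machine the text would come from running smartctl as root and capturing its output, for example with subprocess.run.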