Call me stupid if you must, but I've tried both the tcp_keepalive_xxx parameters and PQcancel() to abort a query to a remote server to which I've lost the network connection, with no change in results. It appears to take about 15 minutes before the query gives up. I assume that is due to some other system parameter that I cannot find. I would be very happy if I could shorten the 15-minute delay to about 30 to 60 seconds, or find some other way out of the query in which I am stuck.
Specifics below. Anything you need to know, just ask...
Thanks,
Steve
Running RedHat Enterprise Linux 5.x with PostgreSQL 8.4.
My system TCP keep alive parameters are:
/proc/sys/net/ipv4/tcp_keepalive_time 15
/proc/sys/net/ipv4/tcp_keepalive_intvl 5
/proc/sys/net/ipv4/tcp_keepalive_probes 5
My postgresql.conf file has the corresponding parameters (note that tcp_keepalives_count is 3 here, not 5):
tcp_keepalives_idle = 15 # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 5 # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 3 # TCP_KEEPCNT;
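As a rough check of what these values imply, here is a minimal sketch of the usual keepalive arithmetic: idle time before the first probe, plus one interval per unanswered probe. The function names are hypothetical, and the estimate assumes the connection is idle when the link drops (keepalive probes only apply to an idle connection).

```cpp
#include <iostream>

// Approximate seconds before an idle, dead connection is declared broken:
// idle time before the first probe + one interval per unanswered probe.
// (Hypothetical helper, for illustration only.)
int keepaliveDetectSeconds( int idleSec, int intervalSec, int probeCount )
{
    return idleSec + intervalSec * probeCount;
}

// Prints the estimate for the two sets of values listed above.
void printKeepaliveEstimates()
{
    // /proc/sys/net/ipv4 values: time 15, intvl 5, probes 5
    std::cout << "kernel settings:     "
              << keepaliveDetectSeconds( 15, 5, 5 ) << " s" << std::endl;
    // postgresql.conf values: idle 15, interval 5, count 3
    std::cout << "postgresql.conf:     "
              << keepaliveDetectSeconds( 15, 5, 3 ) << " s" << std::endl;
}
```

Either way the result lands in the 30 to 60 second range, nowhere near the 15 minutes observed.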
The code I used to cancel the query was:
char errBuf[ 256 ] = "";  // PQcancel() only fills this on failure
PGcancel* pgCancel = PQgetCancel( pConn );
if( pgCancel != NULL )
{
    // PQcancel() returns 1 if the request was dispatched, 0 on failure
    int rc = PQcancel( pgCancel, errBuf, sizeof( errBuf ) );
    PQfreeCancel( pgCancel );
    cout << ": rc[" << rc << "] errBuf[" << errBuf << "]" << endl;
}
which returned the following:
rc[0] errBuf[PQcancel() -- connect() failed: No route to host]
I know I have no connection! The ping to the server failed, so I called PQcancel() to get the query to return and let me continue on about my business.