On Mon, Jul 27, 2009 at 4:31 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Lonni J Friedman <netllama@gmail.com> writes:
>> On Sun, Jul 26, 2009 at 1:23 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>>> pg_stat_activity should be reasonably trustworthy, modulo the fact that
>>> the display might be a fraction of a second out-of-date.
>
>> Hrmm, that's not what I'm seeing. I'm finding that connections
>> continue to appear in the table long after I've terminated a remote
>> psql connection. I'm talking minutes or even hours.
>
> What are you doing to "terminate" these remote connections? What it
> sounds like is the connected server process isn't being told about the
> termination, and so it sits there waiting for input that will never
> come. We do enable TCP keepalive if available, so unless your server
> is running a seriously obsolete OS, it will eventually figure out the
> client is gone --- but that takes order-of-hours with the standard TCP
> timeout settings.
Normally I just quit psql, but as part of today's experiment I rebooted
the client system that pg_stat_activity still listed as connected. The
server is running Linux with a reasonably recent 2.6.x kernel.
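
For reference, here is a rough sketch of where the "order-of-hours"
figure comes from. It assumes the server still uses the stock Linux
keepalive settings (tcp_keepalive_time=7200, tcp_keepalive_intvl=75,
tcp_keepalive_probes=9); I have not confirmed those are the values in
effect here, and the function names are only illustrative.

```python
# Stock Linux values for net.ipv4.tcp_keepalive_* (assumed, not verified
# on this particular server).
KEEPALIVE_TIME_S = 7200   # idle seconds before the first keepalive probe
KEEPALIVE_INTVL_S = 75    # seconds between unanswered probes
KEEPALIVE_PROBES = 9      # unanswered probes before the socket is dropped


def dead_peer_detection_seconds(idle=KEEPALIVE_TIME_S,
                                interval=KEEPALIVE_INTVL_S,
                                probes=KEEPALIVE_PROBES):
    """Approximate time for the kernel to give up on a silent peer."""
    return idle + interval * probes


def format_duration(seconds):
    """Render a number of seconds as 'Hh MMm SSs'."""
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


if __name__ == "__main__":
    print(format_duration(dead_peer_detection_seconds()))
```

With those defaults the backend would only notice a vanished client
after a bit more than two hours, which is at least in the same range as
what I'm seeing. PostgreSQL also has tcp_keepalives_idle,
tcp_keepalives_interval and tcp_keepalives_count settings, which as far
as I understand can shorten this per connection.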
>
> Between that and your unreasonably large number of TIME_WAIT
> connections, it definitely seems like you've got TCP-level connection
> reliability problems. TIME_WAIT state should go away pretty fast too
> if things are working properly at the network level. I wonder whether
> you have a router that is dropping connections it thinks are idle.
> Beyond that my TCP expertise does not extend.
The TIME_WAIT entries do go away fairly quickly, but that's not what
I'm looking at now. I'm talking about the contents of the
pg_stat_activity view.