On Thu, Aug 9, 2018 at 11:23 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The patch that taught libpq about allowing multiple target hosts
> modified connectDBComplete() with the intent of making the
> connect_timeout (if specified) apply per-host, not to the complete
> connection attempt. It did not do a very good job though, because
> the timeout only gets reset when connectDBComplete() itself detects
> a timeout. If PQconnectPoll advances to a new host due to some
> other cause, the previous host's timeout continues to run, possibly
> causing a premature timeout failure for the new one.

Oops.

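To make the failure mode concrete, here is a minimal sketch of one way the
deadline could be tied to the host rather than to the place where the timeout
is noticed. It is not the actual connectDBComplete() code: the host_deadline
struct and both function names are hypothetical, and the caller is assumed to
pass in the index of the host currently being tried.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical tracker; none of these names come from libpq. */
typedef struct
{
    int     last_host;      /* host index the deadline belongs to, -1 if none */
    time_t  finish_time;    /* absolute deadline for that host */
} host_deadline;

static void
host_deadline_init(host_deadline *hd)
{
    hd->last_host = -1;
    hd->finish_time = 0;
}

/*
 * Call before each wait.  The deadline is restarted whenever the current
 * host differs from the one the deadline was computed for, no matter why
 * the host changed (timeout, connection refused, or anything else).
 * Returns the deadline to wait until.
 */
static time_t
host_deadline_update(host_deadline *hd, int cur_host,
                     time_t now, int timeout_secs)
{
    assert(timeout_secs > 0);

    if (cur_host != hd->last_host)
    {
        hd->last_host = cur_host;
        hd->finish_time = now + timeout_secs;
    }
    return hd->finish_time;
}
```

With this shape, a host that fails early for a non-timeout reason no longer
leaves its half-used deadline behind for the next host.
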
> Another thing that I find pretty strange is that it is coded so that,
> in event of a timeout detection by connectDBComplete, we give up on the
> current connhost entry and advance to the next host, ignoring any
> additional addresses we might have for the current hostname. This seems
> at best poorly thought through. There's no good reason for libpq to
> assume that all the addresses returned by DNS point at the same machine,
> or share the same network failure points in between.

Hmm, well, that was deliberate, but maybe ill-advised. I guess I was
worried about making users wait for a long time to no gain, but your
points are valid, too.

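For illustration only, the alternative behavior (try the remaining addresses
of the current host before moving on) might look roughly like the sketch
below. The conn_cursor struct, the naddrs array, and advance_after_timeout()
are all made-up names for this example, not libpq's data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical position within the host list; not a libpq structure. */
typedef struct
{
    int     host;           /* index of the host being tried */
    int     addr;           /* index of the address within that host */
} conn_cursor;

/*
 * After a timeout, step to the next address of the same host if one is
 * left; otherwise move to the first address of the next host that has
 * any.  naddrs[i] is the number of addresses resolved for host i.
 * Returns false when nothing is left to try.
 */
static bool
advance_after_timeout(conn_cursor *cur, const int *naddrs, int nhosts)
{
    assert(cur->host >= 0 && cur->host < nhosts);

    if (cur->addr + 1 < naddrs[cur->host])
    {
        cur->addr++;
        return true;
    }

    for (int h = cur->host + 1; h < nhosts; h++)
    {
        if (naddrs[h] > 0)
        {
            cur->host = h;
            cur->addr = 0;
            return true;
        }
    }
    return false;
}
```

The cost is the one I was worried about: with N addresses per host, the worst
case wait grows to N times connect_timeout for that host.
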
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company