Discussion: Crash on attempt to connect to nonstarted server
I get a crash on win32 when connecting to a server that's not started.
In fe-connect.c, we have:
display_host_addr = (conn->pghostaddr == NULL) && (strcmp(conn->pghost, host_addr) != 0);
In my case, conn->pghost is NULL at this point, as is
conn->pghostaddr. Thus, it crashes in strcmp().
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Magnus Hagander wrote:
> I get a crash on win32 when connecting to a server that's not started.
> In fe-connect.c, we have:
>
> display_host_addr = (conn->pghostaddr == NULL) &&
> (strcmp(conn->pghost, host_addr) != 0);
>
> In my case, conn->pghost is NULL at this point, as is
> conn->pghostaddr. Thus, it crashes in strcmp().
I have researched this with Magnus, and was able to reproduce the
failure. It happens only on Win32 because that platform lacks Unix-domain
sockets, so "" maps to localhost, which means a TCP/IP connection. I have
applied the attached patch. The new output is:
$ psql test
psql: could not connect to server: Connection refused
Is the server running on host "???" and accepting
TCP/IP connections on port 5432?
Note the "???". This happens because the mapping of "" to localhost
happens below the level of the libpq connection variables, so
conn->pghost is never set.
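
The guard added by the patch can be sketched in isolation. This is a minimal illustration, not the libpq source: ConnSketch and should_display_host_addr() are hypothetical names standing in for PGconn and the inline test in connectFailureMessage().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two PGconn fields involved; not the real struct. */
typedef struct
{
    const char *pghost;
    const char *pghostaddr;
} ConnSketch;

/*
 * Mirrors the patched condition: strcmp() is reached only when
 * pghost is non-NULL, so the NULL/NULL case no longer crashes.
 */
static bool
should_display_host_addr(const ConnSketch *conn, const char *host_addr)
{
    return conn->pghostaddr == NULL &&
           conn->pghost != NULL &&
           strcmp(conn->pghost, host_addr) != 0;
}
```

With both fields NULL the function returns false without ever calling strcmp(), which is the case that crashed before the patch.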
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index b1523a6..8d9400b 100644
*** /tmp/pgrevert.7311/PXMjec_fe-connect.c Thu Dec 16 08:36:11 2010
--- src/interfaces/libpq/fe-connect.c Thu Dec 16 08:31:51 2010
*************** connectFailureMessage(PGconn *conn, int
*** 1031,1037 ****
strcpy(host_addr, "???");
display_host_addr = (conn->pghostaddr == NULL) &&
! (strcmp(conn->pghost, host_addr) != 0);
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not connect to server: %s\n"
--- 1031,1038 ----
strcpy(host_addr, "???");
display_host_addr = (conn->pghostaddr == NULL) &&
! (conn->pghost != NULL) &&
! (strcmp(conn->pghost, host_addr) != 0);
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("could not connect to server: %s\n"
Magnus Hagander <magnus@hagander.net> writes:
> I get a crash on win32 when connecting to a server that's not started.
> In fe-connect.c, we have:
> display_host_addr = (conn->pghostaddr == NULL) &&
> (strcmp(conn->pghost, host_addr) != 0);
> In my case, conn->pghost is NULL at this point, as is
> conn->pghostaddr. Thus, it crashes in strcmp().
[ scratches head... ] I seem to remember having decided that patch was
OK because what was there before already assumed conn->pghost would be
set. Under exactly what conditions could we get this far with neither
field being set?
regards, tom lane
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > I get a crash on win32 when connecting to a server that's not started.
> > In fe-connect.c, we have:
>
> > display_host_addr = (conn->pghostaddr == NULL) &&
> > (strcmp(conn->pghost, host_addr) != 0);
>
> > In my case, conn->pghost is NULL at this point, as is
> > conn->pghostaddr. Thus, it crashes in strcmp().
>
> [ scratches head... ] I seem to remember having decided that patch was
> OK because what was there before already assumed conn->pghost would be
> set. Under exactly what conditions could we get this far with neither
> field being set?
OK, sure, I can explain. What happens in libpq is that when no host
name is supplied, you get a default. On Unix, that is unix-domain
sockets, but on Win32, that is localhost, meaning IP.
The problem is that "" is mapped to localhost in connectDBStart(),
specifically here:
    #ifdef HAVE_UNIX_SOCKETS
            /* pghostaddr and pghost are NULL, so use Unix domain socket */
            node = NULL;
            hint.ai_family = AF_UNIX;
            UNIXSOCK_PATH(portstr, portnum, conn->pgunixsocket);
    #else
            /* Without Unix sockets, default to localhost instead */
            node = "localhost";
            hint.ai_family = AF_UNSPEC;
    #endif   /* HAVE_UNIX_SOCKETS */
The problem is that this is setting up the pg_getaddrinfo_all() call,
and is _not_ setting any of the libpq variables that we actually test in
the error message section that had the bug.
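
To make that concrete, here is a minimal sketch of the idea, not the actual connectDBStart() code. ConnSketch and choose_node() are hypothetical names, the have_unix_sockets flag is a runtime stand-in for the HAVE_UNIX_SOCKETS compile-time symbol, and the precedence of pghostaddr over pghost is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two PGconn fields; not the real struct. */
typedef struct
{
    const char *pghost;
    const char *pghostaddr;
} ConnSketch;

/*
 * Pick the node name that would be handed to the address resolver.
 * Assumption: an explicit pghostaddr wins over pghost.
 */
static const char *
choose_node(const ConnSketch *conn, bool have_unix_sockets)
{
    if (conn->pghostaddr != NULL)
        return conn->pghostaddr;
    if (conn->pghost != NULL)
        return conn->pghost;

    /* Default case: only the returned value changes; conn is left untouched. */
    return have_unix_sockets ? NULL : "localhost";
}
```

After a call with both fields NULL and no Unix sockets, the resolver sees "localhost" but conn->pghost and conn->pghostaddr are still NULL, which is the state the error-message code later runs into.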
The 9.0 code has a convoluted test in the appendPQExpBuffer statement:
    appendPQExpBuffer(&conn->errorMessage,
                      libpq_gettext("could not connect to server: %s\n"
                                    "\tIs the server running on host \"%s\" and accepting\n"
                                    "\tTCP/IP connections on port %s?\n"),
                      SOCK_STRERROR(errorno, sebuf, sizeof(sebuf)),
                      conn->pghostaddr
                      ? conn->pghostaddr
                      : (conn->pghost
                         ? conn->pghost
                         : "???"),
                      conn->pgport);
but it clearly expects either or both could be NULL. That code is
actually still in appendPQExpBuffer() in git master.
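
The fallback chain in that argument can be pulled out into a small sketch to show why "???" ends up in the message. ConnSketch and host_for_message() are hypothetical names used only for illustration; libpq writes the expression inline.

```c
#include <stddef.h>

/* Hypothetical stand-in for the two PGconn fields; not the real struct. */
typedef struct
{
    const char *pghost;
    const char *pghostaddr;
} ConnSketch;

/*
 * Same nested conditional as in the 9.0 appendPQExpBuffer() call:
 * prefer pghostaddr, then pghost, and fall back to "???" when both are NULL.
 */
static const char *
host_for_message(const ConnSketch *conn)
{
    return conn->pghostaddr
        ? conn->pghostaddr
        : (conn->pghost
           ? conn->pghost
           : "???");
}
```

Because neither field is ever dereferenced without a NULL check, this expression tolerates the NULL/NULL case that the newer strcmp() test did not.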
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +