Обсуждение: Brittleness in regression test setup

Поиск
Список
Период
Сортировка

Brittleness in regression test setup

От
Peter Eisentraut
Дата:
I have experienced some brittleness in the regression test setup that 
causes the tests to be run against a different server instance or fail 
in confusing ways when you have multiple instances running.

For some historic reasons, I have my local scripts set up so that they 
build development instances using the hardcoded port 65432.  This will 
cause a temp install regression test to attempt to use port 565432 which 
will be rejected silently by pg_regress, which will then use its 
hardcoded default 65432 (no relation to my 65432).  If I have some other 
instance already running on 65432, then this will fail in non-reassuring 
ways such as

============== removing existing temp installation    ==============
============== creating temporary installation        ==============
============== initializing database system           ==============
============== starting postmaster                    ==============
running on port 65432 with pid 94178
============== creating database "regression"         ==============
ERROR:  database "regression" already exists

It evidently failed to realize that there is another postmaster already 
running at that port and just ran its test setup routines against that 
other instance.

If there is no database named "regression" on that other instance, then 
it will happily go ahead and run its full test suite against that other 
instance.

I see two problems here:

1) It fails to realize that it could not start its own temp instance 
when another instance is already running.

2) It ignores the port specification almost silently.

Since ports above 49152 are for private use, I think it is valid to 
configure test instances in that port range, but our regression test 
setup does not handle that port range very well.

So even if I configured my local scripts to use ports that are all 
different from each other and from 65432, this problem would still exist.

So, also,

2a) It has an undocumented hardcoded default port.

Any thoughts on how to fix this?


Re: Brittleness in regression test setup

От
Andrew Dunstan
Дата:

Peter Eisentraut wrote:
>
> So even if I configured my local scripts to use ports that are all 
> different from each other and from 65432, this problem would still exist.

Only if you choose the private port to be above 9999. The standard 
buildfarm config file contains a warning against using such ports for 
exactly this reason. Is using a private 4-digit port terribly difficult 
for you?

>
> So, also,
>
> 2a) It has an undocumented hardcoded default port.
>
> Any thoughts on how to fix this?
>

a) document it

b) make it a lot noisier if it falls back on 65432.

cheers

andrew




Re: Brittleness in regression test setup

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> For some historic reasons, I have my local scripts set up so that they 
> build development instances using the hardcoded port 65432.  This will 
> cause a temp install regression test to attempt to use port 565432 which 
> will be rejected silently by pg_regress, which will then use its 
> hardcoded default 65432 (no relation to my 65432).

One thing we should do is have pg_regress.c, not the Makefile,
select the default port to use.  The concatenate-5 behavior is
just not intelligent enough.
        regards, tom lane


Re: Brittleness in regression test setup

От
Peter Eisentraut
Дата:
Peter Eisentraut wrote:
> ============== removing existing temp installation    ==============
> ============== creating temporary installation        ==============
> ============== initializing database system           ==============
> ============== starting postmaster                    ==============
> running on port 65432 with pid 94178
> ============== creating database "regression"         ==============
> ERROR:  database "regression" already exists
>
> It evidently failed to realize that there is another postmaster already
> running at that port and just ran its test setup routines against that
> other instance.

On this matter, I noticed that pg_regress doesn't do anything to clean
up its child processes.  I see zombies lying around on Linux and Mac OS
when the postmaster dies.  (And the zombie is exactly the pid 94178 it
announced in the example above.)

I played around a little with signal handling to collect the dying
postmaster and report and error; see attached rough patch.  I'm not
exactly understanding how this works though.  I would expect lots of
psql zombies for example, because those go through the same
spawn_process() call, but I'm not seeing any.  Also, the sleep() call in
my patch is necessary to get some effect.  Anyone else go a clue on what
to do here?

Index: src/test/regress/GNUmakefile
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/GNUmakefile,v
retrieving revision 1.75
diff -u -3 -p -r1.75 GNUmakefile
--- src/test/regress/GNUmakefile    1 Oct 2008 22:38:57 -0000    1.75
+++ src/test/regress/GNUmakefile    25 Nov 2008 13:44:05 -0000
@@ -47,6 +47,8 @@ EXTRADEFS = '-DHOST_TUPLE="$(host_tuple)
     '-DSHELLPROG="$(SHELL)"' \
     '-DDLSUFFIX="$(DLSUFFIX)"'

+override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
+
 ##
 ## Prepare for tests
 ##
@@ -55,7 +57,7 @@ EXTRADEFS = '-DHOST_TUPLE="$(host_tuple)

 all: submake-libpgport pg_regress$(X)

-pg_regress$(X): pg_regress.o pg_regress_main.o
+pg_regress$(X): pg_regress.o pg_regress_main.o pqsignal.o
     $(CC) $(CFLAGS) $^ $(LDFLAGS) $(LIBS) -o $@

 # dependencies ensure that path changes propagate
@@ -65,6 +67,10 @@ pg_regress.o: pg_regress.c $(top_builddi
 $(top_builddir)/src/port/pg_config_paths.h: $(top_builddir)/src/Makefile.global
     $(MAKE) -C $(top_builddir)/src/port pg_config_paths.h

+pqsignal.c: % : $(top_srcdir)/src/interfaces/libpq/%
+    rm -f $@ && $(LN_S) $< .
+
+
 install: all installdirs
     $(INSTALL_PROGRAM) pg_regress$(X) '$(DESTDIR)$(pgxsdir)/$(subdir)/pg_regress$(X)'

@@ -172,7 +178,7 @@ bigcheck: all

 clean distclean maintainer-clean: clean-lib
 # things built by `all' target
-    rm -f $(OBJS) refint$(DLSUFFIX) autoinc$(DLSUFFIX) pg_regress_main.o pg_regress.o pg_regress$(X)
+    rm -f $(OBJS) refint$(DLSUFFIX) autoinc$(DLSUFFIX) pg_regress_main.o pg_regress.o pg_regress$(X) pqsignal.c
 # things created by various check targets
     rm -f $(output_files) $(input_files)
     rm -rf testtablespace
Index: src/test/regress/pg_regress.c
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/pg_regress.c,v
retrieving revision 1.51
diff -u -3 -p -r1.51 pg_regress.c
--- src/test/regress/pg_regress.c    25 Nov 2008 11:49:35 -0000    1.51
+++ src/test/regress/pg_regress.c    25 Nov 2008 13:44:05 -0000
@@ -31,6 +31,7 @@

 #include "getopt_long.h"
 #include "pg_config_paths.h"
+#include "libpq/pqsignal.h"

 /* for resultmap we need a list of pairs of strings */
 typedef struct _resultmap
@@ -277,6 +278,8 @@ stop_postmaster(void)
         fflush(stdout);
         fflush(stderr);

+        postmaster_pid = INVALID_PID;
+
         snprintf(buf, sizeof(buf),
                  SYSTEMQUOTE "\"%s/pg_ctl\" stop -D \"%s/data\" -s -m fast" SYSTEMQUOTE,
                  bindir, temp_install);
@@ -1803,6 +1806,29 @@ make_absolute_path(const char *in)
 }

 static void
+sigchld_handler(int signum)
+{
+    pid_t pid;
+    int status;
+    int save_errno;
+
+    save_errno = errno;
+    while (1)
+    {
+        pid = waitpid(postmaster_pid, &status, WNOHANG);
+        if (pid <= 0)
+            break;
+
+        if (pid == postmaster_pid)
+        {
+            fprintf(stderr, "postmaster died\n");
+            exit(2);
+        }
+    }
+    errno = save_errno;
+}
+
+static void
 help(void)
 {
     printf(_("PostgreSQL regression test driver\n"));
@@ -2100,6 +2126,7 @@ regression_main(int argc, char *argv[],
                  debug ? " -d 5" : "",
                  hostname ? hostname : "",
                  outputdir);
+        pqsignal(SIGCHLD, sigchld_handler);
         postmaster_pid = spawn_process(buf);
         if (postmaster_pid == INVALID_PID)
         {
@@ -2116,6 +2143,7 @@ regression_main(int argc, char *argv[],
         snprintf(buf, sizeof(buf),
                  SYSTEMQUOTE "\"%s/psql\" -X postgres <%s 2>%s" SYSTEMQUOTE,
                  bindir, DEVNULL, DEVNULL);
+        sleep(5);
         for (i = 0; i < 60; i++)
         {
             /* Done if psql succeeds */

Re: Brittleness in regression test setup

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> I played around a little with signal handling to collect the dying 
> postmaster and report and error; see attached rough patch.  I'm not 
> exactly understanding how this works though.  I would expect lots of 
> psql zombies for example, because those go through the same 
> spawn_process() call, but I'm not seeing any.

That's because wait_for_tests wait()s for them.

AFAICS the only way you'd end up with a zombie postmaster is if pg_ctl
stop fails, but I'm failing to understand why that's likely to happen.
        regards, tom lane


Re: Brittleness in regression test setup

От
Peter Eisentraut
Дата:
Tom Lane wrote:
> One thing we should do is have pg_regress.c, not the Makefile,
> select the default port to use.  The concatenate-5 behavior is
> just not intelligent enough.

How about something like this, constructing a port number from the
version and a timestamp?  We could also take 2 more bits from the
version and give it to the timestamp, which would make this a bit safer,
I think.


Index: src/test/regress/GNUmakefile
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/GNUmakefile,v
retrieving revision 1.75
diff -u -3 -p -r1.75 GNUmakefile
--- src/test/regress/GNUmakefile    1 Oct 2008 22:38:57 -0000    1.75
+++ src/test/regress/GNUmakefile    25 Nov 2008 15:14:19 -0000
@@ -14,9 +14,6 @@ subdir = src/test/regress
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global

-# port number for temp-installation test postmaster
-TEMP_PORT = 5$(DEF_PGPORT)
-
 # file with extra config for temp build
 TEMP_CONF =
 ifdef TEMP_CONFIG
@@ -144,7 +141,7 @@ tablespace-setup:
 pg_regress_call = ./pg_regress --inputdir=$(srcdir) --dlpath=. --multibyte=$(MULTIBYTE) --load-language=plpgsql
$(NOLOCALE)

 check: all
-    $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir) --temp-port=$(TEMP_PORT)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 
+    $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 

 installcheck: all
     $(pg_regress_call) --psqldir=$(PSQLDIR) --schedule=$(srcdir)/serial_schedule
@@ -163,7 +160,7 @@ bigtest: all
     $(pg_regress_call) --psqldir=$(PSQLDIR) --schedule=$(srcdir)/serial_schedule numeric_big

 bigcheck: all
-    $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir) --temp-port=$(TEMP_PORT)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) numeric_big 
+    $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) numeric_big 


 ##
Index: src/test/regress/pg_regress.c
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/pg_regress.c,v
retrieving revision 1.50
diff -u -3 -p -r1.50 pg_regress.c
--- src/test/regress/pg_regress.c    20 Nov 2008 15:03:39 -0000    1.50
+++ src/test/regress/pg_regress.c    25 Nov 2008 15:14:20 -0000
@@ -83,10 +83,9 @@ static _stringlist *extra_tests = NULL;
 static char *temp_install = NULL;
 static char *temp_config = NULL;
 static char *top_builddir = NULL;
-static int    temp_port = 65432;
 static bool nolocale = false;
 static char *hostname = NULL;
-static int    port = -1;
+static int    port = 0;
 static char *dlpath = PKGLIBDIR;
 static char *user = NULL;
 static _stringlist *extraroles = NULL;
@@ -733,7 +732,7 @@ initialize_environment(void)
         else
             unsetenv("PGHOST");
         unsetenv("PGHOSTADDR");
-        if (port != -1)
+        if (port)
         {
             char        s[16];

@@ -789,7 +788,7 @@ initialize_environment(void)
             doputenv("PGHOST", hostname);
             unsetenv("PGHOSTADDR");
         }
-        if (port != -1)
+        if (port)
         {
             char        s[16];

@@ -1821,7 +1820,6 @@ help(void)
     printf(_("Options for \"temp-install\" mode:\n"));
     printf(_("  --no-locale               use C locale\n"));
     printf(_("  --top-builddir=DIR        (relative) path to top level build directory\n"));
-    printf(_("  --temp-port=PORT          port number to start temp postmaster on\n"));
     printf(_("  --temp-config=PATH        append contents of PATH to temporary config\n"));
     printf(_("\n"));
     printf(_("Options for using an existing installation:\n"));
@@ -1859,7 +1857,6 @@ regression_main(int argc, char *argv[],
         {"temp-install", required_argument, NULL, 9},
         {"no-locale", no_argument, NULL, 10},
         {"top-builddir", required_argument, NULL, 11},
-        {"temp-port", required_argument, NULL, 12},
         {"host", required_argument, NULL, 13},
         {"port", required_argument, NULL, 14},
         {"user", required_argument, NULL, 15},
@@ -1933,15 +1930,6 @@ regression_main(int argc, char *argv[],
             case 11:
                 top_builddir = strdup(optarg);
                 break;
-            case 12:
-                {
-                    int            p = atoi(optarg);
-
-                    /* Since Makefile isn't very bright, check port range */
-                    if (p >= 1024 && p <= 65535)
-                        temp_port = p;
-                }
-                break;
             case 13:
                 hostname = strdup(optarg);
                 break;
@@ -1982,8 +1970,14 @@ regression_main(int argc, char *argv[],
         optind++;
     }

-    if (temp_install)
-        port = temp_port;
+    if (temp_install && !port)
+        /*
+         * To reduce chances of interference with parallel
+         * installations, use a port number starting in the private
+         * range (49152-65535) calculated from version number and a
+         * time stamp.
+         */
+        port = 0xC000 | (((PG_VERSION_NUM / 100) << 4) & 0x3FF0) | ((int) time(NULL) & 0xF);

     inputdir = make_absolute_path(inputdir);
     outputdir = make_absolute_path(outputdir);
@@ -2157,7 +2151,7 @@ regression_main(int argc, char *argv[],
         postmaster_running = true;

         printf(_("running on port %d with pid %lu\n"),
-               temp_port, (unsigned long) postmaster_pid);
+               port, (unsigned long) postmaster_pid);
     }
     else
     {

Re: Brittleness in regression test setup

От
Alvaro Herrera
Дата:
Peter Eisentraut wrote:
> Tom Lane wrote:
>> One thing we should do is have pg_regress.c, not the Makefile,
>> select the default port to use.  The concatenate-5 behavior is
>> just not intelligent enough.
>
> How about something like this, constructing a port number from the  
> version and a timestamp?  We could also take 2 more bits from the  
> version and give it to the timestamp, which would make this a bit safer,  
> I think.

Is it possible to make it retry in case the chosen port is busy?  I
guess a simple check should suffice, ignoring the obvious race condition
that someone uses the port after you checked it was OK.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Brittleness in regression test setup

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane wrote:
>> One thing we should do is have pg_regress.c, not the Makefile,
>> select the default port to use.  The concatenate-5 behavior is
>> just not intelligent enough.

> How about something like this, constructing a port number from the 
> version and a timestamp?  We could also take 2 more bits from the 
> version and give it to the timestamp, which would make this a bit safer, 
> I think.

I'd vote for keeping the --temp-port option but not having the Makefile
use it.  Seems like it'd still be potentially useful for hand use of
pg_regress.

Also, like Alvaro I'm thinking that a retry is really needed.  As this
patch stands you'd be vulnerable to random, unrepeatable failures
anytime you picked a port that happened to be in use for something else.
        regards, tom lane


Re: Brittleness in regression test setup

От
Peter Eisentraut
Дата:
Tom Lane wrote:
> AFAICS the only way you'd end up with a zombie postmaster is if pg_ctl
> stop fails, but I'm failing to understand why that's likely to happen.

No, the zombies appear if the postmaster dies (briefly) after launch.


Re: Brittleness in regression test setup

От
Peter Eisentraut
Дата:
Alvaro Herrera wrote:
> Is it possible to make it retry in case the chosen port is busy?  I
> guess a simple check should suffice, ignoring the obvious race condition
> that someone uses the port after you checked it was OK.

Well, the whole point of this exercise was to avoid that.  If we had a 
way to do a "simple check", we might as well stick to the hardcoded port 
and count up from that or something.

The problem with doing the checking is that you have to emulate the 
complete postmaster logic for port numbers, listen addresses, Unix 
domain socket directories, etc.  That can become quite involved.

Then again, a simple way to avoid the issue altogether on platforms 
supporting Unix-domain sockets would be to run the test over Unix-domain 
sockets (which we do anyway) placed in a private directory.  How about that?


Re: Brittleness in regression test setup

От
Peter Eisentraut
Дата:
Tom Lane wrote:
> I'd vote for keeping the --temp-port option but not having the Makefile
> use it.  Seems like it'd still be potentially useful for hand use of
> pg_regress.

Sorry, I didn't document this fully.  The --temp-port option appears to 
be redundant with the --port option, so I figured we could drop the 
former and just use the latter for both the temp install and existing 
install cases.


Re: Brittleness in regression test setup

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> Then again, a simple way to avoid the issue altogether on platforms 
> supporting Unix-domain sockets would be to run the test over Unix-domain 
> sockets (which we do anyway) placed in a private directory.  How about that?

Then the brittleness is still there on Windows, only we'd probably get
confused and think it was a platform-specific bug.
        regards, tom lane


Re: Brittleness in regression test setup

От
Peter Eisentraut
Дата:
Peter Eisentraut wrote:
> Alvaro Herrera wrote:
>> Is it possible to make it retry in case the chosen port is busy?  I
>> guess a simple check should suffice, ignoring the obvious race condition
>> that someone uses the port after you checked it was OK.
>
> Well, the whole point of this exercise was to avoid that.  If we had a
> way to do a "simple check", we might as well stick to the hardcoded port
> and count up from that or something.

Well, duh, the checking is actually pretty simple.  We just try to
connect with psql to the candidate port number before starting our own
postmaster and see if anyone is already there.

Patch attached.  It solves my immediate problems nicely.
Index: src/test/regress/GNUmakefile
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/GNUmakefile,v
retrieving revision 1.75
diff -u -3 -p -r1.75 GNUmakefile
--- src/test/regress/GNUmakefile    1 Oct 2008 22:38:57 -0000    1.75
+++ src/test/regress/GNUmakefile    27 Nov 2008 10:27:36 -0000
@@ -14,9 +14,6 @@ subdir = src/test/regress
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global

-# port number for temp-installation test postmaster
-TEMP_PORT = 5$(DEF_PGPORT)
-
 # file with extra config for temp build
 TEMP_CONF =
 ifdef TEMP_CONFIG
@@ -144,7 +141,7 @@ tablespace-setup:
 pg_regress_call = ./pg_regress --inputdir=$(srcdir) --dlpath=. --multibyte=$(MULTIBYTE) --load-language=plpgsql
$(NOLOCALE)

 check: all
-    $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir) --temp-port=$(TEMP_PORT)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 
+    $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) $(TEMP_CONF) 

 installcheck: all
     $(pg_regress_call) --psqldir=$(PSQLDIR) --schedule=$(srcdir)/serial_schedule
@@ -163,7 +160,7 @@ bigtest: all
     $(pg_regress_call) --psqldir=$(PSQLDIR) --schedule=$(srcdir)/serial_schedule numeric_big

 bigcheck: all
-    $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir) --temp-port=$(TEMP_PORT)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) numeric_big 
+    $(pg_regress_call) --temp-install=./tmp_check --top-builddir=$(top_builddir)
--schedule=$(srcdir)/parallel_schedule$(MAXCONNOPT) numeric_big 


 ##
Index: src/test/regress/pg_regress.c
===================================================================
RCS file: /cvsroot/pgsql/src/test/regress/pg_regress.c,v
retrieving revision 1.53
diff -u -3 -p -r1.53 pg_regress.c
--- src/test/regress/pg_regress.c    26 Nov 2008 13:26:52 -0000    1.53
+++ src/test/regress/pg_regress.c    27 Nov 2008 10:27:36 -0000
@@ -83,10 +83,10 @@ static _stringlist *extra_tests = NULL;
 static char *temp_install = NULL;
 static char *temp_config = NULL;
 static char *top_builddir = NULL;
-static int    temp_port = 65432;
 static bool nolocale = false;
 static char *hostname = NULL;
 static int    port = -1;
+static bool port_specified_by_user = false;
 static char *dlpath = PKGLIBDIR;
 static char *user = NULL;
 static _stringlist *extraroles = NULL;
@@ -1844,7 +1844,7 @@ help(void)
     printf(_("Options for \"temp-install\" mode:\n"));
     printf(_("  --no-locale               use C locale\n"));
     printf(_("  --top-builddir=DIR        (relative) path to top level build directory\n"));
-    printf(_("  --temp-port=PORT          port number to start temp postmaster on\n"));
+    printf(_("  --port=PORT               start postmaster on PORT\n"));
     printf(_("  --temp-config=PATH        append contents of PATH to temporary config\n"));
     printf(_("\n"));
     printf(_("Options for using an existing installation:\n"));
@@ -1867,6 +1867,7 @@ regression_main(int argc, char *argv[],
     int            i;
     int            option_index;
     char        buf[MAXPGPATH * 4];
+    char        buf2[MAXPGPATH * 4];

     static struct option long_options[] = {
         {"help", no_argument, NULL, 'h'},
@@ -1882,7 +1883,6 @@ regression_main(int argc, char *argv[],
         {"temp-install", required_argument, NULL, 9},
         {"no-locale", no_argument, NULL, 10},
         {"top-builddir", required_argument, NULL, 11},
-        {"temp-port", required_argument, NULL, 12},
         {"host", required_argument, NULL, 13},
         {"port", required_argument, NULL, 14},
         {"user", required_argument, NULL, 15},
@@ -1956,20 +1956,12 @@ regression_main(int argc, char *argv[],
             case 11:
                 top_builddir = strdup(optarg);
                 break;
-            case 12:
-                {
-                    int            p = atoi(optarg);
-
-                    /* Since Makefile isn't very bright, check port range */
-                    if (p >= 1024 && p <= 65535)
-                        temp_port = p;
-                }
-                break;
             case 13:
                 hostname = strdup(optarg);
                 break;
             case 14:
                 port = atoi(optarg);
+                port_specified_by_user = true;
                 break;
             case 15:
                 user = strdup(optarg);
@@ -2005,8 +1997,13 @@ regression_main(int argc, char *argv[],
         optind++;
     }

-    if (temp_install)
-        port = temp_port;
+    if (temp_install && !port_specified_by_user)
+        /*
+         * To reduce chances of interference with parallel
+         * installations, use a port number starting in the private
+         * range (49152-65535) calculated from the version number.
+         */
+        port = 0xC000 | (PG_VERSION_NUM & 0x3FFF);

     inputdir = make_absolute_path(inputdir);
     outputdir = make_absolute_path(outputdir);
@@ -2107,6 +2104,37 @@ regression_main(int argc, char *argv[],
         }

         /*
+         * Check if there is a postmaster running already.
+         */
+        snprintf(buf2, sizeof(buf2),
+                 SYSTEMQUOTE "\"%s/psql\" -X postgres <%s 2>%s" SYSTEMQUOTE,
+                 bindir, DEVNULL, DEVNULL);
+
+        for (i = 0; i < 16; i++)
+        {
+            if (system(buf2) == 0)
+            {
+                char        s[16];
+
+                if (port_specified_by_user || i == 15)
+                {
+                    fprintf(stderr, _("port %d apparently in use\n"), port);
+                    if (!port_specified_by_user)
+                        fprintf(stderr, _("%s: could not determine an available port\n"), progname);
+                    fprintf(stderr, _("Specify an used port using the --port option or shut down any conflicting
PostgreSQLservers.\n")); 
+                    exit_nicely(2);
+                }
+
+                fprintf(stderr, _("port %d apparently in use, trying %d\n"), port, port+1);
+                port++;
+                sprintf(s, "%d", port);
+                doputenv("PGPORT", s);
+            }
+            else
+                break;
+        }
+
+        /*
          * Start the temp postmaster
          */
         header(_("starting postmaster"));
@@ -2129,13 +2157,10 @@ regression_main(int argc, char *argv[],
          * second or so, but Cygwin is reportedly *much* slower).  Don't wait
          * forever, however.
          */
-        snprintf(buf, sizeof(buf),
-                 SYSTEMQUOTE "\"%s/psql\" -X postgres <%s 2>%s" SYSTEMQUOTE,
-                 bindir, DEVNULL, DEVNULL);
         for (i = 0; i < 60; i++)
         {
             /* Done if psql succeeds */
-            if (system(buf) == 0)
+            if (system(buf2) == 0)
                 break;

             /*
@@ -2180,7 +2205,7 @@ regression_main(int argc, char *argv[],
         postmaster_running = true;

         printf(_("running on port %d with pid %lu\n"),
-               temp_port, (unsigned long) postmaster_pid);
+               port, (unsigned long) postmaster_pid);
     }
     else
     {

Re: Brittleness in regression test setup

От
Alvaro Herrera
Дата:
Peter Eisentraut wrote:
> Peter Eisentraut wrote:
>> Alvaro Herrera wrote:
>>> Is it possible to make it retry in case the chosen port is busy?  I
>>> guess a simple check should suffice, ignoring the obvious race condition
>>> that someone uses the port after you checked it was OK.
>>
>> Well, the whole point of this exercise was to avoid that.  If we had a  
>> way to do a "simple check", we might as well stick to the hardcoded 
>> port and count up from that or something.
>
> Well, duh, the checking is actually pretty simple.  We just try to  
> connect with psql to the candidate port number before starting our own  
> postmaster and see if anyone is already there.

But what if something else is using the port?  I think you could attempt
a bare connect().


Note typo here:

> +                    fprintf(stderr, _("Specify an used port using the --port option or shut down any conflicting
PostgreSQLservers.\n"));
 

Should say "an unused port"

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Brittleness in regression test setup

От
Peter Eisentraut
Дата:
Alvaro Herrera wrote:
>> Well, duh, the checking is actually pretty simple.  We just try to  
>> connect with psql to the candidate port number before starting our own  
>> postmaster and see if anyone is already there.
> 
> But what if something else is using the port?  I think you could attempt
> a bare connect().

Well, that goes beyond the scope of my original problem, which is that 
the regression tests will silently run against a different installation.  If you run psql against a non-PostgreSQL
server,you will hopefully 
 
see more obvious failures.  We could add this in the future, if there 
are complaints from the field.