Thread: inconsistent regression test results...

inconsistent regression test results...

From: Vikram Kulkarni
I'm trying to build+install PostgreSQL 7.2.1 on an OpenBSD 3.1-stable
computer. The first time I built it, 12 of the 79 regression tests
failed. This scared me, so I did a gmake distclean and then reconfigured
and rebuilt everything. This time, 14/79 tests failed. It was getting
late (or rather, early, the sun was coming up), so I decided to put
things off until later.

This morning I got back to it. I redownloaded the src distribution, made
sure that its MD5 hash matched the expected value, and then rebuilt
everything using the following configure options:

./configure --prefix=/usr/local/encap/postgresql-7.2.1 \
    --sysconfdir=/etc/postgresql --enable-multibyte --with-CXX \
    --with-openssl

This time, 11/79 tests failed. This got me wondering, so I reran the
entire process (untarring, configuring, gmake'ing, and gmake check'ing)
three more times. Different results each time (14, 15, then 10). I have
saved the regression.out and regression.diffs from each of these last
four runs. You can see them here:
http://vvk.brownforces.org/postgresql-regression/

I've read the docs[1] and understand that some of the tests will
occasionally give different values, but I did not expect tests like
join, subselect, and arrays (and others) to give inconsistent results.
Is this expected?

-Vik

[1] specifically:
http://www.postgresql.org/idocs/index.php?regress-evaluation.html

--
Vikram Vinayak Kulkarni   Ultimately, all things are known because
vkulkarn@uiuc.edu         you want to believe you know.
vkulkarn@brownforces.org                            -Zensunni Koan
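
One way to check whether the same tests fail on every run, or a different set each time, is to compare the failed test names across the saved regression.out files. The following is a minimal sketch, not part of the original message. It assumes result lines look like "name ... ok" or "name ... FAILED", optionally prefixed with "test"; the function names and sample text are hypothetical.

```python
import re

# Assumed line shape: optional "test " prefix, test name, "...", then status.
RESULT_RE = re.compile(
    r"^\s*(?:test\s+)?(\S+)\s+\.\.\.\s+(ok|FAILED|failed \(ignored\))\s*$"
)


def failed_tests(regression_out_text):
    """Return the set of test names reported as FAILED in one run."""
    failed = set()
    for line in regression_out_text.splitlines():
        match = RESULT_RE.match(line)
        if match and match.group(2) == "FAILED":
            failed.add(match.group(1))
    return failed


def compare_runs(runs):
    """Given a list of failed-name sets, return (always_failed, sometimes_failed)."""
    if not runs:
        return set(), set()
    always = set.intersection(*runs)
    ever = set.union(*runs)
    return always, ever - always


# Hypothetical excerpts from two runs, for illustration only.
run_a = "test boolean ... ok\n     join     ... FAILED\n     arrays   ... FAILED\n"
run_b = "     join     ... ok\n     arrays   ... FAILED\n     subselect ... FAILED\n"

always, sometimes = compare_runs([failed_tests(run_a), failed_tests(run_b)])
print("failed in every run:", sorted(always))
print("failed in some runs:", sorted(sometimes))
```

A large "sometimes" set and a small "always" set would point toward an environmental cause rather than a bug in any particular test.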

Re: inconsistent regression test results...

From: Tom Lane
Vikram Kulkarni <vkulkarn@brownforces.org> writes:
> I'm trying to build+install PostgreSQL 7.2.1 on an OpenBSD 3.1-stable
> computer. The first time I built it, 12 of the 79 regression tests
> failed. This scared me, so I did a gmake distclean and then reconfigured
> and rebuilt everything. This time, 14/79 tests failed. ...
> This time, 11/79 tests failed. This got me wondering, so I reran the
> entire process (untarring, configuring, gmake'ing, and gmake check'ing)
> three more times. Different results each time (14, 15, then 10).

It looks to me like the primary failures are that tests abort with
either
    psql: Server process fork() failed: Resource temporarily unavailable
or
    psql: could not send SSL negotiation packet: Broken pipe

Some later tests may then fail because they expect to find tables or
data created by the un-executed earlier tests.

The fork-failed messages suggest very strongly that you are running out
of kernel resources when you get more than a dozen or so server
processes going.  Perhaps you are too low on swap space, or need to
enlarge the kernel's file table size.  You could try to confirm this
by running the regression tests serially instead of in parallel (use
the installcheck option); or you could modify the parallel_schedule
file to break apart the more highly parallel test sets into smaller
groups.  If the tests pass that way then the problem is triggered by
load, not by any specific test.

Not sure about the SSL complaint, but I suspect it's the same problem at
bottom.  You should look in the postmaster log file that's generated by
the make check run, and see if you can find what gets logged by the
postmaster when one of those failures is seen on the client side.

            regards, tom lane
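
The suggestion above to break the more highly parallel test sets in parallel_schedule into smaller groups could be scripted roughly as follows. This is a minimal sketch, not part of the original message. It assumes each parallel group is a single line of the form "test: name1 name2 ..." and that every other line (comments, "ignore:" lines) should pass through unchanged; the function name and the max_parallel parameter are hypothetical.

```python
def split_schedule(lines, max_parallel=5):
    """Cap every 'test:' group at max_parallel tests, keeping other lines as-is."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    out = []
    for line in lines:
        stripped = line.strip()
        tests = stripped[len("test:"):].split() if stripped.startswith("test:") else []
        if not tests:
            # Comment, blank, "ignore:" or empty "test:" line: keep unchanged.
            out.append(line.rstrip("\n"))
            continue
        # Emit the group in chunks of at most max_parallel tests, order preserved.
        for start in range(0, len(tests), max_parallel):
            out.append("test: " + " ".join(tests[start:start + max_parallel]))
    return out


# Hypothetical schedule excerpt, for illustration only.
sample = [
    "# first parallel group",
    "test: boolean char name varchar text int2 int4",
    "ignore: random",
    "test: select_into select_distinct",
]

for line in split_schedule(sample, max_parallel=3):
    print(line)
```

Setting max_parallel=1 would give a fully serial ordering, which is the extreme case of the same experiment: if the failures disappear as the groups shrink, load is the likely trigger.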

Re: inconsistent regression test results...

From: Tom Lane
I said:
> The fork-failed messages suggest very strongly that you are running out
> of kernel resources when you get more than a dozen or so server
> processes going.  Perhaps you are too low on swap space, or need to
> enlarge the kernel's file table size.

It's also possible that you are hitting a kernel limit on number of
processes for a single user ID.  See the TIP near the bottom of
http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/regress-run.html

            regards, tom lane
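
A quick way to see whether a per-user process limit might be in play is to read the limit and compare it with what the parallel run needs. This is a minimal sketch, not part of the original message. It assumes a platform where Python's resource module exposes RLIMIT_NPROC (the BSDs and Linux do; not every Unix does). The function names and the counts in the example are hypothetical placeholders, not measured values.

```python
import resource


def nproc_limit():
    """Return the soft per-user process limit, or None if it is unlimited."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return None if soft == resource.RLIM_INFINITY else soft


def has_headroom(limit, running, needed):
    """True if `needed` more processes fit under `limit` (None means unlimited)."""
    return limit is None or running + needed <= limit


limit = nproc_limit()
# Placeholder counts: processes already running as this user, and the extra
# processes a parallel test run is assumed to start at peak.
print("soft limit:", limit)
print("enough headroom:", has_headroom(limit, running=20, needed=60))
```

If the check reports no headroom, the options are to raise the limit for the account that runs the tests or to reduce the parallelism of the test schedule.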

Re: inconsistent regression test results...

From: Vikram Kulkarni
On Sat, Aug 03, 2002 at 01:57:05PM -0400, Tom Lane wrote:
> Tom Lane wrote:
> >
> > The fork-failed messages suggest very strongly that you are running
> > out of kernel resources when you get more than a dozen or so server
> > processes going.  Perhaps you are too low on swap space, or need to
> > enlarge the kernel's file table size.
>
> It's also possible that you are hitting a kernel limit on number of
> processes for a single user ID.  See the TIP near the bottom of
> http://www.ca.postgresql.org/users-lounge/docs/7.2/postgres/regress-run.html

Doh. That was it. I copied serial_schedule over parallel_schedule, and
then all of the tests passed. Now I feel silly for not noticing that note...

Thanks a lot.

-Vik

--
Vikram Vinayak Kulkarni  you can take the poster out of .test
vkulkarn@uiuc.edu        but you can't take .test out of the
vkulkarn@brownforces.org poster.
                                                  -Jason Zych