Discussion: 7.3.5 initdb failure on Irix 6.5.18
I'm trying to use 7.3.5 (for an upgrade of 7.3.2) on Irix 6.5.18 using the
MIPSpro 7.4.1 compiler. Everything compiles up ok, but 'make check' fails
at the "enabling unlimited row size for system tables..." step with
a core dump of postgres.
The failure is at /backend/access/transam/xlog.c:2544 with an
"unable to locate a valid checkpoint record" panic. This happens
for both 7.3.4 and 7.3.5, either with -O or -g as the CFLAGS value.
Manually running the command being used by initdb:
tmp_check/install/stmgr/pgsql-7.3.5/bin/postgres -F \
-D/stmgr/src/postgresql-7.3.5/src/test/regress/data -O \
-c search_path=pg_catalog template1
gives:
LOG: database system was shut down at 2004-01-15 11:20:44 MST
LOG: ReadRecord: invalid magic number 0000 in log file 0, segment 0, offset 32768
LOG: invalid primary checkpoint record
LOG: ReadRecord: record with zero length at 0/50
LOG: invalid secondary checkpoint record
PANIC: unable to locate a valid checkpoint record
Interestingly, using a copy of an existing database created by the 7.3.2
installation on the same system works fine.
Has anyone fixed this yet? If not, does anyone have hints that I can
pursue since I have the source compiled up with debugging enabled?
--
Craig Ruff NCAR cruff@ucar.edu
(303) 497-1211 P.O. Box 3000
Boulder, CO 80307
Craig Ruff <cruff@ucar.edu> writes:
> I'm trying to use 7.3.5 (for an upgrade of 7.3.2) on Irix 6.5.18 using the
> MIPSpro 7.4.1 compiler. Everything compiles up ok, but 'make check' fails
> at the "enabling unlimited row size for system tables..." step with
> a core dump of postgres.
Hmm, hard to see what could have broken between 7.3.2 and 7.3.4.
> Has anyone fixed this yet?
Nope, first we've heard of it.
> If not, does anyone have hints that I can pursue since I have the
> source compiled up with debugging enabled?
It would seem that the culprit must be somewhere in the 7.3.2-to-7.3.4
changes in xlog.c:
http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/backend/access/transam/xlog.c.diff?r1=1.109&r2=1.109.2.3
but I sure don't see anything there that looks like a potential
portability issue.
regards, tom lane
On Thu, Jan 15, 2004 at 04:42:50PM -0500, Tom Lane wrote:
> It would seem that the culprit must be somewhere in the 7.3.2-to-7.3.4
> changes in xlog.c:
> ...
> but I sure don't see anything there that looks like a potential
> portability issue.
I have some further info. 7.3.5 compiled with MIPSpro 7.4.1 is broken
with respect to the transaction log files. Restarting my 7.3.5 install
results in similar errors.
However, when compiled with gcc, 7.3.5 initdb works correctly. I'm in the
process of testing the import of the 7.3.2 database and running some
transactions to see if the restart works.
Also, PostgreSQL 7.4.1 compiled with MIPSpro 7.4.1 appears to work (at
least the regression test).
Ok, I have further information on this problem. I believe it is a
compiler problem. PostgreSQL version 7.3.3 is also affected when compiled
with the MIPSpro 7.4.1 compiler, but when compiled with MIPSpro 7.4 it is ok.
Using the gcc compiled version of backend/access/transam/xlog.c, I have
gotten the regression test to work. Next week I'll have to further nail
it down so I can send a bug report to SGI. Just replacing XLogFlush with
the gcc compiled version allows initdb to finish, but the regression tests
show there are other problems.
So, a note should probably be made in the documentation that for the
moment, MIPSpro 7.4.1 should probably be avoided.
Craig Ruff <cruff@ucar.edu> writes:
> So, a note should probably be made in the documentation that for the
> moment, MIPSpro 7.4.1 should probably be avoided.
Appreciate the followup. Let us know if it emerges that the PG code is
doing something unportable. (It could be that the compiler is doing
something that's legal per the ANSI C standard but breaks our code.)
regards, tom lane
Here is what I discovered about this problem.
The MIPSpro 7.4.1 C compiler apparently has a structure assignment code
generation bug that is triggered at backend/access/transam/xlog.c:2683:
    LogwrtResult.Write = LogwrtResult.Flush = EndOfLog;
EndOfLog and LogwrtResult.Write are correct, but LogwrtResult.Flush ends
up corrupted.
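For anyone who wants to check their own compiler, the pattern can be isolated roughly as follows. This is only a sketch: the two struct definitions are simplified stand-ins for the xlog types (their layout here is an assumption, not copied from the source tree), and same_ptr, set_chained, set_split and check_struct_assignment are hypothetical helper names. Whether the split form actually avoids the MIPSpro 7.4.1 problem has not been tested; it is shown only as the obvious thing to try.

```c
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-ins for the xlog types (assumed layout, not copied
 * from the PostgreSQL source tree). */
typedef struct
{
    uint32_t xlogid;    /* log file number */
    uint32_t xrecoff;   /* byte offset within the log file */
} XLogRecPtr;

typedef struct
{
    XLogRecPtr Write;   /* end of data written out */
    XLogRecPtr Flush;   /* end of data flushed to disk */
} XLogwrtResult;

/* Field-by-field comparison; C has no == for structs. */
static int same_ptr(XLogRecPtr a, XLogRecPtr b)
{
    return a.xlogid == b.xlogid && a.xrecoff == b.xrecoff;
}

/* Chained structure assignment, the same shape as the xlog.c statement. */
static XLogwrtResult set_chained(XLogRecPtr end_of_log)
{
    XLogwrtResult r;

    r.Write = r.Flush = end_of_log;
    return r;
}

/* The same effect written as two separate assignments (hypothetical
 * workaround, untested against MIPSpro 7.4.1). */
static XLogwrtResult set_split(XLogRecPtr end_of_log)
{
    XLogwrtResult r;

    r.Flush = end_of_log;
    r.Write = end_of_log;
    return r;
}

/* Returns 0 if both forms leave Write and Flush equal to the source
 * value, 1 otherwise. */
static int check_struct_assignment(void)
{
    XLogRecPtr end_of_log = {0, 32768};     /* arbitrary test value */
    XLogwrtResult chained = set_chained(end_of_log);
    XLogwrtResult split = set_split(end_of_log);
    int chained_ok = same_ptr(chained.Write, end_of_log) &&
                     same_ptr(chained.Flush, end_of_log);
    int split_ok = same_ptr(split.Write, end_of_log) &&
                   same_ptr(split.Flush, end_of_log);

    printf("chained: %s, split: %s\n",
           chained_ok ? "ok" : "CORRUPTED",
           split_ok ? "ok" : "CORRUPTED");
    return (chained_ok && split_ok) ? 0 : 1;
}
```

Calling check_struct_assignment() from a small main() and building it with the same CFLAGS as the failing PostgreSQL build should report both forms as ok on a correct compiler. A reduced case like this may not trigger the bug even where the full xlog.c does, so a passing result does not prove the compiler is safe.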
I've opened a problem report with SGI (case ID 2505985 "MIPSpro 7.4.1 C
structure assignment bug") for those of you who need to track it. From
what I can see, PostgreSQL 7.3.x is vulnerable. PostgreSQL 7.4.1 seems
to pass its regression test, but I'd probably think twice about using
it when compiled with MIPSpro 7.4.1.
Everything seems ok when compiled with the SGI-provided version of GCC 3.2.2.