Chris Browne <cbbrowne@acm.org> writes:
> tgl@sss.pgh.pa.us (Tom Lane) writes:
>> (My guess is that the problem is a compiler or libc bug anyway,
>> given that one report says that replacing a memcpy call with an
>> equivalent loop makes the failure go away.)
> It seems unlikely to be a compiler bug as the same issue has been
> reported with both GCC and IBM XLC. I could believe it being a libc
> bug...
As best I can tell after poking at it on Stefan's machine, it's a linker
bug, or else there is something strange about memcpy as compared to,
say, memcmp. A function pointer to memcmp works, a function pointer to
memcpy contains a bogus value that points entirely outside the program's
address space. This despite the assembly code that generates them
looking just the same in both cases, viz
LC..12:
	.tc memcmp[TC],memcmp[DS]
LC..14:
	.tc memcpy[TC],memcpy[DS]
Even more interesting, if you start the postmaster under gdb and examine
the pointer, then set a breakpoint at "main" and say "run", by the time
control arrives at main() the bogus value has changed to a different
bogus value. So something in the basic C runtime support is frobbing it
--- incorrectly :-(. I think all the signs point to incorrect
relocation data generated by the linker, though I have no idea why only
memcpy would be affected.
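
A minimal sketch of a standalone check for this symptom, for anyone who
wants to reproduce it outside the backend. This is an illustration only,
not code from the PostgreSQL tree: the typedefs, variable names, and
helper functions are all hypothetical. It assumes the problem shows up
for statically initialized function pointers, since those are the ones
that depend on load-time relocation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical typedefs for illustration; not from the PostgreSQL source. */
typedef void *(*copy_fn)(void *, const void *, size_t);
typedef int (*cmp_fn)(const void *, const void *, size_t);

/*
 * Statically initialized pointers, so their values come from relocation
 * at load time.  volatile keeps the compiler from turning the indirect
 * calls below back into direct ones.
 */
static copy_fn volatile copy_ptr = memcpy;
static cmp_fn volatile cmp_ptr = memcmp;

/* Print the raw bytes of a pointer object, in memory order. */
static void dump_pointer(const char *label, const void *ptr_obj, size_t len)
{
    const unsigned char *bytes = ptr_obj;
    size_t i;

    printf("%s:", label);
    for (i = 0; i < len; i++)
        printf(" %02x", bytes[i]);
    printf("\n");
}

/*
 * Returns 0 if calls through both pointers behave as expected, 1 if the
 * copy produced the wrong bytes.  With a bogus memcpy pointer the call
 * would more likely crash than return.
 */
static int check_function_pointers(void)
{
    char src[] = "postgres";
    char dst[sizeof(src)] = {0};
    copy_fn copy = copy_ptr;
    cmp_fn cmp = cmp_ptr;

    dump_pointer("memcmp", &cmp, sizeof(cmp));
    dump_pointer("memcpy", &copy, sizeof(copy));

    copy(dst, src, sizeof(src));
    return cmp(dst, src, sizeof(src)) == 0 ? 0 : 1;
}
```

Calling check_function_pointers() from main() on a healthy system should
print two plausible-looking addresses and return 0. Comparing the printed
memcpy value under gdb before and after "run" would be one way to watch
for the changing bogus value described above.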
> It would be terribly disappointing to have to report both internally
> and externally that AIX 5.3 is not a usable platform for recent
> releases of PostgreSQL...
According to Stefan it broke between 5.3ML1 and 5.3ML3. I suggest
filing a defect report with IBM. We're not going to stop using memcpy
because one version of one platform is broken.
regards, tom lane