Thread: Strange pgsql crash on MacOSX

Strange pgsql crash on MacOSX

From
Shane Ambler
Date:
I have a dual G4 1.25 GHz with 2 GB RAM running Mac OS X 10.4.8 and
PostgreSQL 8.2.0.

This first happened to me today, and now it happens every time no matter
what I try. It had been running fine before.

The only thing I can think of that has changed in the last few days is
that I installed the last 2 security updates from Apple and the X11 update
(X11 1.1.3) that Apple released a while ago:

http://www.apple.com/support/downloads/securityupdate2006008ppc.html
http://www.apple.com/support/downloads/securityupdate20060071048clientppc.html

I can't see the first one having anything to do with postgres, as I
believe it only updates Java. The other one updates a few different areas
and may be the culprit.

I can't think of anything else I have changed just recently - certainly 
not in the last couple of days.

To track down the cause I restarted my machine, unzipped the released
8.2.0 source, and ran the following steps. This example uses clean data
files and all defaults. The startup script has been there a while, and
using pg_ctl instead makes no difference. make check passes all tests.

./configure --prefix=/usr/local/pgsql
make check
sudo make install
cd /usr/local/pgsql
sudo mkdir data
sudo chown pgsql:pgsql data
sudo chmod 700 data
sudo -u pgsql /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
sudo /Library/StartupItems/PostgreSQL/PostgreSQL start

Then I get the following -
[devbox:~] shane% psql
Welcome to psql 8.2.0, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
       \h for help with SQL commands
       \? for help with psql commands
       \g or terminate with semicolon to execute query
       \q to quit

postgres=# \q
psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
for freed object - object was probably modified after being freed, break
at szone_error to debug
psql(24931) malloc: *** set a breakpoint in szone_error to debug
Segmentation fault
[devbox:~] shane%

The serverlog gives me -
[devbox:local/pgsql/data] root# cat serverlog
LOG:  database system was shut down at 2006-12-23 12:27:44 CST
LOG:  checkpoint record is at 0/42BEB8
LOG:  redo record is at 0/42BEB8; undo record is at 0/0; shutdown TRUE
LOG:  next transaction ID: 0/593; next OID: 10820
LOG:  next MultiXactId: 1; next MultiXactOffset: 0
LOG:  database system is ready


Apple's crashreporter gives me -

Date/Time:      2006-12-23 12:28:21.499 +1030
OS Version:     10.4.8 (Build 8L127)
Report Version: 4

Command: psql
Path:    /usr/local/pgsql/bin/psql
Parent:  tcsh [294]

Version: ??? (???)

PID:    24931
Thread: 0

Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_INVALID_ADDRESS (0x0001) at 0x3430616b

Thread 0 Crashed:
0   libSystem.B.dylib     0x90006cd8 szone_free + 3148
1   libSystem.B.dylib     0x900152d0 fclose + 176
2   libedit.2.dylib       0x96b5c334 history_end + 1632
3   libedit.2.dylib       0x96b5c7bc history + 468
4   libedit.2.dylib       0x96b5ec58 write_history + 84
5   psql                  0x00008350 saveHistory + 208
6   psql                  0x00008428 finishInput + 120
7   libSystem.B.dylib     0x90014578 __cxa_finalize + 260
8   libSystem.B.dylib     0x90014444 exit + 36
9   psql                  0x00001d00 _start + 764
10  psql                  0x00001a00 start + 48

Thread 0 crashed with PPC Thread State 64:
  srr0: 0x0000000090006cd8  srr1: 0x000000000000d030  vrsave: 0x0000000000000000
    cr: 0x42002444           xer: 0x0000000020000001   lr: 0x0000000090006ca4  ctr: 0x00000000900143a0
    r0: 0x0000000090006ca4    r1: 0x00000000bffff610   r2: 0x0000000042002442   r3: 0x000000000000000d
    r4: 0x0000000000000000    r5: 0x000000000000000d   r6: 0x0000000080808080   r7: 0x0000000000000003
    r8: 0x0000000039333100    r9: 0x00000000bffff545  r10: 0x0000000000000000  r11: 0x0000000042002442
   r12: 0x00000000900143a0   r13: 0x0000000000000000  r14: 0x0000000000000000  r15: 0x0000000000000000
   r16: 0x0000000000000000   r17: 0x0000000000000052  r18: 0x0000000000000400  r19: 0x0000000000000054
   r20: 0x00000000020000a4   r21: 0x000000000180a800  r22: 0x00000000a0001fac  r23: 0x00000000020000a8
   r24: 0x0000000000000002   r25: 0x0000000000000002  r26: 0x0000000000000001  r27: 0x0000000034306167
   r28: 0x0000000001800000   r29: 0x000000000180a400  r30: 0x000000002e616767  r31: 0x00000000900060a0

Binary Images Description:
    0x1000 -    0x36fff psql     /usr/local/pgsql/bin/psql
   0x3f000 -    0x54fff libpq.5.dylib    /usr/local/pgsql/lib/libpq.5.dylib
0x8fe00000 - 0x8fe51fff dyld 45.3    /usr/lib/dyld
0x90000000 - 0x901bcfff libSystem.B.dylib     /usr/lib/libSystem.B.dylib
0x90214000 - 0x90219fff libmathCommon.A.dylib     /usr/lib/system/libmathCommon.A.dylib
0x9110f000 - 0x9111dfff libz.1.dylib     /usr/lib/libz.1.dylib
0x969c3000 - 0x969f1fff libncurses.5.4.dylib     /usr/lib/libncurses.5.4.dylib
0x96b4d000 - 0x96b63fff libedit.2.dylib     /usr/lib/libedit.2.dylib

Model: PowerMac3,6, BootROM 4.4.8f2, 2 processors, PowerPC G4  (3.2), 1.25 GHz, 2 GB
Graphics: NVIDIA GeForce4 MX, GeForce4 MX, AGP, 32 MB
Memory Module: DIMM0/J21, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM1/J22, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM2/J23, 512 MB, DDR SDRAM, PC2600U-25330
Memory Module: DIMM3/J20, 512 MB, DDR SDRAM, PC2600U-25330
AirPort: AirPort, 9.52
Network Service: Built-in Ethernet, Ethernet, en0
PCI Card: pci-bridge, pci, SLOT-3
PCI Card: firewire, ieee1394, 1x0
PCI Card: usb, usb, 1x1
PCI Card: usb, usb, 1x1
PCI Card: pci167e,225a, , 1x1
Parallel ATA Device: LITE-ON DVD SOHD-167T,
Parallel ATA Device: WDC WD1200JB-00FUA0, 111.79 GB
Parallel ATA Device: IBM-IC35L120AVVA07-0, 115.04 GB
USB Device: Apple Pro Keyboard, Mitsumi Electric, Up to 1.5 Mb/sec, 500 mA
USB Device: i350, Canon, Up to 12 Mb/sec, 500 mA
FireWire Device: unknown_device, unknown_value, Up to 400 Mb/sec



-- 

Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz


Re: Strange pgsql crash on MacOSX

From
Tom Lane
Date:
Shane Ambler <pgsql@007Marketing.com> writes:
> postgres=# \q
> psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
> for freed object - object was probably modified after being freed, break
> at szone_error to debug
> psql(24931) malloc: *** set a breakpoint in szone_error to debug
> Segmentation fault

I think we've seen something like this before in connection with
readline/libedit follies.  Does the crash go away if you invoke
psql with "-n" option?  If so, exactly which version of readline or
libedit are you using?

FWIW, I do not see this on a fully up-to-date 10.4.8 G4 laptop.
I see

$ ls -l /usr/lib/libedit*
-rwxr-xr-x   1 root  wheel  112404 Sep 29 20:59 /usr/lib/libedit.2.dylib
lrwxr-xr-x   1 root  wheel      15 Apr 26  2006 /usr/lib/libedit.dylib -> libedit.2.dylib
$

so it seems that Apple did update libedit not too long ago ...
        regards, tom lane


Re: Strange pgsql crash on MacOSX

From
Shane Ambler
Date:
Shane Ambler wrote:
> Tom Lane wrote:
>> Shane Ambler <pgsql@007Marketing.com> writes:
>>> postgres=# \q
>>> psql(24931) malloc: *** error for object 0x180a800: incorrect checksum
>>> for freed object - object was probably modified after being freed, break
>>> at szone_error to debug
>>> psql(24931) malloc: *** set a breakpoint in szone_error to debug
>>> Segmentation fault
>>
>> I think we've seen something like this before in connection with
>> readline/libedit follies.  Does the crash go away if you invoke
>> psql with "-n" option?  If so, exactly which version of readline or
>> libedit are you using?


> 
> psql -n stops the error.
> 

I just found out the problem.

psql_history - I had tried to copy from a text file earlier that was 
utf8 and came up with some errors, I guess these got into the history 
file and stuffed it up.

Renamed it so it created a new one and all is fine now.

-- 

Shane Ambler
pgSQL@007Marketing.com

Get Sheeky @ http://Sheeky.Biz


Re: Strange pgsql crash on MacOSX

From
Tom Lane
Date:
Shane Ambler <pgsql@007Marketing.com> writes:
> I just found out the problem.
> psql_history - I had tried to copy from a text file earlier that was 
> utf8 and came up with some errors, I guess these got into the history 
> file and stuffed it up.

Hm, so the question is: is it our bug or Apple's?  If you kept the
busted history file, would you be willing to send me a copy?
        regards, tom lane


Re: Strange pgsql crash on MacOSX

From
Tom Lane
Date:
Shane Ambler <pgsql@007Marketing.com> writes:
> Tom Lane wrote:
>> Hm, so the question is: is it our bug or Apple's?  If you kept the
>> busted history file, would you be willing to send me a copy?

> The zip file attached has the psql_history file that crashes when 
> quitting but doesn't appear to contain the steps I did when it first 
> crashed.

So the answer is: it's Apple's bug, or at least not ours.  libedit
contains a typo that causes it to potentially fail when saving strings
exceeding 256 bytes.  Check out this code (around line 730 in history.c):
    len = strlen(ev.str) * 4;
    if (len >= max_size) {
        char *nptr;
        max_size = (len + 1023) & 1023;
        nptr = h_realloc(ptr, max_size);

I think the intent of the max_size recalculation is to select the next
1K boundary larger than "len", but it actually produces a number *less*
than 1K.  Probably "(len + 1023) & ~1023" was meant ... but even that
is wrong if len is exactly a multiple of 1024, because it will fail to
round up.  So the buffer is realloc'd too small, and that results in
a potential memory clobber if the history entry is less than 1K, and a
guaranteed clobber if it's more.

The source code available from Apple shows that they got this code from
NetBSD originally

/*    $NetBSD: history.c,v 1.25 2003/10/18 23:48:42 christos Exp $    */

so this may well be a pretty generic *BSD bug.  Anyone clear on who to
report it to?  I have no idea if libedit is an independent project...
        regards, tom lane


Re: Strange pgsql crash on MacOSX

From
Tom Lane
Date:
I wrote:
> The source code available from Apple shows that they got this code from
> NetBSD originally
> /*    $NetBSD: history.c,v 1.25 2003/10/18 23:48:42 christos Exp $    */
> so this may well be a pretty generic *BSD bug.  Anyone clear on who to
> report it to?  I have no idea if libedit is an independent project...

Some digging in the NetBSD CVS shows that they found both parts of this
bug more than two years ago:

http://cvsweb.netbsd.org/bsdweb.cgi/src/lib/libedit/history.c.diff?r1=1.25&r2=1.27&f=h

so the short and sweet answer is that Apple is behind the times.
        regards, tom lane