Re: 7.0.2 crash (maybe linux kernel bug??)

From: Alfred Perlstein
Subject: Re: 7.0.2 crash (maybe linux kernel bug??)
Msg-id: 20001031115937.Z22110@fw.wintelcom.net
In response to: 7.0.2 crash (maybe linux kernel bug??) (Michael J Schout <mschout@gkg.net>)
List: pgsql-hackers
* Michael J Schout <mschout@gkg.net> [001031 11:22] wrote:
> Hi.
> 
> I've had a crash in PostgreSQL 7.0.2.  Looking at what happened, I actually
> suspect that this is a filesystem bug, and not necessarily a PostgreSQL bug,
> but I wanted to report it here and see if anyone else had any opinions.
> 
> The platform this happened on was Linux (Red Hat 6.2), kernel 2.2.16 (SMP), dual
> Pentium III 500MHz CPUs, Mylex DAC960 RAID controller running in RAID 5 mode.
> 
> During regular activity, I got a kernel oops.  Looking at the call trace from
> the kernel, as well as the EIP, I think maybe there is a bug here in the fs
> buffer code, and that this is a Linux kernel problem (not a PostgreSQL
> problem).
> 
> But I'm no expert here. Does this sound correct, looking at the kernel errors
> below?
> 
> Sorry if this is off topic.  I just want to make sure this is a kernel bug and
> not a PostgreSQL bug.
> 
> Mike
> 
> The oopses:
> 
> kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000134 
> kernel: current->tss.cr3 = 1a325000, %%cr3 = 1a325000 
> kernel: *pde = 00000000 
> kernel: Oops: 0002 
> kernel: CPU:    0 
> kernel: EIP:    0010:[remove_from_queues+169/328] 
> kernel: EFLAGS: 00010206 
> kernel: eax: 00000100   ebx: 00000002   ecx: df022e40   edx: efba76b8 
> kernel: esi: df022e40   edi: 00000000   ebp: 00000000   esp: da327ea4 
> kernel: ds: 0018   es: 0018   ss: 0018 
> kernel: Process postmaster (pid: 11527, process nr: 51, stackpage=da327000) 
> kernel: Stack: df022e40 c012be79 df022e40 df022e40 00001000 c0142cb8 c0142cc7 df022e40  
> kernel:        ec247140 ffffffea ec0b026c da326000 df022e40 df022e40 df022e40 000a4000  
> kernel:        00000000 da327f08 00000000 00000000 eff29200 00001000 000000a5 000a5000  
> kernel: Call Trace: [refile_buffer+77/184] [ext2_file_write+996/1584] [ext2_file_write+1011/1584] [kfree_skbmem+51/64] [__kfree_skb+162/168] [lockd:__insmod_lockd_O/lib/modules/2.2.16-3smp/fs/lockd.o_M394EA7+-76392/76] [handle_IRQ_event+90/140]
> kernel:        [sys_write+240/292] [ext2_file_write+0/1584] [system_call+52/56] [startup_32+43/164]  
> kernel: Code: 89 50 34 c7 01 00 00 00 00 89 02 c7 41 34 00 00 00 00 ff 0d  
> kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000100 

Yes, your kernel basically segfaulted. I would get a traceback from your
crash dump and discuss it with the kernel developers.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."


> kernel: current->tss.cr3 = 1ba46000, %%cr3 = 1ba46000 
> kernel: *pde = 00000000 
> kernel: Oops: 0000 
> kernel: CPU:    1 
> kernel: EIP:    0010:[find_buffer+104/144] 
> kernel: EFLAGS: 00010206 
> kernel: eax: 00000100   ebx: 00000007   ecx: 00069dae   edx: 00000100 
> kernel: esi: 0000000d   edi: 00003006   ebp: 0005ce4b   esp: e53a19f4 
> kernel: ds: 0018   es: 0018   ss: 0018 
> kernel: Process postmaster (pid: 5545, process nr: 37, stackpage=e53a1000) 
> kernel: Stack: 0005ce4b 00003006 00069dae c012b953 00003006 0005ce4b 00001000 c012bcc6  
> kernel:        00003006 0005ce4b 00001000 00003006 eff29200 00003006 00004e4b ef18c960  
> kernel:        c0141ee7 00003006 0005ce4b 00001000 0005ce4b e53a1bb0 edc3c660 edc3c660  
> kernel: Call Trace: [get_hash_table+23/36] [getblk+30/324] [ext2_new_block+2291/2756] [getblk+271/324] [ext2_alloc_block+344/356] [block_getblk+305/624] [ext2_getblk+256/524]
> kernel:        [ext2_file_write+1308/1584] [__brelse+19/84] [permission+36/248] [dump_seek+53/104] [dump_seek+53/104] [dump_write+48/84] [elf_core_dump+3104/3216] [do_IRQ+82/92]
> kernel:        [tcp_write_xmit+407/472] [__release_sock+36/124] [tcp_do_sendmsg+2125/2144] [inet_sendmsg+0/144] [cprt+1553/20096] [cprt+1553/20096] [cprt+1553/20096] [do_signal+458/724]
> kernel:        [force_sig_info+168/180] [force_sig+17/24] [do_general_protection+54/160] [error_code+45/52] [signal_return+20/24]
> kernel: Code: 8b 00 39 6a 04 75 15 8b 4c 24 20 39 4a 08 75 0c 66 39 7a 0c  
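
For anyone reading the two dumps above, the "Oops:" value is the i386 page-fault error code, and its low three bits say what kind of access faulted. The following is only a rough sketch of how those bits could be decoded; the helper name decode_oops_code is made up for illustration and is not part of any kernel or PostgreSQL tool.

```python
def decode_oops_code(code_hex: str) -> dict:
    """Decode the low three bits of an i386 page-fault error code.

    Hypothetical helper for illustration only.
    bit 0: 0 = page not present, 1 = protection violation
    bit 1: 0 = read access,      1 = write access
    bit 2: 0 = kernel mode,      1 = user mode
    """
    code = int(code_hex, 16)
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


if __name__ == "__main__":
    # The two values reported in the oopses above.
    for oops in ("0002", "0000"):
        print(oops, decode_oops_code(oops))
```

Read this way, 0002 is a kernel-mode write to a page that was not present and 0000 is a kernel-mode read of a page that was not present, which fits the "NULL pointer dereference" messages at the small addresses 00000134 and 00000100.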

