From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4

Description of problem:
We are running a long-duration, high-load test of our server program on Red Hat AS 4.0. After about 20 hours of running, the server program went down and the following messages were written to /var/log/messages.

Dec 27 21:00:52 ruixj kernel: Unable to handle kernel NULL pointer dereference at virtual address 000000b0
Dec 27 21:00:52 ruixj kernel: printing eip:
Dec 27 21:00:52 ruixj kernel: e00326e0
Dec 27 21:00:52 ruixj kernel: *pde = 04e84067
Dec 27 21:00:52 ruixj kernel: Oops: 0000 [#1]
Dec 27 21:00:52 ruixj kernel: Modules linked in: i915 parport_pc lp parport autofs4 sunrpc dm_mod button battery ac md5 ipv6 uhci_hcd ehci_hcd snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore e100 mii floppy ext3 jbd
Dec 27 21:00:52 ruixj kernel: CPU: 0
Dec 27 21:00:52 ruixj kernel: EIP: 0060:[<e00326e0>] Not tainted VLI
Dec 27 21:00:52 ruixj kernel: EFLAGS: 00010202 (2.6.9-5.EL)
Dec 27 21:00:52 ruixj kernel: EIP is at journal_start+0x21/0x9e [jbd]
Dec 27 21:00:52 ruixj kernel: eax: ffffffe2 ebx: 000000b0 ecx: df57a400 edx: 0000005d
Dec 27 21:00:52 ruixj kernel: esi: df762800 edi: c5ed5000 ebp: 000081b6 esp: c5ed5edc
Dec 27 21:00:52 ruixj kernel: ds: 007b es: 007b ss: 0068
Dec 27 21:00:52 ruixj kernel: Process edldapd (pid: 17944, threadinfo=c5ed5000 task=c842c700)
Dec 27 21:00:52 ruixj kernel: Stack: e011d120 ceb101a4 ceb101a4 e01098e9 ddb85ac8 00000000 e011d120 ceb101a4
Dec 27 21:00:52 ruixj kernel: ceb101a4 000081b6 c0172887 c5ed5f58 ddb85ac8 ddb85ac8 ceb101a4 ceb82e10
Dec 27 21:00:52 ruixj kernel: c5ed5f58 c0172c58 c5ed5f58 00000000 00000000 00000006 000001b6 00008243
Dec 27 21:00:52 ruixj kernel: Call Trace:
Dec 27 21:00:52 ruixj kernel: [<e01098e9>] ext3_create+0x25/0xb3 [ext3]
Dec 27 21:00:52 ruixj kernel: [<c0172887>] vfs_create+0xb8/0xef
Dec 27 21:00:52 ruixj kernel: [<c0172c58>] open_namei+0x181/0x57e
Dec 27 21:00:52 ruixj kernel: [<c0161412>] filp_open+0x23/0x3c
Dec 27 21:00:52 ruixj kernel: [<c03003b2>] __cond_resched+0x14/0x3b
Dec 27 21:00:52 ruixj kernel: [<c01d8e46>] direct_strncpy_from_user+0x3e/0x5d
Dec 27 21:00:52 ruixj kernel: [<c01618e9>] sys_open+0x31/0x7d
Dec 27 21:00:52 ruixj kernel: [<c0301bfb>] syscall_call+0x7/0xb
Dec 27 21:00:52 ruixj kernel: Code: 42 10 89 42 14 5b 89 f8 5f c3 57 bf 00 f0 ff ff 56 89 c6 53 21 e7 8b 07 85 f6 8b 98 a8 05 00 00 b8 e2 ff ff ff 74 7d 85 db 74 34 <8b> 03 39 30 74 29 68 d0 ce 03 e0 68 12 01 00 00 68 ae cd 03 e0

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux AS 4
2.6.9-5.EL #1

How reproducible:
Didn't try

Steps to Reproduce:
1. Run a long-duration, high-load test of our server program.
2. After about 20 hours of running, the server program went down and the above messages were written to /var/log/messages.
3.

Actual Results:
Our server program went down and the following messages were written to /var/log/messages.
(Same kernel oops messages as quoted in the description above.)

Expected Results:
there is no error.

Additional info:
000006bf <journal_start>:
 6bf: 57                   push %edi
 6c0: bf 00 f0 ff ff       mov $0xfffff000,%edi
 6c5: 56                   push %esi
 6c6: 89 c6                mov %eax,%esi
 6c8: 53                   push %ebx
 6c9: 21 e7                and %esp,%edi
 6cb: 8b 07                mov (%edi),%eax
 6cd: 85 f6                test %esi,%esi                 if (!journal)
 6cf: 8b 98 a8 05 00 00    mov 0x5a8(%eax),%ebx           journal_current_handle() ??
 6d5: b8 e2 ff ff ff       mov $0xffffffe2,%eax           -EROFS (-30)
 6da: 74 7d                je 759 <journal_start+0x9a>    if no journal return -EROFS
 6dc: 85 db                test %ebx,%ebx                 if (handle) BUT handle/%ebx is 0xb0?!
 6de: 74 34                je 714 <journal_start+0x55>    if no handle jump to new_handle
 6e0: 8b 03                mov (%ebx),%eax                <-- died here (try to use %ebx/handle)

Unable to handle kernel NULL pointer dereference at virtual address 000000b0
kernel: eax: ffffffe2 ebx: 000000b0 ecx: df57a400 edx: 0000005d
kernel: esi: df762800 edi: c5ed5000 ebp: 000081b6 esp: c5ed5edc
kernel: ds: 007b es: 007b ss: 0068

It looks like current->journal_info is corrupt, which could be due to any number of reasons, all impossible to tell from the info here, I'm afraid.

Has this been seen since?
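The arithmetic behind the analysis above can be cross-checked with a short script. This is only a sketch, not something from the original report: every constant is copied from the oops or the disassembly, while the variable names and the `to_signed32` helper are made up for illustration. It checks that masking %esp with 0xfffff000 gives the printed threadinfo address, that %eax holds -30 (the -EROFS value noted in the disassembly), that journal_start+0x21 is the `mov (%ebx),%eax` at 6e0, and that the byte marked in the Code: line sits 0x21 bytes after the `push %edi` that opens the function.

```python
# Register values copied from the oops.
ESP = 0xC5ED5EDC
EDI = 0xC5ED5000            # also printed as threadinfo=c5ed5000
EAX = 0xFFFFFFE2
EBX = 0x000000B0
FAULT_ADDR = 0x000000B0     # "virtual address 000000b0"

# Offsets copied from the disassembly and from "journal_start+0x21/0x9e".
FUNC_START = 0x6BF
EIP_OFFSET = 0x21
STACK_MASK = 0xFFFFF000     # from "mov $0xfffff000,%edi"

CODE_LINE = (
    "42 10 89 42 14 5b 89 f8 5f c3 57 bf 00 f0 ff ff 56 89 c6 53 21 e7 "
    "8b 07 85 f6 8b 98 a8 05 00 00 b8 e2 ff ff ff 74 7d 85 db 74 34 "
    "<8b> 03 39 30 74 29 68 d0 ce 03 e0 68 12 01 00 00 68 ae cd 03 e0"
)


def to_signed32(value):
    """Interpret a 32-bit register value as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


# "and %esp,%edi" with the mask above yields the thread_info base.
thread_info = ESP & STACK_MASK

# Address of the faulting instruction within the module disassembly.
fault_insn = FUNC_START + EIP_OFFSET

# The oops marks the byte at EIP with <..>; step back by the EIP offset
# to find the first byte of journal_start in the same dump.
tokens = CODE_LINE.split()
eip_index = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
func_index = eip_index - EIP_OFFSET

if __name__ == "__main__":
    print("thread_info matches edi:", thread_info == EDI)
    print("eax as signed int:", to_signed32(EAX))
    print("faulting instruction at: %#x" % fault_insn)
    print("ebx equals fault address:", EBX == FAULT_ADDR)
    print("first byte of journal_start in dump:", tokens[func_index])
```

None of this identifies what overwrote the handle pointer. It only confirms that the disassembly lines up with the registers in the oops, and that the address the kernel failed to read is the value held in %ebx.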
Please reopen if you still have this issue.