From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050910 Red Hat/1.7.10-1.1.3.2.centos3

Description of problem:
After RHEL 4 was installed on a Dell PowerEdge 2650, which uses megaraid RAID level 5 (no LVM), the system served as a MySQL database server for 2 weeks and then suddenly hit a kernel panic, logged in /var/log/messages as follows:

Sep 27 05:51:29 dbhost kernel: Unable to handle kernel NULL pointer dereference at virtual address 0000000c
Sep 27 05:51:29 dbhost kernel: printing eip:
Sep 27 05:51:29 dbhost kernel: c01b6959
Sep 27 05:51:29 dbhost kernel: *pde = 2f141001
Sep 27 05:51:29 dbhost kernel: Oops: 0000 [#1]
Sep 27 05:51:29 dbhost kernel: SMP
Sep 27 05:51:29 dbhost kernel: Modules linked in: ipt_REJECT iptable_filter ip_tables nfsd exportfs lockd sunrpc md5 ipv6 dm_mod button battery ac ohci_hcd tg3 floppy aic7xxx sg ext3 jbd megaraid_mbox megaraid_mm sd_mod scsi_mod
Sep 27 05:51:29 dbhost kernel: CPU: 2
Sep 27 05:51:29 dbhost kernel: EIP: 0060:[<c01b6959>] Not tainted VLI
Sep 27 05:51:29 dbhost kernel: EFLAGS: 00010246 (2.6.9-11.ELsmp)
Sep 27 05:51:29 dbhost kernel: EIP is at rb_insert_color+0x19/0xc1
Sep 27 05:51:29 dbhost kernel: eax: c2266208 ebx: c2266f48 ecx: 45965f46 edx: f74d1480
Sep 27 05:51:29 dbhost kernel: esi: 00000000 edi: c2266208 ebp: f74d1480 esp: cbd4be28
Sep 27 05:51:29 dbhost kernel: ds: 007b es: 007b ss: 0068
Sep 27 05:51:29 dbhost kernel: Process bash (pid: 1718, threadinfo=cbd4b000 task=f7734930)
Sep 27 05:51:29 dbhost kernel: Stack: c2266200 c2266f40 c226622a c2266f54 f889da7d f74d1480 c2266f48 f067e55e
Sep 27 05:51:29 dbhost kernel: 45965f46 cf8f90c0 d364df84 cf8f9ff8 cbd4bec0 f88a2b3c cf8f90c0 0000000a
Sep 27 05:51:29 dbhost kernel: f4518b80 00000000 00000000 f09520f8 f74d1480 f4518b80 f88a2bd5 cbd4bec0
Sep 27 05:51:29 dbhost kernel: Call Trace:
Sep 27 05:51:29 dbhost kernel: [<f889da7d>] ext3_htree_store_dirent+0x147/0x151 [ext3]
Sep 27 05:51:29 dbhost kernel: [<f88a2b3c>] htree_dirblock_to_tree+0x78/0xb6 [ext3]
Sep 27 05:51:29 dbhost kernel: [<f88a2bd5>] ext3_htree_fill_tree+0x5b/0x176 [ext3]
Sep 27 05:51:29 dbhost kernel: [<f889dc4e>] ext3_dx_readdir+0x112/0x198 [ext3]
Sep 27 05:51:29 dbhost kernel: [<c0165ee0>] filldir64+0x0/0x11a
Sep 27 05:51:29 dbhost kernel: [<f889d551>] ext3_readdir+0x8c/0x3a0 [ext3]
Sep 27 05:51:29 dbhost kernel: [<c0165ee0>] filldir64+0x0/0x11a
Sep 27 05:51:29 dbhost kernel: [<c0165ee0>] filldir64+0x0/0x11a
Sep 27 05:51:29 dbhost kernel: [<c0165c1d>] vfs_readdir+0x7d/0xa5
Sep 27 05:51:29 dbhost kernel: [<c016605f>] sys_getdents64+0x65/0x9f
Sep 27 05:51:29 dbhost kernel: [<c02c7377>] syscall_call+0x7/0xb
Sep 27 05:51:29 dbhost kernel: Code: 75 05 89 50 08 eb 07 89 50 0c eb 02 89 13 89 11 5b c3 55 89 d5 57 89 c7 56 53 e9 9b 00 00 00 83 7b 04 00 0f 85 9b 00 00 00 8b 33 <8b> 46 0c 39 c3 75 3a 8b 46 08 85 c0 74 06 83 78 04 00 74 37 39
Sep 27 05:51:29 dbhost kernel: <0>Fatal exception: panic in 5 seconds

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-11.EL

How reproducible:
Didn't try

Steps to Reproduce:
1. Reboot it
2. Wait for it to crash again
3. View the error in the log and on the console

Additional info:
The oops is coming from the rbtree code, in rb_insert_color():

    void rb_insert_color(struct rb_node *node, struct rb_root *root)
    {
            struct rb_node *parent, *gparent;

            while ((parent = node->rb_parent) && parent->rb_color == RB_RED)
            {
                    gparent = parent->rb_parent;

                    if (parent == gparent->rb_left)

and "gparent" is NULL. In a valid red-black tree the root is black, so a red parent always has a parent of its own; a NULL "gparent" therefore means the core rbtree data structure is corrupted. It's not obviously ext3's fault, as the rbtree code is entirely independent of ext3. This could be bad hardware, some other kernel code stomping on the memory, or a genuine bug; but I've not seen such a bug reported before, nor has there been any recent change in upstream kernels in this area.

Diagnosing this further will need support help; there isn't enough information here to start any sort of engineering fix.

For official Red Hat Enterprise Linux support, please log into the Red Hat support website at http://www.redhat.com/support and file a support ticket, or alternatively contact Red Hat Global Support Services at 1-888-RED-HAT1 to speak directly with a support associate and escalate an issue.
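
The NULL "gparent" reading can be cross-checked against the oops itself. The faulting instruction bytes in the Code: line, <8b> 46 0c, decode to a load from 0xc(%esi); esi is 00000000 and the reported fault address is 0000000c. The sketch below only illustrates that arithmetic. The struct rb_node32 layout is an assumption about the 2.6.9-era struct rb_node on 32-bit x86 (it is not quoted anywhere in this report), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * ASSUMPTION: mirror of the 2.6.9-era struct rb_node as laid out on
 * 32-bit x86. Pointers are modelled as uint32_t so the offsets are the
 * same on any build host. The field order is assumed, not taken from
 * the bug report.
 */
struct rb_node32 {
    uint32_t rb_parent;   /* offset 0x0 */
    uint32_t rb_color;    /* offset 0x4 */
    uint32_t rb_right;    /* offset 0x8 */
    uint32_t rb_left;     /* offset 0xc */
};

/* Hypothetical helper: address touched when reading a field that sits
 * field_offset bytes into a structure located at base. */
static uint32_t field_address(uint32_t base, size_t field_offset)
{
    assert(field_offset <= UINT32_MAX);
    return base + (uint32_t)field_offset;
}

/* Hypothetical helper: address read by "gparent->rb_left" for a given
 * gparent pointer value. */
uint32_t gparent_rb_left_address(uint32_t gparent)
{
    return field_address(gparent, offsetof(struct rb_node32, rb_left));
}
```

Under that assumed layout, gparent_rb_left_address(0) comes out as 0x0000000c, which lines up with the "virtual address 0000000c" line in the oops and with the first dereference of "gparent" in the loop. It says nothing about what corrupted the tree.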