I am seeing a reproducable kernel BUG. The machine is a Dell PowerEdge 2650, dual processor, 2.8GHz Xeon. The box has a terrabye software RAID (raid 5) that it shares out over NFS to 108 clients. When booting the system with all clients up and with open mounts to the machine, within 5 minutes of finishing the init process, the following BUG will occur: kernel BUG at inode.c:299! invalid operand: 0000 lp parport nfsd lockd sunrpc autofs tg3 ipt_REJECT ipt_state iptable_nat ip_conntrack iptable_filter ip_tables ide-scsi ide-cd cdrom st keybdev mousedev hid i CPU: 0 EIP: 0060:[<c015d78f>] Not tainted EFLAGS: 00010202 EIP is at sync_unlocked_inodes [kernel] 0x8f (2.4.20-28.9) eax: f687b464 ebx: 0000000f ecx: f687b464 edx: f687b400 esi: f6723300 edi: f687b45c ebp: f687b400 ds: 0068 es: 0068 ss: 0068 Process kupdated (pid: 9, stackpage=c44c1000) Stack: f4ef6580 00000000 c44c0000 c44c0000 c0262c43 c44c0307 00000000 c014c0e8 c44c0000 c0262c43 c014c484 c031b7a0 00000001 00000068 c014c3e0 00000000 00000000 c010742d c036d7f4 00000000 00000000 Call Trace: [<c014c0e8>] sync_old_buffers [kernel] 0x8 (0xc44c1fc8)) [<c014c484>] kupdate [kernel] 0xa4 (0xc44c1fd4))Jan 21 00:30:49 cstore1 kernel: [<c014c3e0>] kupdate [kernel] 0x0 (0xc44c1fe4)) [<c010742d>] kernel_thread_helper [kernel] 0x5 (0xc44c1ff0)) Code: 0f 0b 2b 01 6f 2e 26 c0 89 d8 83 c8 08 83 e0 f8 89 86 fc 00 I've reproduced this with the 2.4.20-20.9smp, 2.4.20-24.9smp, 2.4.20-28.9smp, and 2.4.20-28.9 kernels. With the SMP kernels, once the BUG hits, the systems hangs and becomes completely unresponsive. With the uni-processor kernel, the BUG hits, and it keeps going. I've had the system up for 12 hours now with the UP 2.4.20-28.9 kernel and the BUG only hit the one time shortly after boot. If you need nay more info, please let me know.
Turns out that changing out my SCSI cable caused this problem to go away.
I just saw this on RHEL AS 3.0 2.4.21-9ELhugmem kernel. (inode.c: line 300). Not reproducable so far.