From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; Hewlett-Packard IE5.5-SP2)

Description of problem:
Kernel panics during umount. After between one and six hours of successful file system testing, a umount is not handled correctly by the kernel. At that point all system resources appear to go to that one process, which is always a umount. The kernel then panics, which in turn causes all processes on the machine to time out.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Mount numerous file systems.
2. Execute lots of operations on all of the file systems.
3. Umount several file systems in parallel.

Actual Results: Oops

Expected Results: No Oops

Additional info:
The following data was reported by the failing machine: HP lp2000, 2 processors, 866 MHz, 1.2 GB memory, internal SCSI2 connected to JBOD.

This bug is believed to be the same as, or similar to, Bug #66251.

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c015972a
*pde = 00000000
Oops: 0000
Kernel 2.4.9-e.3smp
CPU:    0
EIP:    0010:[<c015972a>]    Not tainted
EFLAGS: 00010207
EIP is at invalidate_list [kernel] 0xda
eax: c0defc00   ebx: 00000000   ecx: 00000001   edx: cad10000
esi: c22d7f7c   edi: 00000000   ebp: cad11f28   esp: cad11efc
ds: 0018   es: 0018   ss: 0018
Process umount (pid: 13407, stackpage=cad11000)
Stack: cad10000 00000000 00000000 cad11f28 cad11f28 c0defc00 08052500 c015977f
       c02f44e8 c0defc00 cad11f28 cad11f28 cad11f28 c0defc00 c9987f20 c02f5640
       08052500 c0149f5f c0defc00 c02f5680 cad11f88 00000000 f6546620 08052500
Call Trace:
[<c015977f>] invalidate_inodes [kernel] 0x2f
[<c0149f5f>] kill_super [kernel] 0xaf
[<c014e4e9>] path_release [kernel] 0x29
[<c015c280>] do_umount [kernel] 0x1c0
[<c015c37b>] sys_umount [kernel] 0xcb
[<c012e633>] sys_munmap [kernel] 0x33
[<c015c3ac>] sys_oldumount [kernel] 0xc
[<c010715b>] system_call [kernel] 0x33

Code: 8b 3b 3b 5c 24 20 0f 85 5a ff ff ff 8b 54 24 04 8b 44 24 08
Kernel panic: not continuing
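
The steps to reproduce in the report above could be scripted roughly as follows. This is only a minimal sketch, not the reporter's actual test suite: the function names, the device and mount point pairs, and the workload callable are all hypothetical, and a real run needs root and is expected to panic an affected kernel.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_command(argv):
    """Run one command and return its exit status (real mount/umount needs root)."""
    return subprocess.run(argv).returncode


def stress_cycle(mounts, workload, runner=run_command):
    """One pass of the reported steps: mount, generate I/O, umount in parallel.

    mounts   -- list of (device, mount_point) pairs; hypothetical test devices
    workload -- callable taking a mount point and doing file operations on it
    runner   -- callable that executes an argv list; replaceable for dry runs
    """
    # Step 1: mount every file system.
    for device, mount_point in mounts:
        runner(["mount", device, mount_point])

    mount_points = [mount_point for _, mount_point in mounts]
    with ThreadPoolExecutor(max_workers=max(1, len(mounts))) as pool:
        # Step 2: run the workload on all file systems at once.
        list(pool.map(workload, mount_points))
        # Step 3: issue the umounts in parallel and collect exit statuses.
        statuses = list(pool.map(lambda mp: runner(["umount", mp]), mount_points))
    return statuses
```

Passing a recording function as `runner` gives a dry run that only lists the commands that would be issued, which is a safer way to check the script before pointing it at a test machine.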
The Storage Router Business Unit of Cisco Systems is also seeing this problem when running some of our storage test suites on both the e.3 and e.10 enterprise kernels. For now, we've reduced the frequency of umounts in our test scripts to lessen the impact. I can provide additional kernel oops text if needed. We see the problem most often when a umount occurs while large amounts of filesystem I/O are occurring (to other devices). I haven't yet checked whether unmounting multiple filesystems at the same time is needed to trigger the bug. My test script doesn't specifically do that, but it doesn't avoid it either. I typically get an oops in under an hour. The machine is effectively useless after that.
Larry, I assume this will be in AS2.1Q2 errata. Correct?
Fixed in kernel-2.4.9-e.8.

Larry Woodman
I'm experiencing this problem in 7.2, kernel 2.4.9-31. I can't tell from the bug listing what I should do to fix it.

Unable to handle kernel paging request at virtual address 3c000045
kernel:  printing eip:
kernel: c011423e
kernel: *pde = 00000000
kernel: Oops: 0002
kernel: Kernel 2.4.9-31
kernel: CPU:    0
kernel: EIP:    0010:[add_wait_queue_exclusive+30/48]    Not tainted
kernel: EIP:    0010:[<c011423e>]    Not tainted
kernel: EFLAGS: 00010002
kernel: EIP is at add_wait_queue_exclusive [kernel] 0x1e
kernel: eax: c5e446a0   ebx: 3c000045   ecx: c637fe54   edx: c637fe4c
kernel: esi: 00000282   edi: c1071978   ebp: 00000000   esp: c637fe40
kernel: ds: 0018   es: 0018   ss: 0018
kernel: Process umount (pid: 32715, stackpage=c637f000)
kernel: Stack: c5e44694 c637e000 c0105c5b 00000001 c637e000 c5e446a0 3c000045 c1071978
kernel:        00000000 c0105dc0 c5e44694 c5e44600 c1071978 c880699d 00000001 c1071978
kernel:        c1071978 c1071978 c1071978 00000000 c880f4c2 c5e44600 c1071978 00000000
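
To compare the two oops dumps in this report side by side, the headline fields can be pulled out with a few regular expressions. The sketch below assumes the 2.4-style text layout seen in both dumps; the function name and the returned keys are made up for illustration, and it does not try to diagnose the cause.

```python
import re


def summarize_oops(text):
    """Extract the headline fields from a 2.4-style oops dump.

    Works on the raw console text and on syslog copies that carry a
    "kernel: " prefix, since every pattern is searched, not anchored.
    """
    fault = re.search(r"at virtual address ([0-9a-f]{8})", text)
    eip = re.search(r"EIP is at (\S+)\s+\[kernel\]\s+(0x[0-9a-f]+)", text)
    proc = re.search(r"Process (\S+) \(pid: (\d+)", text)
    # General-purpose registers only; "printing eip:" is deliberately excluded.
    regs = dict(
        re.findall(r"\b(e(?:ax|bx|cx|dx|si|di|bp|sp)):\s*([0-9a-f]{8})", text)
    )
    fault_addr = fault.group(1) if fault else None
    return {
        "fault_address": fault_addr,
        "function": eip.group(1) if eip else None,
        "offset": eip.group(2) if eip else None,
        "process": proc.group(1) if proc else None,
        "pid": int(proc.group(2)) if proc else None,
        # Registers whose value equals the faulting address.
        "registers_matching_fault": sorted(
            name for name, value in regs.items() if value == fault_addr
        ),
    }
```

Run on the two dumps here, this shows the same faulting process (umount) but different functions: invalidate_list in the original report and add_wait_queue_exclusive in the 7.2 report, so the two may or may not share a root cause.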
Closing, since it was fixed on 3/7, and 7.2 is no longer being supported.