Bug 74054

Summary: Kernel oops while executing umount
Product: Red Hat Enterprise Linux 2.1 Reporter: John Fowlkes <john_fowlkes>
Component: kernel Assignee: Larry Woodman <lwoodman>
Status: CLOSED CURRENTRELEASE QA Contact: Brian Brock <bbrock>
Severity: high Docs Contact:
Priority: medium    
Version: 2.1 CC: nancy.dockery, sferris, tao, tburke
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2004-01-12 20:37:16 UTC Type: ---

Description John Fowlkes 2002-09-13 22:41:55 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 4.0; Hewlett-Packard 
IE5.5-SP2)

Description of problem:
The kernel panics during umount. During file system testing, after between one 
and six hours of successful testing, a umount is not handled correctly by the 
kernel. At that point all system resources appear to go toward the failing 
process, which is always a umount. This seems to panic the kernel, which in 
turn causes all processes on the machine to time out. 

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Mount numerous file systems.
2. Execute lots of operations on all of the file systems.
3. Umount several file systems in parallel.
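
The steps above can be sketched as a small driver script. This is only a 
minimal sketch under stated assumptions, not the test harness used for this 
report: the mount points are hypothetical, each one is assumed to have an 
/etc/fstab entry so that "mount <dir>" works, and the I/O load is a placeholder 
loop of file writes. It would need to run as root on a scratch machine.

```python
import os
import subprocess
import threading

# Hypothetical mount points; the report does not name the real ones.
MOUNT_POINTS = ["/mnt/test%d" % i for i in range(1, 7)]


def mount_cmd(mount_point):
    # Assumes an /etc/fstab entry exists for each mount point.
    return ["mount", mount_point]


def umount_cmd(mount_point):
    return ["umount", mount_point]


def generate_io(mount_point, files=100, size=1 << 20):
    # Placeholder load: write and delete files under the mount point.
    for i in range(files):
        path = os.path.join(mount_point, "stress_%d.dat" % i)
        with open(path, "wb") as f:
            f.write(b"\0" * size)
        os.remove(path)


def run_parallel(commands, runner=subprocess.call):
    # Start one thread per command so that the umounts overlap in time.
    results = [None] * len(commands)

    def worker(index, cmd):
        results[index] = runner(cmd)

    threads = [threading.Thread(target=worker, args=(i, c))
               for i, c in enumerate(commands)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def reproduce(mount_points=MOUNT_POINTS, runner=subprocess.call):
    for mp in mount_points:          # step 1: mount everything
        runner(mount_cmd(mp))
    for mp in mount_points:          # step 2: generate file system load
        generate_io(mp)
    # step 3: umount all file systems in parallel
    return run_parallel([umount_cmd(mp) for mp in mount_points], runner)
```

Calling reproduce() in a loop approximates the multi-hour test run. The runner 
argument exists only so the command sequence can be checked without touching 
real mounts.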
	

Actual Results:  Oops

Expected Results:  No Oops

Additional info:

The failing machine is an HP lp2000 (2 processors, 866 MHz, 1.2 GB memory, 
internal SCSI2 connected to a JBOD). This bug is believed to be either the 
same as or similar to Bug #66251. The failing machine reported the following 
data:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c015972a
*pde = 00000000
Oops: 0000
Kernel 2.4.9-e.3smp
CPU:    0
EIP:    0010:[<c015972a>]    Not tainted
EFLAGS: 00010207
EIP is at invalidate_list [kernel] 0xda 
eax: c0defc00   ebx: 00000000   ecx: 00000001   edx: cad10000
esi: c22d7f7c   edi: 00000000   ebp: cad11f28   esp: cad11efc
ds: 0018   es: 0018   ss: 0018
Process umount (pid: 13407, stackpage=cad11000)
Stack: cad10000 00000000 00000000 cad11f28 cad11f28 c0defc00 08052500 c015977f 
       c02f44e8 c0defc00 cad11f28 cad11f28 cad11f28 c0defc00 c9987f20 c02f5640 
       08052500 c0149f5f c0defc00 c02f5680 cad11f88 00000000 f6546620 08052500 
Call Trace: [<c015977f>] invalidate_inodes [kernel] 0x2f 
[<c0149f5f>] kill_super [kernel] 0xaf 
[<c014e4e9>] path_release [kernel] 0x29 
[<c015c280>] do_umount [kernel] 0x1c0 
[<c015c37b>] sys_umount [kernel] 0xcb 
[<c012e633>] sys_munmap [kernel] 0x33 
[<c015c3ac>] sys_oldumount [kernel] 0xc 
[<c010715b>] system_call [kernel] 0x33 


Code: 8b 3b 3b 5c 24 20 0f 85 5a ff ff ff 8b 54 24 04 8b 44 24 08 
 
Kernel panic: not continuing
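
To compare this trace against other reports such as Bug #66251, the symbol 
names can be pulled out of the oops text. The following is a minimal sketch 
that assumes the "[<address>] symbol [module] 0xoffset" entry format shown 
above; the function name parse_call_trace is hypothetical.

```python
import re

# Matches entries such as "[<c015977f>] invalidate_inodes [kernel] 0x2f".
TRACE_ENTRY = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\w+)\s+\[(\w+)\]\s+0x([0-9a-f]+)")


def parse_call_trace(oops_text):
    """Return (address, symbol, module, offset) tuples in order of appearance."""
    return [(int(addr, 16), sym, mod, int(off, 16))
            for addr, sym, mod, off in TRACE_ENTRY.findall(oops_text)]
```

Lines without a symbol and module after the address, such as the "EIP:" line, 
are skipped, so only the call trace entries are returned.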

Comment 2 Scott M. Ferris 2003-01-30 21:51:12 UTC
The Storage Router Business Unit of Cisco Systems is also seeing this problem
when running some of our storage test suites on both e.3 and e.10 enterprise
kernels. For now, we've reduced the frequency of umounts in our test scripts in
order to lessen the impact.  I can provide additional kernel oops text if
needed.  We see the problem most often when a umount occurs while large amounts
of filesystem I/O are occurring (to other devices).  I haven't yet checked to see
if unmounting multiple filesystems at the same time is needed to trigger the
bug.  My test script doesn't specifically do that, but it doesn't avoid it
either. I typically get an oops in under an hour.  The machine is effectively
useless after that.

Comment 3 Larry Troan 2003-02-17 16:47:50 UTC
Larry, I assume this will be in AS2.1Q2 errata. Correct?

Comment 4 Larry Woodman 2003-03-07 14:58:47 UTC
Fixed in kernel-2.4.9-e.8

Larry Woodman


Comment 5 Nancy Dockery 2003-07-30 01:03:26 UTC
I'm experiencing this problem in 7.2, 2.4.9-31. I can't tell from the bug 
listing what I should do to fix it.
 
Unable to handle kernel paging request at virtual address 3c000045
kernel:  printing eip:
kernel: c011423e
kernel: *pde = 00000000
kernel: Oops: 0002
kernel: Kernel 2.4.9-31
kernel: CPU:    0
kernel: EIP:    0010:[add_wait_queue_exclusive+30/48]    Not tainted
kernel: EIP:    0010:[<c011423e>]    Not tainted
kernel: EFLAGS: 00010002
kernel: EIP is at add_wait_queue_exclusive [kernel] 0x1e 
kernel: eax: c5e446a0   ebx: 3c000045   ecx: c637fe54   edx: c637fe4c
kernel: esi: 00000282   edi: c1071978   ebp: 00000000   esp: c637fe40
kernel: ds: 0018   es: 0018   ss: 0018
kernel: Process umount (pid: 32715, stackpage=c637f000)
kernel: Stack: c5e44694 c637e000 c0105c5b 00000001 c637e000 c5e446a0 3c000045 c1071978 
kernel:        00000000 c0105dc0 c5e44694 c5e44600 c1071978 c880699d 00000001 c1071978 
kernel:        c1071978 c1071978 c1071978 00000000 c880f4c2 c5e44600 c1071978 00000000 


Comment 6 Suzanne Hillman 2004-01-12 20:37:16 UTC
Closing, since it was fixed on 3/7, and 7.2 is no longer being supported.