Bug 175828

Summary: system crash on unmount and mount occurring
Product: Red Hat Enterprise Linux 3
Reporter: Mike Ciccarelli <mc7>
Component: kernel
Assignee: Jeff Layton <jlayton>
Status: CLOSED WONTFIX
QA Contact: Ben Levenson <benl>
Severity: high
Priority: medium
Version: 3.0
CC: staubach, steved
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2007-10-19 18:49:55 UTC

Description Mike Ciccarelli 2005-12-15 15:53:15 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5

Description of problem:
We have one system with an Apple Xraid server attached via a Fibre Channel connection. The system writes data to the array locally without any problems. We then exported the file system via NFS over TCP so our Linux clients could write data to it. The system runs fine for around 2-4 days. Then a user submits a large job that puts the system under higher load. Mounts still occur, but then the system crashes without any errors or output. This is all we have in messages:

Dec 14 20:14:44 patemp1 rpc.mountd: authenticated unmount request from pace926:969 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:15:16 patemp1 rpc.mountd: authenticated unmount request from pace912:720 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:15:40 patemp1 rpc.mountd: authenticated unmount request from pace767:867 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:15:43 patemp1 rpc.mountd: authenticated unmount request from pace749:606 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:15:45 patemp1 rpc.mountd: authenticated mount request from pace737:645 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:15:52 patemp1 rpc.mountd: authenticated mount request from pace776:955 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:16:06 patemp1 rpc.mountd: authenticated mount request from pace930:879 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:16:10 patemp1 rpc.mountd: authenticated unmount request from pace755:600 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:16:35 patemp1 rpc.mountd: authenticated unmount request from pace938:805 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:16:41 patemp1 rpc.mountd: authenticated mount request from pace409:894 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:16:47 patemp1 rpc.mountd: authenticated unmount request from pace923:777 for /export/farmtmp1 (/export/farmtmp1)
Dec 14 20:20:41 patemp1 syslogd 1.4.1: restart.
Dec 14 20:20:41 patemp1 syslog: syslogd startup succeeded
Dec 14 20:20:41 patemp1 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Dec 14 20:20:41 patemp1 kernel: ok
Dec 14 20:20:41 patemp1 kernel: Bootdata ok (command line is ro root=LABEL=/ console=tty0 console=ttyS0,9600)
Dec 14 20:20:41 patemp1 kernel: Linux version 2.4.21-37.ELsmp (bhcompile.redhat.com) (gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-53)) #1 SMP Wed Sep 7 13:32:18 EDT 2005
Dec 14 20:20:41 patemp1 kernel: BIOS-provided physical RAM map:
Dec 14 20:20:41 patemp1 kernel:  BIOS-e820: 0000000000000000 - 000000000009a000 (usable)
Dec 14 20:20:41 patemp1 kernel:  BIOS-e820: 000000000009a000 - 00000000000a0000 (reserved)
Dec 14 20:20:41 patemp1 kernel:  BIOS-e820: 00000000000d0000 - 0000000000100000 (reserved)
Dec 14 20:20:41 patemp1 kernel:  BIOS-e820: 0000000000100000 - 000000007ff70000 (usable)
Dec 14 20:20:41 patemp1 kernel:  BIOS-e820: 000000007ff70000 - 000000007ff77000 (ACPI data)
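
To put numbers on the excerpt above, a short script can tally the rpc.mountd requests and measure the silent gap between the last request and the syslogd restart. This is only a sketch: the function names `parse_ts` and `summarize` are made up for illustration, the regular expressions assume the exact line format shown above, and the year is an assumption because syslog timestamps do not carry one. It does not explain the crash; it only bounds when it happened.

```python
import re
from collections import Counter
from datetime import datetime

# Matches rpc.mountd request lines in the format shown in the excerpt.
MOUNTD_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) (\S+) rpc\.mountd: "
    r"authenticated (mount|unmount) request from (\S+?):(\d+) for (\S+)"
)
# Matches the syslogd restart line that marks the reboot.
RESTART_RE = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ syslogd [\d.]+: restart\."
)


def parse_ts(stamp, year=2005):
    # syslog timestamps have no year; the caller supplies one (assumption).
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def summarize(lines, year=2005):
    """Count mount/unmount requests and list (last request, restart) pairs."""
    counts = Counter()
    last_request = None
    gaps = []
    for line in lines:
        m = MOUNTD_RE.match(line)
        if m:
            counts[m.group(3)] += 1
            last_request = parse_ts(m.group(1), year)
            continue
        r = RESTART_RE.match(line)
        if r and last_request is not None:
            gaps.append((last_request, parse_ts(r.group(1), year)))
            last_request = None
    return counts, gaps
```

Run against /var/log/messages (for example `summarize(open("/var/log/messages"))`), each entry in `gaps` brackets the window in which the system went down, which may help when correlating with job submission times on the clients.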

We would like help or information on how to debug the system when the kernel crashes, so we can determine what needs to be updated or changed. At this point we're running Update 6 of WS 3 with the latest patches. If we can just get past this problem of unscheduled reboots and outages, we would be running fine.

Version-Release number of selected component (if applicable):
nfs-utils-1.0.6-42EL

How reproducible:
Always

Steps to Reproduce:
1. Have 90 systems unmount and remount a file system exported by one Linux NFS server.
2. The mounts use TCP.
3. The Linux server eventually crashes under load. No error messages are displayed. I am not sure how to proceed with nothing to debug; I can only see that the system rebooted.
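
The mount and unmount churn in the steps above could be driven from each client with something like the following. This is a rough sketch, not the reporter's actual job setup: the function name `remount_cycle` and the mount point `/mnt/farmtmp1` are hypothetical, and only the server name and export path come from the log excerpt. It defaults to a dry run that just returns the commands, since really mounting requires root.

```python
import subprocess


def remount_cycle(server, export, mountpoint, cycles=1, dry_run=True):
    """Build (and optionally run) mount/umount commands for one client."""
    source = f"{server}:{export}"
    commands = []
    for _ in range(cycles):
        # "-o tcp" requests NFS over TCP, matching step 2.
        commands.append(["mount", "-t", "nfs", "-o", "tcp", source, mountpoint])
        commands.append(["umount", mountpoint])
    if not dry_run:
        for argv in commands:
            # Needs root; check=True stops at the first failing command.
            subprocess.run(argv, check=True)
    return commands
```

For example, `remount_cycle("patemp1", "/export/farmtmp1", "/mnt/farmtmp1", cycles=10)` lists twenty commands; running the same call with `dry_run=False` on many clients at once would approximate the load described, though the real workload also wrote data between mounts.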

Actual Results:  System restarts completely without any debugging information.

Expected Results:  System should just keep responding normally.

Additional info:

Comment 1 Jeff Layton 2007-07-17 10:43:24 UTC
Mike,
   Is this still an issue on RHEL3U9?


Comment 2 RHEL Program Management 2007-10-19 18:49:55 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.