Bug 695387 - NFS Workloads triggering System Hang [NEEDINFO]
Summary: NFS Workloads triggering System Hang
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Red Hat Kernel Manager
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-04-11 14:45 UTC by sstephens
Modified: 2014-06-03 12:29 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-06-03 12:29:22 UTC
Target Upstream Version:
pm-rhel: needinfo? (sstephens)


Attachments
cacti graphs of nfs performance (378.95 KB, application/zip)
2011-04-11 14:45 UTC, sstephens

Description sstephens 2011-04-11 14:45:18 UTC
Created attachment 491251 [details]
cacti graphs of nfs performance

Description of problem:

     Under increased NFS load, above 1.5k write_req/sec and above 50Mbit/sec bandwidth utilization, the system hangs and becomes unresponsive to all interrupts. The system resumes after 15-30 minutes with no evidence of the root cause. The dmesg and other system logs fail to record any error and there are no core files created.

Version-Release number of selected component (if applicable):

     kernel-2.6.18-194.26.1.el5 x86_64

How reproducible:
     
     Any time the write request rate reaches or exceeds roughly 1.5k requests/sec, seemingly in combination with interface bandwidth usage above 50 megabits/sec.
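
     The threshold above could be checked on the server with something like the following. This is only a rough sketch, not how the attached Cacti graphs were produced. It assumes NFSv3 and the usual /proc/net/rpc/nfsd layout, where the proc3 line holds a counter count followed by per-procedure counters in NFSv3 procedure order (WRITE is procedure 7). The function names and the 10-second interval are made up for illustration.

```python
import time

NFSD_STATS = "/proc/net/rpc/nfsd"  # assumed location of the nfsd counters
NFS3_WRITE_INDEX = 7               # WRITE is procedure 7 in NFSv3


def nfs3_write_count(stats_text):
    """Return the cumulative NFSv3 WRITE count from nfsd stats text."""
    for line in stats_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "proc3":
            # fields[1] is the number of counters; the counters follow it.
            counters = [int(x) for x in fields[2:]]
            return counters[NFS3_WRITE_INDEX]
    raise ValueError("no proc3 line found")


def write_rate(before, after, interval_s):
    """Write requests per second between two cumulative samples."""
    return (after - before) / float(interval_s)


def exceeds_threshold(rate, threshold=1500.0):
    """True when the rate is at or above the reported ~1.5k req/sec."""
    return rate >= threshold


if __name__ == "__main__":
    interval = 10  # seconds between samples (arbitrary choice)
    with open(NFSD_STATS) as f:
        first = nfs3_write_count(f.read())
    time.sleep(interval)
    with open(NFSD_STATS) as f:
        second = nfs3_write_count(f.read())
    rate = write_rate(first, second, interval)
    print("write_req/sec: %.1f (at or above threshold: %s)"
          % (rate, exceeds_threshold(rate)))
```

     Logging the computed rate with a timestamp on every pass would also show the last rate seen before the host stops responding.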

Steps to Reproduce:
1. Use IOzone (http://www.iozone.org/) to run benchmark tests from an NFS client against an NFS mount point exported by the affected server.
2. Set up the tests to produce 1.5k or more write requests/sec with 2MB file sizes.
3. The NFS server becomes unresponsive.
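
For anyone without IOzone at hand, the steps above can be approximated with a small stand-in script run on the NFS client. This is only a sketch and is not the IOzone test that was actually used. It writes a series of 2 MB files to a directory given on the command line, which is assumed to be the NFS mount point. The file name pattern, the 4 KB block size, and the count of 100 files are arbitrary assumptions, and the request rate it generates will depend on the client's mount options.

```python
import os
import sys
import tempfile
import time


def write_files(directory, count, file_size=2 * 1024 * 1024, block_size=4096):
    """Write `count` files of `file_size` bytes each; return total bytes written."""
    block = b"\0" * block_size
    total = 0
    for i in range(count):
        path = os.path.join(directory, "load_%06d.dat" % i)
        with open(path, "wb") as f:
            remaining = file_size
            while remaining > 0:
                chunk = block[:min(block_size, remaining)]
                f.write(chunk)
                remaining -= len(chunk)
                total += len(chunk)
            f.flush()
            os.fsync(f.fileno())  # force the data out before closing the file
    return total


if __name__ == "__main__":
    # Pass the NFS mount point as the first argument; without one, the
    # script falls back to a local temporary directory (no NFS traffic).
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    start = time.time()
    written = write_files(target, count=100)
    elapsed = time.time() - start
    print("wrote %d bytes to %s in %.1f s" % (written, target, elapsed))
```

Running several copies in parallel from one or more clients is one way to push the request rate toward the level described in step 2.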
  
Actual results:

     NFS Server becomes unresponsive until test ends or the write requests are allowed to taper off.

Expected results:

     The NFS server should be able to handle this traffic level. This same hardware was upgraded from RHEL4 to RHEL5 and its physical memory was increased. NFS request traffic is the same as before the OS upgrade. After the upgrade to RHEL5, the server started hanging during periods of increased NFS load.

Additional info:
     The hardware is a Dell PowerEdge 1850 with 8 GB of RAM. It is connected to an EMC CLARiiON CX-500 via QLogic Fibre Channel HBAs. The connections to the SAN are managed by PowerPath. The kernel is at the highest level supported by the PowerPath drivers. PowerPath is version 5.3 SP1. I'm attaching the graphs of NFS server performance from Cacti. The gaps in the graphs are the periods when the host stops responding.

Comment 1 Ric Wheeler 2011-04-11 14:53:03 UTC
It would be great if you can file this via the Red Hat support channels - our support people are great at helping triage and pull together data. If you don't have a subscription, you can always post a summary of the issue to linux-nfs (our nfs team follows that very closely).

RH bugzilla is not meant to be used as a front line support tool, thanks!

Comment 2 RHEL Program Management 2014-03-07 13:41:50 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 3 RHEL Program Management 2014-06-03 12:29:22 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).

