172465 – NFS share hangs

Summary: NFS share hangs

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 4
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	4.0
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Steve Dickson
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2005-11-04 21:22 UTC by Glen
Modified:	2012-06-20 16:12 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2012-06-20 16:12:50 UTC
Target Upstream Version:
Embargoed:

Attachments
tcpdump on the nfs server (133.36 KB, text/plain) 2005-11-04 21:22 UTC, Glen

Description Glen 2005-11-04 21:22:58 UTC

Description of problem:
I'm trying to uncompress a file that resides on our NFS share with gzip.  The
file is roughly 30 MB in size and uncompresses to around 600 MB.  When the newly
created uncompressed file reaches around 407 MB, gunzip seems to hang.  On the
client (the machine that's running gunzip) I see basically no CPU activity.
However, on the server, the nfsd processes are using a good portion of CPU
time.  I see nothing strange in dmesg on either machine, and ifconfig reports no
errors on either machine.  On the client, nfsstat shows only around 30
retransmissions out of around a million packets.  Running iptraf on both the
server and the client shows a flurry of traffic between the two machines.  In
case it helps, I ran tcpdump on the server once the client started to hang, and
I'm attaching the output.  Within the dump file, ammoniag.rmc.cert.ucr.edu is
the server and 10.255.255.254 is the client.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-22.0.1.EL

How reproducible:
Unknown

Steps to Reproduce:
1. Uncompress a file on an NFS share.

Actual results:
It hangs.

Expected results:
I'd expect it to just uncompress the file.

Additional info:

Comment 1 Glen 2005-11-04 21:22:59 UTC

Created attachment 120742
tcpdump on the nfs server

Comment 2 Steve Dickson 2005-11-21 11:18:35 UTC

Would it be possible to post a bzip2-compressed binary tethereal network
trace (i.e. tethereal -w /tmp/data.pcap host <server>) and
then post a system backtrace from the server, taken by running
"echo t > /proc/sysrq-trigger"? Note: the backtrace will be in
/var/log/messages.
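
For anyone scripting these steps, the following is a rough Python sketch, not
a tested procedure.  The function names are invented for illustration; only
the tethereal arguments, the bzip2 step, the /tmp/data.pcap path and the sysrq
write come from the request above.  Writing to /proc/sysrq-trigger requires
root and assumes sysrq is enabled on the server.

```python
from pathlib import Path

SYSRQ_TRIGGER = Path("/proc/sysrq-trigger")


def capture_command(server: str, pcap_path: str = "/tmp/data.pcap") -> list[str]:
    """Build the binary capture command: tethereal -w <file> host <server>."""
    return ["tethereal", "-w", pcap_path, "host", server]


def compress_command(pcap_path: str = "/tmp/data.pcap") -> list[str]:
    """Build the bzip2 command that compresses the finished capture."""
    return ["bzip2", pcap_path]


def request_task_dump(trigger: Path = SYSRQ_TRIGGER) -> None:
    """Same effect as `echo t > /proc/sysrq-trigger` (needs root).

    The kernel writes the task backtraces to its log, which ends up in
    /var/log/messages on the server.
    """
    trigger.write_text("t\n")
```

The two command lists can be passed to subprocess.run(); the capture would be
started on the server before reproducing the hang and stopped by hand once the
client is stuck, followed by request_task_dump().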

But looking at the tcpdump you have posted, it really appears
the server is just going away... Note the last 5 NFS messages
are NULL calls, which tells me the client is trying to "ping" the
server to see if it's still there...

Comment 3 Glen 2005-11-21 20:47:45 UTC

Actually, I had two different RAID units attached to this one server.  When I
split them up so they were each on a separate server, my problems went away.

However, I have the exact same setup running on a Fedora Core 4 machine and it
has been perfectly fine.  Seems to me that something is seriously wrong with I/O
on RHEL 4.  In fact, I ran a few benchmarks with dbench on both OSes, and FC4
gets me around double the performance on the exact same hardware.

Comment 5 Jiri Pallich 2012-06-20 16:12:50 UTC

Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release you asked us to review is now End of Life.
Please see https://access.redhat.com/support/policy/updates/errata/

If you would like Red Hat to re-consider your feature request for an active release, please re-open the request via appropriate support channels and provide additional supporting details about the importance of this issue.
