Description of problem:
I'm trying to uncompress a file that resides on our nfs share with gzip. The file is roughly 30 MB in size and uncompresses to around 600 MB. When the newly created uncompressed file gets to around 407 MB in size then gunzip seems to hang. On the client, the machine that's running gunzip, I see basically no cpu activity. However, on the server, nfsd processes are using up a good portion of cpu time. I see nothing strange in dmesg on either machine. I see no errors when I run ifconfig on either machine. And on the client, nfsstat shows out of around a million packets, only around 30 retransmissions. Running iptraf on both the server and client shows a flurry of traffic between the two machines. If it's any help I ran tcpdump on the server once the client started to hang, and I'm attaching the output. Within the dump file, ammoniag.rmc.cert.ucr.edu is the server and 10.255.255.254 is the client.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-22.0.1.EL

How reproducible:
Unknown

Steps to Reproduce:
1. Uncompress a file on an nfs share.

Actual results:
It hangs.

Expected results:
I'd expect it to just uncompress the file.

Additional info:
Created attachment 120742: tcpdump on the nfs server
Would it be possible to post a bzip2-compressed binary tethereal network trace (i.e. tethereal -w /tmp/data.pcap host <server>) and then post a system backtrace from the server by doing an "echo t > /proc/sysrq-trigger"? Note: the backtrace will be in /var/log/messages. Looking at the tcpdump you have posted, though, it really appears the server is just going away. Note that the last 5 NFS messages are NULL calls, which tells me the client is trying to "ping" the server to see if it's still there.
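For reference, the two requests above could be scripted roughly as follows. This is only a sketch: the tethereal and sysrq commands are the ones quoted in the comment, while the SERVER placeholder, the DRY_RUN switch, the explicit bzip2 step, and the helper function names are assumptions added for illustration. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch of the data collection requested above.
# Assumptions (not from the report): SERVER placeholder, DRY_RUN switch,
# and the helper/function names. Commands themselves come from the comment.

SERVER="${SERVER:-nfs-server.example.com}"   # placeholder for <server>
PCAP="/tmp/data.pcap"                        # capture path used in the comment
DRY_RUN="${DRY_RUN:-1}"                      # 1 = print commands, 0 = execute

# Print a command line when DRY_RUN=1, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        sh -c "$*"
    fi
}

# Capture binary traffic to/from the server while reproducing the hang.
# tethereal keeps capturing until it is interrupted.
capture_trace() {
    run "tethereal -w $PCAP host $SERVER"
}

# Compress the finished capture before attaching it to the bug.
compress_trace() {
    run "bzip2 $PCAP"
}

# On the NFS server, as root: dump all task backtraces to the kernel log.
# The output ends up in /var/log/messages.
dump_backtraces() {
    run "echo t > /proc/sysrq-trigger"
}

# Run one step at a time; with no (or an unknown) argument, print the plan.
case "${1:-plan}" in
    capture)   capture_trace ;;
    compress)  compress_trace ;;
    backtrace) dump_backtraces ;;
    *)         DRY_RUN=1; capture_trace; compress_trace; dump_backtraces ;;
esac
```

The steps are kept separate because the capture has to be stopped by hand once gunzip stalls; the compress and backtrace steps are then run as their own invocations with DRY_RUN=0.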
Actually, I had two different RAID units attached to this one server. When I split them up so that each was on a separate server, my problems went away. However, I have the exact same setup running on a Fedora Core 4 machine and it has been perfectly fine. It seems to me that something is seriously wrong with I/O on RHEL 4. In fact, I ran a few benchmarks with dbench on both OSes, and FC4 gets around double the performance on the exact same hardware.
Thank you for submitting this issue for consideration in Red Hat Enterprise Linux. The release that you asked us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/errata/ If you would like Red Hat to reconsider your feature request for an active release, please re-open the request via the appropriate support channels and provide additional supporting details about the importance of this issue.