Description of problem:
The complete IO system hangs under NFS stress.

Version-Release number of selected component (if applicable):
Kernel 2.6.17-1.2174_FC5

How reproducible:
I am copying large files (each > 1 GB) to a remote NFS share (8k block size, TCP). Using dstat, I see that FC5 reads about 45 MB from disk (XFS), then writes 45 MB to the network, and so on. While copying, I start 2 DVB recordings to the same disk (if this is not DVB related, 2 other writing processes might do the same). After 5-10 minutes, the complete IO system hangs. WaitIO is at 99%. The DVB driver can't write to disk anymore, and the copy makes no further progress. The hang is not limited to this filesystem: when I try to copy a file from and to another disk, it copies some MB (1-18) and then hangs as well. So the hang is not partition or disk specific. When I CTRL-C the large NFS copy process (this works most of the time), everything goes back to normal and continues. One time, I think, it happened even with no recordings running, but with the recordings it is easily reproducible.

Steps to Reproduce:
1. Start copying the large files to the remote NFS share.
2. Start 2 DVB recordings (or maybe other write processes).

Actual results:
IO system hangs.

Expected results:
IO system should not hang.

Additional info:
I tried to reproduce this under 2.6.16, but there my DVB card is not recognized. It does not happen if I copy the files by running the cp command on the remote computer and using this computer as a remote share for the other one. Also, it does not seem to happen if I set the NFS block size to 1k. I even tried another router, with the same result. This computer has a Netgear 100 MBit network card:

02:0e.0 Ethernet controller: National Semiconductor Corporation DP83815 (MacPhyter) Ethernet Controller
Subsystem: Netgear FA311 / FA312 (FA311 with WoL HW)
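For anyone trying to confirm the "WaitIO is at 99%" observation without dstat, the following is a minimal sketch that samples the aggregate `cpu` line of /proc/stat and reports the share of time spent in iowait between two samples. It is not part of the original report. It assumes Python 3 and the usual /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names and the 5-second interval are illustrative choices only.

```python
import time


def parse_cpu_line(line):
    """Return (iowait_ticks, total_ticks) from the aggregate 'cpu' line.

    Assumed field order after the 'cpu' label:
    user nice system idle iowait irq softirq steal [guest ...]
    Only the first eight counters are summed, because guest time is
    already included in user time on kernels that report it.
    """
    fields = line.split()
    if len(fields) < 6 or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    ticks = [int(value) for value in fields[1:9]]
    return ticks[4], sum(ticks)


def iowait_percent(before, after):
    """Percentage of elapsed CPU ticks spent in iowait between two samples."""
    delta_iowait = after[0] - before[0]
    delta_total = after[1] - before[1]
    if delta_total <= 0:
        return 0.0
    return 100.0 * delta_iowait / delta_total


def read_sample(path="/proc/stat"):
    """Read one sample; the first line of /proc/stat is the aggregate line."""
    with open(path) as stat_file:
        return parse_cpu_line(stat_file.readline())


if __name__ == "__main__":
    previous = read_sample()
    while True:
        time.sleep(5)  # sampling interval, chosen arbitrarily
        current = read_sample()
        print("iowait: %.1f%%" % iowait_percent(previous, current))
        previous = current
```

Running this in a separate terminal while the NFS copy and the recordings are active should show the iowait share climbing toward 100% if the hang described above occurs.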
I just managed to create a similar situation simply by copying large files to a remote NFS share (1k block size; the remote is Suse 10.1, kernel 2.6.16) until the remote share is 100% full. The local cp process does not notice this and hangs (not killable). After a while (while a third remote computer copies data from this one over NFS), the complete IO system hangs as described above. So it has nothing to do with the DVB drivers.
A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem.

This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks' time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug.

In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem, please check that you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details.

If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem.

Thank you.
This bug has been mass-closed along with all other bugs that have been in NEEDINFO state for several months. Due to the large volume of inactive bugs in bugzilla, this is the only method we have of cleaning out stale bug reports where the reporter has disappeared. If you can reproduce this bug after installing all the current updates, please reopen this bug. If you are not the reporter, you can add a comment requesting it be reopened, and someone will get to it asap. Thank you.