Bug 214750 - Fedora "fails" after heavy IO (disk + NFS)
Status: CLOSED WONTFIX
Product: Fedora
Classification: Fedora
Component: kernel
Version: 6
Hardware: x86_64 Linux
Priority: medium
Severity: high
Assigned To: Steve Dickson
QA Contact: Brian Brock
Whiteboard: bzcl34nup
Reported: 2006-11-09 04:44 EST by Richard Underwood
Modified: 2008-08-02 19:40 EDT
Last Closed: 2008-05-06 12:45:34 EDT
Description Richard Underwood 2006-11-09 04:44:52 EST
Description of problem:

After a period (approximately 5 minutes) of heavy IO, particularly NFS, the
system fails without any obvious error message. Some existing sessions continue
to work for a short time, but hang shortly afterwards - I suspect due to disk
access. The console re-prompts login when no username is given, but hangs when a
username is entered.

A hardware reset is required, and in at least one case this has led to
filesystem corruption.

This problem appears to occur in late FC5 kernels as well as in all FC6
kernels, including the latest test kernel. Building kernel-2.6.16-1.2122_FC5
(the last kernel I know to be good) on FC6 has allowed the servers to
continue running.

The problem has occurred with an i386 installation (FC5) and an x86_64
installation (FC6) on similar (but different) Dell hardware.

Version-Release number of selected component (if applicable):

Confirmed in:
kernel-2.6.18-1.2835.fc6
kernel-2.6.18-1.2798.fc6

Not present in:
kernel-2.6.16-1.2122_FC5

How reproducible:

Very, although time to fail is not predictable.

Steps to Reproduce:
1. Set up two servers with large disks cross-mounted (I used NFSv3 over UDP).
2. Run 5 rsync jobs on each server, copying over NFS without using an rsync
daemon (10G each should be sufficient).
3. Wait until NFS "not responding" errors are reported on one of the servers.
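
For anyone trying to reproduce this, the load in step 2 could be scripted roughly as follows. This is only a sketch, not the exact commands used in the report: the source directories (/data/set0 ... /data/set4), the mount point (/mnt/peer), the per-job destination names and the choice of `rsync -a` are all assumptions for illustration. The destination is a plain path on the NFS mount, so rsync writes through NFS rather than talking to an rsync daemon.

```python
import shlex
import subprocess


def build_rsync_jobs(src_dirs, nfs_mount, jobs=5):
    """Return one rsync argv per job, each copying a local tree onto the NFS mount.

    src_dirs and nfs_mount are hypothetical paths chosen by the caller.
    """
    if len(src_dirs) < jobs:
        raise ValueError("need at least one source directory per job")
    commands = []
    for i, src in enumerate(src_dirs[:jobs]):
        # Trailing slash on the source copies the directory contents.
        source = src.rstrip("/") + "/"
        # A plain path as destination means the copy goes through the NFS
        # mount instead of an rsync daemon.
        dest = f"{nfs_mount.rstrip('/')}/job{i}/"
        commands.append(["rsync", "-a", source, dest])
    return commands


def run_jobs(commands):
    """Start all jobs in parallel and return their exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    # Dry run: print the commands instead of starting them.
    sources = [f"/data/set{i}" for i in range(5)]  # hypothetical paths
    for cmd in build_rsync_jobs(sources, "/mnt/peer"):
        print(shlex.join(cmd))
```

Running the same script on both servers, each pointing at the other's export, gives the cross-mounted load described above. Pass the result of `build_rsync_jobs` to `run_jobs` to actually start the copies.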
  
Actual results:

One server (which is not predictable) fails. The other server will report NFS
errors but recover.

Expected results:

The rsync jobs should complete!

Additional info:

There is no evidence of a problem on the system other than a hang - there are no
console messages or anything in syslog.
Comment 1 Steve Dickson 2007-08-29 19:53:28 EDT
Two things. 

1) Does this happen with TCP mounts?
2) Does this happen with more recent kernels?
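
To test question 1, the same export can be remounted over TCP instead of UDP. The sketch below only builds the mount command. The server name, export path and mount point are hypothetical, and it assumes the `nfsvers=` and `tcp`/`udp` mount options described in nfs(5).

```python
def nfs_mount_command(server, export, mountpoint, transport="tcp", version=3):
    """Build an NFS mount argv for the given transport ("tcp" or "udp")."""
    if transport not in ("tcp", "udp"):
        raise ValueError("transport must be 'tcp' or 'udp'")
    # nfsvers= selects the protocol version; the bare tcp/udp option
    # selects the transport.
    options = f"nfsvers={version},{transport}"
    return ["mount", "-t", "nfs", "-o", options, f"{server}:{export}", mountpoint]


if __name__ == "__main__":
    # Hypothetical names; print the TCP and UDP variants side by side.
    for transport in ("tcp", "udp"):
        print(" ".join(nfs_mount_command("serverb", "/export/data", "/mnt/peer", transport)))
```

Repeating the rsync load with only the transport changed would show whether the hang is specific to UDP mounts.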
Comment 2 Bug Zapper 2008-04-04 00:33:12 EDT
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers
Comment 3 Bug Zapper 2008-05-06 12:45:32 EDT
This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
