Bug 158039 - nfsd oopses on testing kernel update for FC3
Product: Fedora
Classification: Fedora
Component: kernel
Hardware: i686 Linux
Priority: medium, Severity: medium
Assigned To: Steve Dickson
QA Contact: Brian Brock
Reported: 2005-05-17 21:10 EDT by Alexandre Oliva
Modified: 2007-11-30 17:11 EST

Doc Type: Bug Fix
Last Closed: 2005-05-19 10:10:41 EDT

Attachments
Oopses (18.74 KB, text/plain)
2005-05-17 21:11 EDT, Alexandre Oliva

Description Alexandre Oliva 2005-05-17 21:10:26 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.7.8) Gecko/20050512 Fedora/1.0.4-2 Firefox/1.0.4

Description of problem:
I got all of these oopses on the same box over the past few weeks, running various different kernels.  It might be faulty hardware, so take it with a grain of salt, but I don't have any other boxes with an identical hardware configuration to tell whether it's something specific to the set of modules involved, nor easy local access to run hardware tests.  There are two ext3 oopses and some nfsd oopses from the stable kernel as well.  Could this all be caused by filesystem corruption?  I'm thinking of bringing the system down for an fsck.

Version-Release number of selected component (if applicable):

How reproducible:
Didn't try

Steps to Reproduce:
1. Boot up either the stable or the testing 2.6.11 FC3 kernel and let it run for days.

Actual Results:  Oopses I'll attach.

Expected Results:  No such oopses.

Additional info:
Comment 1 Alexandre Oliva 2005-05-17 21:11:26 EDT
Created attachment 114493
Comment 2 Alexandre Oliva 2005-05-17 23:00:59 EDT
fsck didn't find any inconsistencies, but a local user reported recent
suspicions of overheating, and the failures appear to be related to peak use.
Comment 3 Steve Dickson 2005-05-18 07:38:07 EDT
Oopses are never good for data integrity.
Why do you think this is faulty hardware?
Comment 4 Alexandre Oliva 2005-05-18 11:15:15 EDT
That was the suspicion of another sysadmin.  Apparently the box has never been
exactly rock solid, with some programs crashing every now and then, odd messages
on cron mail, and so on, but this had never (apparently) affected its ability to
serve out filesystems over nfs.  The box was recently taken off to a computer
repair facility at the uni, and they suspected the goop that attaches the cooler
to the processor might be at fault, and replaced it, but that had no effect
whatsoever.  If anything, crashes are now more frequent.

Besides, we have many other boxes running NFS servers with the very same
software, although not exactly the same hardware, so I found it unlikely that
things would crash so often for one box and not for others.  This one isn't even
the most heavily used server.  I figured, if such oopses should be hitting
others, you'd know about it, so I thought I'd file it, but don't waste too much
time on it until we can get better assurance that it's not caused by hardware
problems.  I downgraded to 2.6.10-1.670_FC3 yesterday, and now the box is
offline.  I can't tell whether it crashed or was taken to the repair facility
again.  Aah, the wonders of being a remote sysadmin :-)
Comment 5 Alexandre Oliva 2005-05-19 10:10:41 EDT
The box failed again, and was taken to the repair office again.  They ran a
memtest again, and found both memory modules to be defective.  I'll probably
have to go on site and verify the testing, but we're now pretty sure it's
hardware failure.  Sorry about the noise.

(s/1.670_FC3/1.770_FC3/ in the previous comment, BTW)
