Red Hat Bugzilla – Bug 89141
Machine oopses, then hangs
Last modified: 2007-04-18 12:53:08 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003
Description of problem:
At random intervals (no pattern has been established yet), the kernel first
oopses, then services stop responding, and finally the machine panics and
needs to be rebooted.
The box had similar problems two weeks ago. Suspecting hardware problems
(we have 2.4.18-24.7 and -27.7 running on other boxen without problems),
I replaced the whole box with a spare. Yesterday the machine hung twice, and today it hung again.
Relevant oopses are attached, as is the boot log from the last boot, for
I was running -24 on this box because I rely on /proc/cmdline (see #88047)
and do not have any users on the box, so ptrace was not an issue for me.
Today I gave -27 a try, although I do not expect it to help.
Please note that in the latest dump the lmsensors modules were loaded: I
monitored temperature and fan RPM to rule out temperature-related
problems. Values were all fine.
Also, the RAM is ECC, so a failure, while always possible, is unlikely.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Reboot the box to get it working
Actual Results: Sometimes the kernel oopses, as detailed above. The automatic
service monitor pages me in the middle of the night.
Expected Results: Flawless performance without oops; good night of
Created attachment 91185 [details]
Several kernel oopses from /var/log/messages
As a side note: I wanted to look into the 2.4.18-24.7.x source rpm again, only to
find that all the mirrors have already removed it.
Is there a publicly accessible FTP site that keeps all the old updates?
(Searching Google is futile: it finds all the mirrors, but the files are gone by now.)
Closing due to proven hardware defect. The mainboard in question finally died due
to faulty electrolytic capacitors. The other one had RAM problems. I guess PC
hardware is currently in such a sorry state of affairs that failover for any
application needs to become standard...