Bug 247205

Summary: System hung with Ext-fs error (device dm-5) in start_transaction: Journal has aborted
Product: Red Hat Enterprise Linux 4
Reporter: JB Segal <jb>
Component: kernel
Assignee: Eric Sandeen <esandeen>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Martin Jenner <mjenner>
Severity: high
Priority: low    
Version: 4.4   
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2008-12-04 21:57:58 UTC

Description JB Segal 2007-07-05 23:47:19 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.4) Gecko/20070515 Firefox/2.0.0.4

Description of problem:
I have no idea what caused this, nor could I run any forensics before rebooting.

A database server, which was quietly sitting there, serving databases, hung.

It was still pingable, and my existing ssh connections didn't die, but any attempted command failed to run.

When I got to the console, the error in the subject was scrolling as fast as it could.

A power-cycle was necessary to restart the machine, which booted fine.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.9-42.0.10.EL

How reproducible:
Didn't try


Steps to Reproduce:
This is a production system and downtime is bad. I hope to never see this failure again. As no one was doing anything on the machine at the time, reproducibility is not likely.

Actual Results:


Expected Results:


Additional info:
We have a fair number of RHEL4u4 systems and this is the first time we've seen this.

This is only our second Dell PowerEdge 2950. The other one is running u5 (which, oddly, isn't a valid choice in the version picker above), and this one will be soon. But as Google shows this bug has been around for at least 4 years, I'm not wildly hopeful that it's already been fixed.

Comment 1 Eric Sandeen 2007-07-19 15:44:20 UTC
After reboot, was there anything in the system logs prior to the "journal has aborted" message? Or was it the root filesystem which had the problem, and so I suppose no further messages were logged to disk...?
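
For anyone gathering the information requested above, the following is a minimal sketch (not part of the original comment) of one way to pull out the log lines that immediately precede the first "journal has aborted" message. The function name lines_before_abort and the default of 20 context lines are hypothetical choices made for illustration; the match is a plain case-insensitive search for the text quoted in the bug summary.

```python
import re
from collections import deque

# Case-insensitive match for the message text quoted in the bug summary.
ABORT_PATTERN = re.compile(r"journal has aborted", re.IGNORECASE)


def lines_before_abort(lines, context=20):
    """Return up to `context` lines that precede the first abort message.

    `lines` is any iterable of strings, such as an open log file.
    Returns None when no abort message is found at all.
    """
    # A bounded deque keeps only the most recent `context` lines.
    recent = deque(maxlen=context)
    for line in lines:
        if ABORT_PATTERN.search(line):
            return list(recent)
        recent.append(line.rstrip("\n"))
    return None
```

It could be called as lines_before_abort(open(path)), where path is wherever the system writes kernel messages (often /var/log/messages on RHEL, though that depends on the syslog configuration). If the affected filesystem was the one holding the logs, the function may simply return None, which is itself worth reporting.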

Comment 2 Eric Sandeen 2008-12-04 21:57:58 UTC
This has been in needinfo for over a year; closing.