Description of problem:
We have a large 1.7 TB disk array (hardware RAID 5) with a single partition on
it, formatted as ext3. See the attached log file for the kernel error that
appeared. After this error, the filesystem was no longer accessible:
every process trying to access it froze.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Don't know how to reproduce.
Please provide the kernel log showing the assertion failures.
Created attachment 93045 [details]
kernel log showing the assertion failure
I am sorry, I obviously forgot to attach the most important part of the bug
report. Your bugzilla's "new bug wizard" scared me a lot. :-)
This is a known problem. There's a debug check in ext3 which triggers when
we're using a page which is marked uninitialised. Unfortunately, that same
condition can be triggered by IO failures. Please check your logs --- I suspect
you'll find IO failures prior to the panic.
A recent patch to upstream kernels relaxes this check in ext3 to be a warning,
not a panic, so we won't do the impolite kernel oops in this case, and future
releases will use that new behaviour.
Yes, you are right. I found a lot of messages like "kernel: cciss: cmd
c2de6078 has CHECK CONDITION, sense key = 0x3" prior to the panic, which I had
not noticed before; cciss is the name of the driver for the RAID controller
(Compaq Smart Array). The errors appeared in the log 22 hours before the ext3
panic. I am going to check how to get the hardware behaving properly. BTW, is
that relaxed ext3 patch already included in 2.4.20-19.7?
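For anyone else hitting this, the log check suggested above can be sketched as a small script. This is only an illustration: the regular expression is modeled on the single message quoted in this report, so other cciss driver versions may word it differently, and the function names and log path are hypothetical. The sense key names are the standard SCSI ones; 0x3 is MEDIUM ERROR.

```python
import re

# Pattern modeled on the one message quoted in this report (assumption:
# other driver versions may format the line differently).
CCISS_RE = re.compile(
    r"cciss: cmd (?P<cmd>[0-9a-fA-F]+) has CHECK CONDITION, "
    r"sense key = 0x(?P<key>[0-9a-fA-F]+)"
)

# Standard SCSI sense key names (subset).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0xB: "ABORTED COMMAND",
}


def find_cciss_errors(lines):
    """Return (line_number, cmd, sense_key, sense_name) for each matching line."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        match = CCISS_RE.search(line)
        if match:
            key = int(match.group("key"), 16)
            name = SENSE_KEYS.get(key, "unknown")
            hits.append((lineno, match.group("cmd"), key, name))
    return hits


def scan_file(path):
    """Scan a log file (hypothetical helper; the path is up to the caller)."""
    with open(path, errors="replace") as log:
        return find_cciss_errors(log)
```

Calling `scan_file("/var/log/messages")` (the path is an assumption; use wherever your kernel messages are logged) returns one tuple per CHECK CONDITION line, which makes it easy to see whether I/O errors preceded the ext3 assertion.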
*** This bug has been marked as a duplicate of 86035 ***
Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.