From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003
Description of problem:
I'm experiencing kernel lockups with the errata kernels, on 2 boxes that were
previously stable running 2.4.18-10. The kernel seems to deadlock waiting on
some IDE event, while the IDE chassis light is stuck in the "solid on" mode. The
rest of the system is ok (an existing shell prompt still works, that is, until it
needs to touch the disk).
The only similarities that I can see between the two boxes are:
1. Both are running software raid (one is raid5 over 3ware pseudo-scsi disks,
the other one raid1 over ide pdc202xx). Probably unrelated, though.
2. Both have IDE system disks, although that's not a distinctive feature within
my server farm, by any means.
3. Both run IO-intensive jobs at times, but again this is not a distinctive feature.
One of the machines (the most deadlock-happy of the two) is a Dell PIII/450,
piix4 system disk, running the i686 kernel. The other one is a dual Athlon
MP/1500+, amd760MP system disk, running the athlon-smp kernel.
I'll attach the sysrq-t (tasks) output from the times when the machines were locked up.
This has happened at least 5-6 times over the last month. Also, I don't see
anything in the 2.4.18-19 kernel that would address this issue.
Lastly, it's not reproducible at will.
Version-Release number of selected component (if applicable):
Created attachment 88945
sysrq output on the PIII/450, first incident
Created attachment 88946
sysrq output on the PIII/450, second incident
Created attachment 88947
sysrq output on the PIII/450, third incident
As a side note, sysrq-r (Show Regs) threw the kernel into an infinite loop of
printing out the stack trace. I had to push the reset button (a few hours
later, on Christmas Eve, after driving in...) because sysrq had become
unresponsive under the printk flood.
Created attachment 88948
sysrq output on the dual athlon
One of the errata kernel releases seems to have fixed this, probably
when it moved to 2.4.21pre.
Anyway, RH73 is EOL and I can't reproduce this anymore, so there is no
point in leaving this report open. I'm closing it with an ERRATA resolution.