Bug 80527

Summary: kernel (2.4.18-18.7.x) deadlock in IDE subsystem
Product: [Retired] Red Hat Linux
Reporter: Ion Badulescu <ionut>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: high
Priority: medium    
Version: 7.3   
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard:
Fixed In Version: 2.4.20-28
Doc Type: Bug Fix
Last Closed: 2004-05-19 15:52:04 UTC
Attachments:
sysrq output on the PIII/450, first incident
sysrq output on the PIII/450, second incident
sysrq output on the PIII/450, third incident
sysrq output on the dual athlon

Description Ion Badulescu 2002-12-27 17:35:07 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

Description of problem:
I'm experiencing kernel lockups with the errata kernels, on 2 boxes that were
previously stable running 2.4.18-10. The kernel seems to deadlock waiting on
some IDE event, while the IDE chassis light is stuck in the "solid on" mode. The
rest of the system is ok (an existing shell prompt still works, that is until it
needs to touch the disk).

The only similarities that I can see between the two boxes are:

1. Both are running software raid (one is raid5 over 3ware pseudo-scsi disks,
the other one raid1 over ide pdc202xx). Probably unrelated, though.

2. Both have IDE system disks, although that's not a distinctive feature within
my server farm, by any means.

3. Both run IO-intensive jobs at times, but again this is not a distinctive feature.

One of the machines (the more deadlock-prone of the two) is a Dell PIII/450,
piix4 system disk, running the i686 kernel. The other one is a dual Athlon
MP/1500+, amd760MP system disk, running the athlon-smp kernel.

I'll attach the sysrq-t (tasks) output from the time when the machines were
deadlocked.
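
While a shell on the affected box still responds, a complementary check to the sysrq-t dump is to list tasks stuck in uninterruptible sleep (state "D"), which is where tasks waiting on disk I/O normally sit. The sketch below is an illustration only, not something taken from this report: it assumes the usual /proc/<pid>/stat layout ("pid (comm) state ..."), and the helper names parse_stat, blocked_tasks and scan_proc are hypothetical.

```python
import glob


def parse_stat(line):
    """Parse one /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so split on the first '(' and the last ')'.
    """
    lparen = line.index("(")
    rparen = line.rindex(")")
    pid = int(line[:lparen])
    comm = line[lparen + 1:rparen]
    state = line[rparen + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(stat_lines):
    """Return (pid, comm) for every task in uninterruptible sleep ('D')."""
    result = []
    for line in stat_lines:
        pid, comm, state = parse_stat(line)
        if state == "D":
            result.append((pid, comm))
    return result


def scan_proc():
    """Collect stat lines from /proc (assumes a Linux-style procfs)."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                lines.append(f.read())
        except OSError:
            # The task exited between listing and reading; skip it.
            continue
    return lines
```

Calling blocked_tasks(scan_proc()) a few times, some seconds apart, would show whether the same tasks stay in "D" state. Note that the scan itself only reads procfs, so it should not need the stuck disk once the interpreter is already loaded, but starting a new process on a wedged system may itself hang.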

This has happened at least 5-6 times over the last month. Also, I don't see
anything in the 2.4.18-19 kernel that would address this issue.

Lastly, it's not reproducible at will.



How reproducible:
Couldn't Reproduce


Comment 1 Ion Badulescu 2002-12-27 17:47:15 UTC
Created attachment 88945
sysrq output on the PIII/450, first incident

Comment 2 Ion Badulescu 2002-12-27 17:47:59 UTC
Created attachment 88946
sysrq output on the PIII/450, second incident

Comment 3 Ion Badulescu 2002-12-27 17:51:13 UTC
Created attachment 88947
sysrq output on the PIII/450, third incident

As a side note, sysrq-r (Show Regs) threw the kernel into an infinite loop of
printing out the stack trace. I had to push the reset button (a few hours
later, on Christmas Eve, after driving in...) because sysrq had become
unresponsive under the printk flood.

Comment 4 Ion Badulescu 2002-12-27 17:51:43 UTC
Created attachment 88948
sysrq output on the dual athlon

Comment 5 Ion Badulescu 2004-05-19 15:52:04 UTC
One of the errata kernel releases seems to have fixed this, probably
when it moved to 2.4.21pre.

Anyway, RH73 is EOL and I can't reproduce this anymore, so there is no
point in leaving this report open. I'm closing it with an ERRATA
resolution.