Description of problem:
Originally seen on Dell PowerEdge 1950s with Megaraid SAS. It looks like some configurations of multiple IOAPICs start to malfunction after some number of interrupt cycles. On the PE1950 with megasas, the symptom is that the adapter stops interrupting for request completions.

Version-Release number of selected component (if applicable):
All versions of the RT kernel

How reproducible:
Consistently reproducible on PowerEdge 1950 with the Megaraid SAS driver

Steps to Reproduce:
1. Boot into the RT kernel on a PE1950
2. Run 'dd-of-death' (while true; do dd if=/dev/sda of=/dev/null; done)
3. Look for megasas console messages about waiting for outstanding commands

Actual results:
The box goes into a state where the disk adapter waits for commands that have already completed, because their completion interrupts were missed.

Expected results:
No missed interrupts

Additional info:
So far, we've only seen this on systems with multiple IOAPICs, and the misbehaving APIC is a secondary one, usually embedded in some "super I/O" part.

We currently have two workarounds:
1. A PCI quirk that recognizes problematic systems and goes through some interrupt type gyrations, temporarily changing the interrupt from level triggered to edge triggered. This reprogramming of the IOAPIC seems to prevent it from losing interrupts.
2. Boot the system with noapic on the kernel command line. This seems to work well, but will be problematic on large systems with lots of interrupt sources.

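While the reproduction loop from "Steps to Reproduce" is running, it may help to check whether the adapter's interrupt count is still advancing. The following is only a sketch, not part of the original report: the helper names irq_total and irq_stalled are hypothetical, and the IRQ number 16 in the usage comment is a placeholder that has to be looked up in /proc/interrupts on the affected box.

```shell
#!/bin/sh
# Sum the per-CPU counts for one IRQ line in /proc/interrupts-style text.
#   $1: IRQ label from the first column, without the colon (e.g. 16)
#   $2: file to read (defaults to /proc/interrupts)
# Exits non-zero if the IRQ label is not present.
irq_total() {
    awk -v irq="$1:" '
        $1 == irq {
            total = 0
            # Fields after the label are per-CPU counts, up to the
            # first non-numeric field (the interrupt controller name).
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
            print total
            found = 1
        }
        END { if (!found) exit 1 }
    ' "${2:-/proc/interrupts}"
}

# Succeeds (exit 0) when two samples are equal, i.e. the count did not advance.
#   $1: earlier total, $2: later total
irq_stalled() {
    [ "$1" -eq "$2" ]
}

# Example use while the dd loop is running (IRQ 16 is a placeholder):
#   before=$(irq_total 16); sleep 10; after=$(irq_total 16)
#   irq_stalled "$before" "$after" && echo "IRQ 16 count did not advance"
```

A count that stays flat while I/O is outstanding is consistent with the missed-completion symptom described above, but it is not proof on its own; the megasas console messages remain the primary indicator.
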
I think we can close this for now.