Bug 436622

Summary: IOAPIC problems with interrupts on RT kernel
Product: Red Hat Enterprise MRG Reporter: Clark Williams <williams>
Component: realtime-kernel Assignee: Jon Masters <jcm>
Status: CLOSED WORKSFORME QA Contact:
Severity: low Docs Contact:
Priority: low    
Version: beta CC: bhu, jcm, srostedt
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-03-13 18:52:28 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Clark Williams 2008-03-08 15:13:56 UTC
Description of problem:

This was originally seen on Dell PowerEdge 1950s with Megaraid SAS. It looks
like some configurations of multiple IOAPICs start to malfunction after some
number of interrupt cycles. On the PE1950+megasas the symptom is that the
adapter stops interrupting for request completions.

Version-Release number of selected component (if applicable):
All versions of the RT kernel

How reproducible:

Consistently reproducible on PowerEdge 1950 with the Megaraid SAS driver

Steps to Reproduce:

1. Boot into the RT kernel on a PE1950
2. Run 'dd-of-death' (while true; do dd if=/dev/sda of=/dev/null; done)
3. Look for megasas console messages about waiting for outstanding commands
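
For step 3, a small helper along these lines may save watching the console by
hand. It is only a sketch: count_megasas_waits is a hypothetical name, and the
match on "megasas" plus "waiting" is an assumption about how the driver words
its message, so adjust the patterns to whatever actually shows up in the log.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): reads kernel log text on
# stdin and prints how many lines mention both "megasas" and "waiting".
# The two patterns are assumptions about the driver's message wording.
count_megasas_waits() {
    # "|| true" keeps the exit status clean when nothing matches;
    # grep -c still prints 0 in that case.
    grep -i 'megasas' | grep -ci 'waiting' || true
}

# Example use while the dd loop from step 2 runs in another shell:
#   dmesg | count_megasas_waits
```

A non-zero count while the dd loop is running would suggest the adapter is
waiting on commands whose completion interrupts were missed.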
  
Actual results:

The box goes into a state where the disk adapter waits for commands that have
already completed but whose completion interrupts were missed.

Expected results:

No missed interrupts

Additional info:

So far, we've only seen this on systems with multiple IOAPICs, and the
misbehaving APIC is a secondary one, usually embedded in some "super I/O" part.
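
To check whether a given box falls into the multiple-IOAPIC category, something
like the following sketch could be used. count_ioapics is a hypothetical helper
name, and it assumes the x86 boot messages of the form "IOAPIC[n]: apic_id ..."
are still present in the kernel ring buffer (or in a saved boot log).

```shell
#!/bin/sh
# Hypothetical helper: reads a boot log on stdin and prints the number of
# IOAPIC registration lines. Assumes the "IOAPIC[n]: apic_id ..." format
# that x86 kernels print during boot.
count_ioapics() {
    # "|| true" avoids a failing exit status when no line matches;
    # grep -c still prints 0 in that case.
    grep -cE 'IOAPIC\[[0-9]+\]: apic_id' || true
}

# Example use:
#   dmesg | count_ioapics
```

A result greater than 1 means the system has a secondary IOAPIC and may be a
candidate for this problem.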

We currently have two workarounds:

1. A PCI quirk that recognizes problematic systems and goes through some
interrupt type gyrations, temporarily changing the interrupt from level
triggered to edge triggered. This reprogramming of the IOAPIC seems to prevent
the IOAPIC from losing interrupts.

2. Boot the system with noapic on the kernel command line. This seems to work
well, but will be problematic on large systems with lots of interrupt sources.
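
When collecting test results it helps to record whether the second workaround
was actually in effect. A minimal sketch, assuming the running kernel's command
line is available in /proc/cmdline; has_noapic is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given kernel command line contains
# "noapic" as a standalone parameter (so "nolapic" does not match).
has_noapic() {
    # $1: kernel command line string; padded with spaces so that the
    # first and last parameters are matched the same way as the others.
    case " $1 " in
        *" noapic "*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Example use:
#   if has_noapic "$(cat /proc/cmdline)"; then echo "noapic active"; fi
```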

Comment 1 Jon Masters 2009-03-13 18:52:28 UTC
I think we can close this for now.