Bug 143875

Summary: need noapic, or else keventd spins wildly
Product: Red Hat Enterprise Linux 3 Reporter: Eric Hagberg <hagberg>
Component: kernel Assignee: Jim Paradis <jparadis>
Status: CLOSED WONTFIX QA Contact: Brian Brock <bbrock>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.0 CC: andriusb, dmair, gunther.mayer, lwang, martin.bowers, mkapoor, peterm, petrides, riel, smann, vanhoof
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2006-08-02 21:10:07 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 170417, 186960    
Attachments:
Description Flags
Alt+SysRq+T when keventd spins none

Description Eric Hagberg 2004-12-30 16:23:48 UTC
Description of problem:
Something regressed between the 2.4.21-15.EL and 2.4.21-27(.0.1).EL kernels:
with the newer kernel, keventd spins on a CPU.
Going back to 2.4.21-15.EL makes the problem go away.

If I specify "noapic" as a boot parameter, then keventd behaves again,
but that really shouldn't be necessary.


Version-Release number of selected component (if applicable):
2.4.21-27.0.1.EL

How reproducible:
always, at least on the amd64 machine I've got

Steps to Reproduce:
1. See above
2.
3.
  
Actual results:


Expected results:
keventd shouldn't spin on a cpu, even w/o "noapic"

Additional info:

Comment 1 Martin Bowers 2005-06-03 15:22:51 UTC
I am seeing similar behavior on several 2.4.21-32.EL x86_64 systems.  keventd
will start running out of control.  I can't reproduce it on call, but I will try
to get more info when it comes up again.  Any requests for info to gather?

Comment 2 Martin Bowers 2005-07-20 21:18:52 UTC
My U5 systems were seeing this consistently; however, after passing the noapic
option to the kernel as suggested, the issue disappeared.  Any explanation for
this yet?

Comment 11 Bastien Nocera 2005-10-14 13:49:04 UTC
Created attachment 119972 [details]
Alt+SysRq+T when keventd spins

Comment 12 Bastien Nocera 2005-10-14 13:52:55 UTC
The above messages were generated with a SUN w2100z machine, with the latest BIOS.

Jim, let me know whether you need more information.

Comment 14 Jim Paradis 2005-10-20 19:45:43 UTC
This is interesting... according to the attached logfiles, the reporter
attempted to turn off APIC by specifying "apic=off".  The correct parameter to
use for this purpose is "noapic".  In theory, "apic=off" should have no effect
whatsoever, yet the reporter claims that it makes a difference.

I really could use a sysreport on the affected system for more data.  I could
also use some indications as to how I can reproduce.  What platforms does this
problem occur on?  Does it need a certain system load or application mix to kick
this off?  What are the BIOS revs for the affected platforms?

Comment 15 Eric Hagberg 2005-10-20 21:01:02 UTC
I don't see any evidence of the string apic=off in any files that I can see
attached to this issue. Everyone is saying they used noapic to make things
better. Or perhaps there are non-public files?

Comment 16 Ernie Petrides 2005-10-20 21:54:18 UTC
Putting into NEEDINFO state.  Please answer questions in comment #14.

Comment 19 Jim Paradis 2005-10-27 00:29:53 UTC
Looking over the sysreport, the sheer number of ACPI interrupts suggests that
someone is "holding the button down" as it were.  I suspect it's a polarity or
edge-versus-level triggering issue.  Continuing to investigate.
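
A rough way to put a number on "sheer number of ACPI interrupts" is to sample /proc/interrupts twice and compare the counts on the ACPI line. The following is a minimal sketch, not taken from the sysreport: the helper name and both sample texts are invented for illustration, and it assumes the SCI appears with "acpi" in its label.

```python
def acpi_interrupt_count(proc_interrupts_text):
    """Sum the per-CPU counts on every line whose label mentions 'acpi'."""
    total = 0
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue  # skip the CPU header row and blank lines
        if "acpi" not in line.lower():
            continue
        # Per-CPU counters are the leading run of integer fields after "IRQ:".
        for field in fields[1:]:
            if not field.isdigit():
                break
            total += int(field)
    return total


# Invented samples (not real data), standing in for two reads a few seconds apart.
SAMPLE_T0 = """\
           CPU0       CPU1
  0:     100000     100000    IO-APIC-edge  timer
  9:       5000       4000   IO-APIC-level  acpi
"""
SAMPLE_T1 = """\
           CPU0       CPU1
  0:     100500     100500    IO-APIC-edge  timer
  9:     905000       4000   IO-APIC-level  acpi
"""

delta = acpi_interrupt_count(SAMPLE_T1) - acpi_interrupt_count(SAMPLE_T0)
print("ACPI interrupts between samples:", delta)
```

On a live system the two texts would come from reading /proc/interrupts at two points in time; a delta that dwarfs the timer line's delta would be consistent with an interrupt that is never deasserted.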


Comment 34 Jim Paradis 2005-12-15 19:52:59 UTC
I'm still waiting for access to hardware on which this problem can be
reproduced.  If anyone has sent us a w2100z I haven't seen it...


Comment 35 Steve Conklin 2005-12-19 15:18:59 UTC
Hardware needed to reproduce this problem is on a FedEx truck for delivery to
the Raleigh office today. We will provide an update when it is installed and
operational.

Comment 37 David Aquilina 2005-12-19 17:15:35 UTC
Jim, 

A W2100z has been procured and is available for you. Please see Comment #36 for
instructions on how to access it. 

-David

Comment 38 Bastien Nocera 2005-12-20 10:58:01 UTC
Do any of the reporters use machines other than SUN x86-64 machines, and more
precisely, "W2100z" workstations?

Comment 42 Neil Horman 2005-12-20 15:48:21 UTC
One of the other support engineers and I were talking about this issue, and one
of the items that came up was that we thought perhaps the nvidia module might be
registering a callback for a custom ACPI event, which was why our local system
here hasn't seen the problem yet.  Above and beyond that we thought it would be
nice to know what exactly the acpi event was that was taking up so much of
keventd's time.  To this end, I think we could discover the answers to these
questions if we had a sysrq-c generated vmcore of one of the systems
experiencing high keventd utilization.  With a vmcore, we could walk the
tq_structs on the keventd task list to see which structs call
acpi_os_execute_deferred.  From there we can see what data pointer they pass to
the function, which should represent (among other items) the callback function
registered for that event.  This should give us some idea of which ACPI event is
being generated so often, and what part of the kernel is handling it (nvidia
module, other module, kernel proper, etc.).

A sysrq-c is preferred here, but if need be I can build an instrumented kernel to
print out the callback pointer when triggered.
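
The queue walk described above can be modeled in a few lines. This is only a toy sketch of the idea, not a script for analyzing a vmcore: the QueuedTask class, its fields, and the queue contents are all hypothetical, with each entry reduced to the routine it calls and the callback its data pointer is assumed to resolve to.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class QueuedTask:
    """Toy stand-in for one queued task (hypothetical, simplified)."""
    routine: str   # symbol name of the function keventd will call
    callback: str  # symbol the data pointer is assumed to resolve to


def deferred_acpi_callbacks(queue, routine="acpi_os_execute_deferred"):
    """Count, per callback, the queued entries that go through `routine`."""
    return Counter(t.callback for t in queue if t.routine == routine)


# Invented queue contents, for illustration only.
queue = [
    QueuedTask("acpi_os_execute_deferred", "callback_a"),
    QueuedTask("some_other_routine", "callback_b"),
    QueuedTask("acpi_os_execute_deferred", "callback_a"),
]

print(deferred_acpi_callbacks(queue).most_common())
```

A callback that dominates the count would point at the ACPI event being generated so often, and the symbol's owner would show which part of the kernel handles it.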

Comment 50 Jim Paradis 2006-04-08 00:10:54 UTC
Based on comments in IT89256, I'm beginning to suspect it has something to do
with the ACPI method that handles the SMBALERT interrupt.  The output of the
instrumented kernel shows a giant flood of calls to the _L26 method for GPE 0. 
This suggests that the interrupt is not getting handled and the status is not
getting cleared.

The folks at sunsupport produced an instrumented kernel which shows the flood of
calls to this method.  It *doesn't* show any error returns from the method
invocation, so I believe that the method *is* getting called, but is returning
success without doing the right thing.  This means that either the OS is
invoking the method incorrectly, or the method itself has a bug.

Comment 52 Gunther Mayer 2006-04-21 15:16:07 UTC
Is there sufficient information in the meantime to
a) confirm this is an OS issue
or
b) confirm this is a w2100z issue ?



Comment 57 Larry Woodman 2006-05-05 17:59:06 UTC
If we have the hardware in-house I will do a quick binary search to find the
problem.  I'll take the kernels between 2.4.21-15 and 2.4.21-27 and determine
which one causes this problem.  There are only a handful of x86_64 changes
between those 2 kernels that could have caused this problem.

Larry Woodman
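
The binary search proposed above can be sketched as follows. This is a minimal illustration under stated assumptions: the list of intermediate builds is hypothetical (it simply enumerates -15 through -27), and the is_bad predicate is an arbitrary stand-in for "boot this kernel and check whether keventd spins", not a finding.

```python
def first_bad(builds, is_bad):
    """Return the earliest build for which is_bad() is true.

    Assumes builds[0] is good, builds[-1] is bad, and that once the
    problem appears it stays in every later build.
    """
    lo, hi = 0, len(builds) - 1  # invariant: builds[lo] good, builds[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(builds[mid]):
            hi = mid
        else:
            lo = mid
    return builds[hi]


# Hypothetical build list; the real set of intermediate kernels may differ.
builds = [f"2.4.21-{n}" for n in range(15, 28)]


def is_bad(build, first_broken=18):
    # Arbitrary stand-in threshold, chosen only so the example runs.
    return int(build.rsplit("-", 1)[1]) >= first_broken


print(first_bad(builds, is_bad))
```

With 13 candidate builds this needs at most four boots, which is why a bisection is quick once the hardware is available.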


Comment 59 Jim Paradis 2006-05-10 21:39:15 UTC
I have been poking at this issue remotely from Westford to the system in RDU,
and have determined that the problem cropped up between 2.4.21-15 and 2.4.21-19.
 We are arranging to have a W2100z sent directly to Westford so I can chase this
problem down more efficiently.


Comment 60 Jim Paradis 2006-05-17 21:34:41 UTC
To facilitate debugging, I just received a w2100z from Sun.  I started digging
in.  There appear to have been significant changes in ACPI SCI setup and
handling between U2 (.15) and U3 (.19), which is where this problem starts to
occur.  I noticed that using overrides to set up the SCI interrupt as
edge-triggered active-high made the problem go away, but this is exactly the
opposite of what the ACPI spec says the interrupt should be.  Continuing to
investigate.


Comment 61 Jim Paradis 2006-05-20 01:39:32 UTC
I've done a bunch of digging around in response to what is described in Comment
55, and this is what I come up with:

When I generate a thermal event by disconnecting the case fan, an ACPI event is
generated on GPE 22 and handled via acpi_irq().  Further down the call chain,
acpi_ev_gpe_dispatch() is called which queues the control method to be executed
on behalf of GPE 22.  It then calls acpi_hw_clear_gpe() to clear the
level-triggered event.

Once the interrupt handler is done the queued control method is invoked via
acpi_ev_asynch_execute_gpe_method().  Once it evaluates the method it too calls
acpi_hw_clear_gpe() to clear the level-triggered event.

The question I have is: what *else* needs to be done in order to clear this
event?  Read a particular register from a particular chip?  I thought ACPI was
supposed to abstract this all away; the handler method *should* take care of this...
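
The behavior described above, where the status is cleared twice and the event still keeps firing, is what a level-triggered source looks like when nothing quiets the source itself. The following toy model is a sketch only: the class and function names are hypothetical and it does not reproduce the ACPI code paths named in this comment.

```python
class LevelTriggeredEvent:
    """Toy model: the status bit re-latches while the source stays asserted."""

    def __init__(self):
        self.source_asserted = True  # e.g. an ongoing over-temperature condition
        self.status = True           # latched status bit

    def clear_status(self):
        # Clearing only sticks once the source has gone quiet.
        self.status = self.source_asserted


def count_dispatches(event, method, limit=1000):
    """Dispatch until the status bit stays clear, or give up at `limit`."""
    dispatches = 0
    while event.status and dispatches < limit:
        dispatches += 1
        event.clear_status()  # clear at dispatch time
        method(event)         # the queued control method runs
        event.clear_status()  # clear again after the method
    return dispatches


def method_that_does_nothing(event):
    pass  # returns success but leaves the source asserted


def method_that_quiets_source(event):
    event.source_asserted = False  # something actually services the condition


stuck = count_dispatches(LevelTriggeredEvent(), method_that_does_nothing)
quiet = count_dispatches(LevelTriggeredEvent(), method_that_quiets_source)
print(stuck, quiet)
```

In this model the first case runs until the limit while the second stops after a single dispatch, which matches the symptom of a method that returns success without the event ever going away.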


Comment 68 Jim Paradis 2006-06-02 23:09:12 UTC
I have done some more experimenting, and here is what I have found regarding the
behavior of different RHEL releases:

- 32-bit RHEL3 does not show this issue at all, mainly because it has *very*
limited ACPI support (basically only for a few system-configuration things).

- 64-bit RHEL3 does not show this issue prior to Update 3.  In Update 3 we added
code to the ACPI driver to set up SCI handling.

- 64-bit RHEL4 does not show this issue in this way.  When I run my usual test
case (boot the system, then pull the case fan to generate a thermal event) the
system takes the event, reports the temperature out of spec (68C) and shuts down.

After some more poking around, I find that RHEL4 builds the ACPI thermal driver
into the kernel (CONFIG_ACPI_THERMAL=y) whereas RHEL3 builds it as a module
(CONFIG_ACPI_THERMAL=m).  If I go back and boot up RHEL3 and do a "modprobe
thermal" and *then* pull the case fan, the system shuts down just as RHEL4 does.

I'll do some more investigating to see what an appropriate solution would be. 
In the meantime, adding "modprobe thermal" to your system startup files would be
one way to work around the issue.
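
To confirm that the workaround took effect after boot, one option is to look for the module in /proc/modules. This is a minimal sketch: the helper name is hypothetical, the sample text is invented, and it assumes the module name is the first whitespace-separated field on each line.

```python
def module_loaded(proc_modules_text, name):
    """True if `name` is the first field of any line in /proc/modules text."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


# Invented sample (not real data) standing in for the contents of /proc/modules.
SAMPLE_MODULES = """\
thermal                 12345   0 (unused)
ext3                    98765   2
"""

print(module_loaded(SAMPLE_MODULES, "thermal"))
```

On an affected machine the text would come from reading /proc/modules directly; a False result after startup would suggest the "modprobe thermal" line did not run.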


Comment 70 Ernie Petrides 2006-06-08 20:45:10 UTC
Steffen, RHEL3 is now closed.  Please ask the customer to use the work-around
in comment #68 or upgrade to RHEL4.