Bug 172319

Summary: mac fan control causes incorrect load average
Product: [Fedora] Fedora
Reporter: Shawn Houston <houston>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Brian Brock <bbrock>
Severity: low
Priority: medium
Version: 5
CC: jonstanley, pfrields, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: powerpc   
OS: Linux   
URL: http://seb.france.free.fr/linux/ibookG4/iBookG4-howto-9.html#ss9.9
Whiteboard: MassClosed
Doc Type: Bug Fix
Last Closed: 2008-01-20 04:41:22 UTC

Description Shawn Houston 2005-11-02 19:26:37 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050921 Red Hat/1.7.12-1.1.3.2

Description of problem:
The Mac thermal control modules, therm_pm72 in my case, cause the load average to read 1 higher than it should. The web page at
http://seb.france.free.fr/linux/ibookG4/iBookG4-howto-9.html#ss9.9
is how I discovered the root cause. Unloading and reloading the module resets the
load average, but that is not a viable solution for a large cluster of Xserves.
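
The unload/reload workaround described above could be scripted roughly as follows. This is only a sketch, not something taken from the report: the helper names (is_module_loaded, reload_commands, reload_module) are hypothetical, it assumes the standard modprobe tool is available, and actually running the commands requires root. It only applies while therm_pm72 is built as a loadable module.

```python
import subprocess

MODULE = "therm_pm72"  # module named in this report


def is_module_loaded(proc_modules_text, name):
    """Return True if `name` is listed in the given /proc/modules text.

    Each line of /proc/modules starts with the module name.
    Built-in drivers do not appear there.
    """
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == name:
            return True
    return False


def reload_commands(name):
    """Commands that unload and then reload a module (hypothetical helper)."""
    return [["modprobe", "-r", name], ["modprobe", name]]


def reload_module(name, dry_run=True):
    """Run the reload commands; with dry_run=True they are only printed."""
    for cmd in reload_commands(name):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

On a node, one would read /proc/modules, pass its text to is_module_loaded, and call reload_module(MODULE, dry_run=False) only if the module is listed. Doing this by hand across a whole cluster is the part that does not scale.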

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Run a load on your xserve and wait for the fans to speed up
2. Remove the load
3. Load average never drops below 1.0
  

Actual Results:  The cluster looks heavily loaded in Ganglia, but it is actually idle.

Expected Results:  The load average should have dropped to close to 0 with no load.
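
A check for the symptom above might look like the following sketch. It parses the text of /proc/loadavg (three load averages, then runnable/total scheduling entities, then the last PID) and flags a node whose 1-minute load stays at or above 1.0 while almost nothing is runnable. The names parse_loadavg and looks_stuck, and the thresholds, are assumptions made for illustration.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one: float      # 1-minute load average
    five: float     # 5-minute load average
    fifteen: float  # 15-minute load average
    runnable: int   # currently runnable scheduling entities
    total: int      # total scheduling entities


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg, e.g. '1.00 1.02 1.05 1/80 11206'."""
    fields = text.split()
    runnable, total = fields[3].split("/")
    return LoadAvg(float(fields[0]), float(fields[1]), float(fields[2]),
                   int(runnable), int(total))


def looks_stuck(samples, floor=1.0, idle_runnable=1):
    """True if every sample shows load >= floor with an otherwise idle run queue.

    idle_runnable defaults to 1 on the assumption that the process reading
    /proc/loadavg is itself counted as runnable.
    """
    return bool(samples) and all(
        s.one >= floor and s.runnable <= idle_runnable for s in samples
    )
```

Samples would be collected a minute or more apart after the load is removed. A single reading is not enough, since the 1-minute average takes time to decay.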

Additional info:

Comment 1 Dave Jones 2005-11-10 20:19:58 UTC
2.6.14-1.1637_FC4 has been released as an update for FC4.
Please retest with this update, as a large amount of code has been changed in
this release, which may have fixed your problem.

Thank you.


Comment 2 Shawn Houston 2005-11-29 23:38:10 UTC
Now running 2.6.14-1.1644_FC4 across our cluster. I have not seen a recurrence
of this issue since upgrading to 2.6.14-1.1637_FC4. I will need to observe our
cluster under load for a few weeks before I can be absolutely certain that this
issue is resolved.

Comment 3 Shawn Houston 2005-12-16 01:00:02 UTC
This issue has not been observed in several weeks running newer kernels. As far
as I am concerned the problem is solved.

Comment 4 Dave Jones 2005-12-16 01:13:59 UTC
great, thanks.


Comment 5 Shawn Houston 2005-12-18 02:20:06 UTC
Just an update. I upgraded to kernel 2.6.14-1.1653_FC4 today and the problem is 
again manifest. Either a regression, or something has triggered the problem and
it was just hiding. Kernels 2.6.14-1.1637_FC4 and 2.6.14-1.1644_FC4 did not
exhibit the problem as far as I can tell.

Comment 6 Dave Jones 2005-12-20 03:36:37 UTC
Does the test kernel at http://people.redhat.com/davej/kernels/Fedora/FC4 do any
better?


Comment 7 Shawn Houston 2005-12-20 21:00:02 UTC
The ppc64 kernel did not boot. I do not know why. We have an Apple Xserve cluster
using cluster nodes which do not have a video card. The Fedora kernel does not 
have the console port driver compiled in so I do not have any way of getting
feedback from the system. Testing will have to wait until I have more time in 
early January.

Comment 8 Dave Jones 2006-02-03 06:09:10 UTC
This is a mass-update to all currently open kernel bugs.

A new kernel update has been released (Version: 2.6.15-1.1830_FC4)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO_REPORTER state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

Thank you.


Comment 9 Shawn Houston 2006-02-12 01:43:46 UTC
It is now worse with kernels 2.6.15-1.1830_FC4 and 2.6.15-1.1831_FC4, as the
therm_pm72 module is now built in. The old workaround of removing and
reinstalling the module is no longer an option, and the situation persists, with
the load level jumping to one every once in a while and just staying there.

Comment 10 Shawn Houston 2006-02-15 18:32:08 UTC
This is too strange. After two days running with a load level of 1.0 without any
apparent load (the symptom) the machine is now reporting a load level of 0.1 as
it should. So the problem, although not solved, is at least eventually
self-correcting.
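
For anyone retesting: the Linux load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, so a load of 1.0 on an idle machine can come from a single task sitting in that state. Whether that is what happens here is not established in this bug. The sketch below lists such tasks by reading /proc/<pid>/stat; the function names are hypothetical.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the first '(' and the last ')'.
    """
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren])
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D.

    Only top-level /proc/<pid> entries are scanned; threads listed under
    /proc/<pid>/task are not examined.
    """
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the entry was unreadable
        if state == "D":
            found.append((pid, comm))
    return found
```

Running uninterruptible_tasks() a few times while the load is stuck at 1.0 would show whether the same task keeps appearing, which would be useful data to attach to this report.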

Comment 11 Dave Jones 2006-09-17 03:09:23 UTC
[This comment added as part of a mass-update to all open FC4 kernel bugs]

FC4 has now transitioned to the Fedora legacy project, which will continue to
release security related updates for the kernel.  As this bug is not security
related, it is unlikely to be fixed in an update for FC4, and has been migrated
to FC5.

Please retest with Fedora Core 5.

Thank you.


Comment 12 Dave Jones 2006-10-16 19:29:26 UTC
A new kernel update has been released (Version: 2.6.18-1.2200.fc5)
based upon a new upstream kernel release.

Please retest against this new kernel, as a large number of patches
go into each upstream release, possibly including changes that
may address this problem.

This bug has been placed in NEEDINFO state.
Due to the large volume of inactive bugs in bugzilla, if this bug is
still in this state in two weeks time, it will be closed.

Should this bug still be relevant after this period, the reporter
can reopen the bug at any time. Any other users on the Cc: list
of this bug can request that the bug be reopened by adding a
comment to the bug.

In the last few updates, some users upgrading from FC4->FC5
have reported that installing a kernel update has left their
systems unbootable. If you have been affected by this problem
please check you only have one version of device-mapper & lvm2
installed.  See bug 207474 for further details.

If this bug is a problem preventing you from installing the
release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different
problem, please file a separate bug for the new problem.

Thank you.

Comment 13 Jon Stanley 2008-01-20 04:41:22 UTC
(this is a mass-close to kernel bugs in NEEDINFO state)

As indicated previously there has been no update on the progress of this bug
therefore I am closing it as INSUFFICIENT_DATA. Please re-open if the issue
still occurs for you and I will try to assist in its resolution. Thank you for
taking the time to report the initial bug.

If you believe that this bug was closed in error, please feel free to reopen
this bug.