Bug 795334 - System crashes nearly immediately with Cool'n'Quiet enabled
Summary: System crashes nearly immediately with Cool'n'Quiet enabled
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 16
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2012-02-20 09:27 UTC by David Rees
Modified: 2012-09-17 18:39 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-09-17 18:39:43 UTC
Type: ---
Embargoed:



Description David Rees 2012-02-20 09:27:49 UTC
Description of problem:
After upgrading to Fedora 16, I found it nearly impossible to boot the F16 kernel.  I was getting all sorts of random crashes / OOPS that looked like memory or hardware corruption.  But booting into the Fedora 15 kernel didn't have the same issues and the system has been rock solid otherwise.

Version-Release number of selected component (if applicable):
kernel-3.2.6-3.fc16.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Boot system
2. Watch kernel oops in random place and hang
3. Reboot with reset button and watch it do the same thing again
  
Additional info:
After comparing the F15 (kernel-2.6.42.3-2.fc15.x86_64) and F16 kernel configurations, I noticed that the cpufreq drivers were now built into the kernel instead of compiled as modules.  So on a whim, I disabled Cool'n'Quiet and tried booting again - and lo and behold, I'm now typing this bug report from the F16 kernel, which so far seems stable when it used to crash immediately.

But Cool'n'Quiet is important on this machine - it cuts idle power draw down immensely so I really need to figure out a way to get it working again...

Smolt profile: http://www.smolts.org/client/show/pub_31fc4756-8171-457a-b658-b0596034f2c8

Let me know what other information I can provide or if there's anything I can test to help track this down.
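
[Editor's illustration] The kernel configuration comparison mentioned in the description can be sketched roughly as below. This is only a sketch, not the method the reporter actually used: the helper names parse_kconfig and modules_now_builtin are hypothetical, and it assumes the two inputs are plain-text kernel config files with the usual CONFIG_NAME=value lines.

```python
def parse_kconfig(text):
    """Map CONFIG_* option names to their values ('y', 'm', numbers, strings)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" start with '#' and are skipped.
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def modules_now_builtin(old_text, new_text):
    """Return the sorted option names that were '=m' in old and are '=y' in new."""
    old = parse_kconfig(old_text)
    new = parse_kconfig(new_text)
    return sorted(
        name for name, value in old.items()
        if value == "m" and new.get(name) == "y"
    )
```

Feeding it the contents of the two kernels' config files (on Fedora these are normally installed as /boot/config-<kernel version>, which is an assumption about this particular system) would list every driver that changed from a loadable module to built-in.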

Comment 1 Dave Jones 2012-02-20 20:39:52 UTC
to start, make sure you're running the newest version of the BIOS.
Some K8 systems have really broken tables.

Comment 2 David Rees 2012-02-20 23:02:41 UTC
Bios is the latest for this motherboard (GA-MA74GM-S2 rev 1.x), F5B.

Comment 3 David Rees 2012-02-29 06:05:26 UTC
OK, made some headway.

I realized that in F14/15, where the cpufreq drivers were modules, I had cpuspeed configured to limit the CPU frequency to 1.9GHz as I had stability issues then, too, so this isn't a new issue for me.

But F16 got rid of cpuspeed in favor of cpupower and now builds the cpufreq drivers into the kernel.

Is there a way to set the default cpufreq governor or scaling_min_freq from the boot command line to minimize the amount of time the system runs at 800MHz until cpupower runs?
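
[Editor's illustration] For reference, scaling_min_freq is a per-CPU sysfs file that takes a value in kHz. The sketch below shows one way to write it for every CPU once userspace is up. It does not answer the boot command line question, since it can only run after the kernel has already started scaling. The function name set_min_freq and the cpu_root parameter are hypothetical, and whether 1900000 kHz is a valid step on this CPU is an assumption.

```python
import glob
import os


def set_min_freq(khz, cpu_root="/sys/devices/system/cpu"):
    """Write khz to scaling_min_freq for every CPU that has a cpufreq directory.

    Returns the list of files written. Needs root when pointed at the real sysfs.
    """
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_min_freq")
    written = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as f:
            f.write(str(khz))
        written.append(path)
    return written
```

Calling set_min_freq(1900000) as root would correspond to the 1.9GHz floor described above. The cpu_root argument exists only so the function can be tried against a scratch directory first.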

Comment 4 David Rees 2012-03-16 07:42:22 UTC
So frequently on reboots cpupower wouldn't set the minimum frequency quickly enough before the system crashed.

It seems that adding processor.max_cstate=1 to the command line works to keep the system from crashing, even without cpupower limiting the minimum cpu frequency.  Setting max_cstate to 2 or higher results in instability.
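
[Editor's illustration] When testing boot parameters like processor.max_cstate across many reboots, it can help to confirm what the running kernel was actually booted with. A minimal sketch, assuming the usual space-separated name=value layout of /proc/cmdline (quoted values containing spaces are not handled); parse_cmdline and read_max_cstate are hypothetical helper names.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_max_cstate(path="/proc/cmdline"):
    """Return the processor.max_cstate value the kernel was booted with, or None."""
    with open(path) as f:
        return parse_cmdline(f.read()).get("processor.max_cstate")
```

read_max_cstate() returns the value as a string, or None when the parameter was not passed at boot.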

Comment 5 David Rees 2012-03-16 07:59:53 UTC
I take it back.  After more reboot testing it seems that max_cstate=1 is not enough to keep the system stable during boot all the time.  It just appears to be a bit more reliable than when allowing higher cstates.

It seems to crash most frequently around the time of drm/kms/radeon initialization, most commonly with a blank screen.  If it gets past that without crashing then things seem OK.

Comment 6 Dave Jones 2012-03-22 17:07:08 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 7 Dave Jones 2012-03-22 17:10:17 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 8 Dave Jones 2012-03-22 17:20:33 UTC
[mass update]
kernel-3.3.0-4.fc16 has been pushed to the Fedora 16 stable repository.
Please retest with this update.

Comment 9 David Rees 2012-03-22 17:41:17 UTC
No change with the latest kernel.

After doing more research, it appears that this specific CPU isn't explicitly supported on this version of the motherboard (later revisions of the motherboard do support it).

I have ordered a replacement motherboard which lists support for the CPU.

It would still be nice to be able to pass parameters to the cpufreq or powernow-k8 drivers to change their defaults, or at least to have these drivers built as modules, as in previous versions of Fedora, instead of built into the kernel, so that their loading could be better controlled.

Comment 10 Dave Jones 2012-03-22 17:57:53 UTC
fwiw, they will be modules again in f18 when the cpu device autoloading feature lands (just merged in 3.4).

looking at the page at http://www.gigabyte.com/products/product-page.aspx?pid=3323#bios 'FK' seems to be the newest bios (not sure what 'F5B' is)
Is that newer than what you have perhaps ?

Comment 11 David Rees 2012-03-22 18:29:11 UTC
The FK bios is for a newer rev MB - I have revision 1.x MB for which F5B is the latest.

What's weird is that the 1.x MB has support for the Athlon II X2 255 processor with the F4 bios, but no support listed for the Athlon II X2 265 processor.  Support is listed for the rev 2.0 MB, though.

I have no idea why basically the same processor with a slightly higher clock speed would not be supported.  The only thing I can see from the support lists is that the 1.x MB does not list support for any AM3 CPU with a stepping of C3 or higher, while the 2.0 MB does.

Comment 12 Josh Boyer 2012-09-17 18:39:43 UTC
Closing this out due to (admittedly weird) incompatible hardware.

