Bug 436003 - Enabling powernow in BIOS causes Fedora 8 to lock up
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: 8
Hardware: x86_64 Linux
Priority: low
Severity: low
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
 
Reported: 2008-03-04 15:48 EST by Jeff Williams
Modified: 2008-03-10 19:10 EDT
Last Closed: 2008-03-10 15:14:01 EDT


Attachments
dmesg with powernow enabled, ondemand governor as default in kernel (32.49 KB, text/plain)
2008-03-05 18:15 EST, Jeff Williams
output of 'cat /proc/cpuinfo' with ondemand governor in use (1.40 KB, text/plain)
2008-03-05 18:16 EST, Jeff Williams

Description Jeff Williams 2008-03-04 15:48:05 EST
Description of problem:

When I have powernow (Cool'n'Quiet) enabled in the BIOS, my machine will lock up
either at boot or after the gdm login. Most often this happens while the gnome
desktop is loading, but sometimes during the boot sequence when drivers are
being loaded or the network interface is being brought up. I put this under the
kernel component, because I assume that the kernel is partly responsible and
also because a component needed to be selected.

Version-Release number of selected component (if applicable):

Fedora 8 x86_64 with all updates as of yesterday (2008/03/03). Kernel version
2.6.23.15-137.

How reproducible:

100%

Steps to Reproduce:
1. Enable AMD powernow in BIOS
2. Boot Fedora 8
3. If gdm login is presented without lock up, log in.
  
Actual results:

Usually the machine locks up, but I have seen it start dumping stack traces to
the screen and sometimes it reboots during bootup.

Expected results:

I would expect powernow to be enabled and the system to be stable.

Additional info:

I have an AMD Athlon64 X2 5000+ with 4 GB of RAM. When powernow is enabled, I see this
message during boot:

pnpacpi: exceeded the max number of mem resources: 12

When powernow is disabled, I see a message during boot saying that powernow has
been disabled in the BIOS.

I have tried the 2.6.24.2 kernel from kernel.org and I get the same result. Let
me know if I should provide any other information. Thanks.
Comment 1 Dave Jones 2008-03-04 16:13:35 EST
I just pointed some folks from AMD at this bug; hopefully they'll have some
ideas about what could be wrong.  What motherboard do you have?
Comment 2 Jeff Williams 2008-03-04 16:19:32 EST
Asus M2N-SLI Deluxe
Comment 3 Mark Langsdorf 2008-03-04 16:26:19 EST
Nothing immediately springs to mind as a cause.

Does changing PNP_MAX_RESOURCE (include/linux/pnp.h line 17) to something larger
than 12 change anything?
Comment 4 Jeff Williams 2008-03-04 18:19:10 EST
I tried changing "PNP_MAX_MEM" from 12 to 15 with the 2.6.24.2 kernel and
recompiling. 
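
The header edit described above can be sketched as a single substitution. This assumes the constant appears in include/linux/pnp.h as a plain `#define PNP_MAX_MEM <n>` line, which is what this comment implies for 2.6.24.2; the helper name `set_pnp_max_mem` and its arguments are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: rewrite the PNP_MAX_MEM value in a pnp.h-style header.
# Usage: set_pnp_max_mem <header-file> <new-value>
set_pnp_max_mem() (
    header=$1
    value=$2
    # Match "#define PNP_MAX_MEM <digits>" with any spacing and swap the number.
    sed -E "s/^(#define[[:space:]]+PNP_MAX_MEM[[:space:]]+)[0-9]+/\\1${value}/" \
        "$header" > "$header.tmp" && mv "$header.tmp" "$header"
)
```

From the top of the kernel tree this would be run as `set_pnp_max_mem include/linux/pnp.h 15`, followed by a rebuild.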

I was not able to compile the 2.6.23.15-137.fc8-x86_64 kernel. Its default for
"PNP_MAX_MEM" is 4, so I guess the error I had posted was from when I tried the
2.6.24.2 kernel.

[2.6.23.15-137.fc8-x86_64]# make && sudo make modules_install
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
make[1]: *** No rule to make target `missing-syscalls'.  Stop.
make: *** [prepare0] Error 2


When I booted, I no longer got the "pnpacpi: exceeded..." message, but my system
still locked up. The first time I got to the gdm login before the lockup and the
second time it locked up during the boot process. 

Just to add more information, this is what is written to dmesg when I have
powernow disabled in the BIOS:

powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 5000+ processors
(2 cpu cores) (version 2.00.00)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
Comment 5 Joachim Deguara 2008-03-05 14:08:08 EST
Can you turn off PowerNow from Linux, for example by setting the governor to
performance instead of a dynamic one, and then turn on PowerNow in the BIOS?
Then copy the dmesg output and post it to this bug.  I just want to see what
speeds and voltages the BIOS gives us for P-States.
Comment 6 Jeff Williams 2008-03-05 18:15:31 EST
Created attachment 296959 [details]
dmesg with powernow enabled, ondemand governor as default in kernel
Comment 7 Jeff Williams 2008-03-05 18:16:20 EST
Created attachment 296960 [details]
output of 'cat /proc/cpuinfo' with ondemand governor in use
Comment 8 Jeff Williams 2008-03-05 18:16:49 EST
I changed the default cpufreq governor from userspace to ondemand and it seems
that I only get a lockup or reboot after logging in at the gdm login. If I
switch to a virtual console after gdm has loaded and execute 'cpufreq-set -g
performance', I can log in to my desktop and do not get a lockup or reboot. In
fact, I'm currently running in that state while typing this. I am uploading the
output of dmesg and the output of /proc/cpuinfo before I ran cpufreq-set, so it
was using the ondemand governor at that point.
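
The governor switch described above can also be done through sysfs directly. The following is a minimal sketch, assuming the per-CPU cpufreq files live under /sys/devices/system/cpu/cpuN/cpufreq/; the `SYSFS_CPU` variable and both function names are invented for this example.

```sh
#!/bin/sh
# SYSFS_CPU is an assumption: the usual sysfs location of per-CPU cpufreq files.
SYSFS_CPU=${SYSFS_CPU:-/sys/devices/system/cpu}

# Print "cpuN: <governor>" for every CPU that exposes a cpufreq directory.
show_governors() (
    for dir in "$SYSFS_CPU"/cpu[0-9]*/cpufreq; do
        [ -r "$dir/scaling_governor" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        echo "$cpu: $(cat "$dir/scaling_governor")"
    done
)

# Write the given governor name to every CPU (needs root on a real system).
set_governor() (
    for dir in "$SYSFS_CPU"/cpu[0-9]*/cpufreq; do
        [ -w "$dir/scaling_governor" ] || continue
        echo "$1" > "$dir/scaling_governor"
    done
)
```

Running `show_governors` before and after `set_governor performance` would confirm that both cores actually left ondemand.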
Comment 9 Jeff Williams 2008-03-05 18:19:09 EST
I should have specified that the above comment was with the 2.6.24.2 kernel. If
there is a way I can get around the compile issue with the fedora 8 kernel, I
will gladly try that kernel with the changes instead.
Comment 10 Joachim Deguara 2008-03-05 18:51:25 EST
Looks like you have a newer RevG2 processor, and I am not seeing any PowerNow
errata for this product.  Did you check that you have the latest BIOS?  The
frequencies the BIOS gave us for P-States look good, but to see that the voltages
line up I would need the OPN printed on the CPU (which I admit is quite
tedious to retrieve, so please check if there is a newer BIOS instead).
Are you able to boot into runlevel 3 (edit /etc/inittab or add 3 to the kernel
parameters at boot) and see if you can reproduce the lockup and get an error
message from the console (best to get a serial console set up)?
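
For the runlevel 3 and serial console suggestion, the extra kernel parameters might look like the line below. This is only a sketch: `ttyS0` and `115200n8` are assumptions about which serial port and line settings are in use, and the trailing `3` is the runlevel argument mentioned above.

```
console=tty0 console=ttyS0,115200n8 3
```

These would be appended to the kernel line in the boot loader entry; kernel messages are then sent to both the local display and the serial port, so a second machine attached to the port can capture any trace printed before the lockup.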
Comment 11 Jeff Williams 2008-03-05 20:41:14 EST
I am using the latest stable BIOS for my motherboard. There is a newer beta BIOS
available though. Would the OPN number be printed on any of the packaging or
materials included with the processor or just on the processor itself? I did try
switching to runlevel 3 and had more stability with the ondemand governor still
the current default. I didn't test it for too long, but it was definitely stable
for longer than runlevel 5 after logging in. Should I switch back to the
userspace default (2.6.24.2 default) or leave it at ondemand? Should I create
some load to see if that is what is causing the lockup? It seems like the
lockup/reboot coincides with an increase in load or when the frequency has to be
scaled up. I will leave my system in runlevel 3 for a longer period to see if I
can reproduce the problem.
Comment 12 Jeff Williams 2008-03-05 22:41:51 EST
I left my system idle in runlevel 3 for half an hour or so with no problems a
few different times. Once I created load, which I'm assuming would cause the cpu
frequency to be dynamically raised, the machine rebooted or locked up shortly
after. To create load, I recompiled the kernel with 'make -j3'.
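
To see which frequency the CPU was at when the machine went down under load, the current frequency can be logged and flushed to disk while the build runs. This is a minimal sketch, assuming the cpufreq files for cpu0 are in the usual sysfs directory; `CPUFREQ_DIR`, the function names, and the log path in the usage note are illustrative.

```sh
#!/bin/sh
# CPUFREQ_DIR is an assumption: the usual sysfs cpufreq directory for cpu0.
CPUFREQ_DIR=${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}

# Append "<epoch-seconds> <frequency-kHz>" to a log file, then flush to disk
# so the last sample has a chance of surviving a hard lockup or reboot.
log_freq_once() (
    logfile=$1
    echo "$(date +%s) $(cat "$CPUFREQ_DIR/scaling_cur_freq")" >> "$logfile"
    sync
)

# Take <count> samples, <interval> seconds apart.
log_freq() (
    logfile=$1
    count=$2
    interval=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        log_freq_once "$logfile"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
)
```

For example, `log_freq /root/cpufreq.log 600 1 &` followed by `make -j3` would leave a per-second record to inspect after the next boot.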

I found my OPN on my Certificate of Authenticity. It is ADO5000DSWOF.
Comment 13 Joachim Deguara 2008-03-07 13:10:06 EST
Thanks for the OPN. This looks like an oddball (or at least I am not seeing it
internally).  Please check that these specs look right:
http://products.amd.com/en-us/DesktopCPUDetail.aspx?id=41

I say it is odd because I am not finding it in the Thermal and Power Guide to
verify that the BIOS is doing the right thing.
Comment 14 Jeff Williams 2008-03-07 16:13:36 EST
Yes those specs match my processor. Should I try the beta BIOS that is available
or hold off on that for now?
Comment 15 Joachim Deguara 2008-03-07 16:34:58 EST
Please try the beta BIOS.  Additionally, could you test Windows and tell us
whether it also has any instability problems?  Thanks.
Comment 16 Mark Langsdorf 2008-03-07 16:41:52 EST
It sounds like your CPU might be experiencing undervoltage as it ramps up in
frequency.  If the beta BIOS doesn't resolve the issue:

You might want to try setting the default governor to "powersave" and then
enabling the userspace governor manually:
`echo "userspace" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor;
echo "userspace" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor`

Then you can manually set the frequencies by running
`echo -n $NEW_FREQ > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed`
   where $NEW_FREQ is the desired frequency in kHz (e.g., 1800000 for 1.8 GHz).

It would be interesting to see at what frequency you begin to see
instability.  I'm suspecting 2 GHz.
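
The manual procedure above could be scripted to walk up through the frequencies one step at a time. The following is a sketch under assumptions: the cpufreq files are taken to live under cpuN/cpufreq/ in sysfs, the driver is assumed to expose `scaling_available_frequencies`, and `SYSFS_CPU` and the function names are invented here.

```sh
#!/bin/sh
# SYSFS_CPU is an assumption: the usual sysfs location of per-CPU cpufreq files.
SYSFS_CPU=${SYSFS_CPU:-/sys/devices/system/cpu}

# Select the userspace governor on one CPU and pin it to a frequency (kHz).
pin_freq() (
    dir="$SYSFS_CPU/$1/cpufreq"
    echo userspace > "$dir/scaling_governor"
    echo "$2" > "$dir/scaling_setspeed"
)

# Print the frequencies the driver offers for a CPU, lowest first, one per line.
list_freqs() (
    tr ' ' '\n' < "$SYSFS_CPU/$1/cpufreq/scaling_available_frequencies" |
        grep -E '^[0-9]+$' | sort -n
)

# Pin every listed CPU to each frequency in turn, holding each step for
# <hold> seconds so a lockup can be attributed to the step in progress.
step_freqs() (
    hold=$1
    shift
    for freq in $(list_freqs "$1"); do
        echo "pinning $* to $freq kHz"
        sync
        for cpu in "$@"; do
            pin_freq "$cpu" "$freq"
        done
        sleep "$hold"
    done
)
```

Something like `step_freqs 300 cpu0 cpu1`, run as root with a load such as a kernel build in another console, would hold each frequency for five minutes; the last "pinning" line printed before a lockup points at the suspect step.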
Comment 17 Jeff Williams 2008-03-07 16:58:15 EST
Joachim,

I had to do a double take to make sure that you had just suggested in a Fedora
Bugzilla bug that I use Windows to test stability.

Thank you for the suggestions folks. I will give them a try.
Comment 18 Joachim Deguara 2008-03-07 17:08:32 EST
Jeff,
 Just trying to figure out whether we (the OS) are doing something wrong or the BIOS is.
Comment 19 Joachim Deguara 2008-03-07 18:35:06 EST
Mark, things check out with the voltages.  Looking at parts that match this
description (such as ADO5000IAA5DD), the voltage VIDs that come out of dmesg
(which translate to 1.35, 1.3, 1.25, 1.2, 1.15 and 1.1 Volts) match up with the
Thermal and Power Data Sheet.
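
The VID-to-voltage translation mentioned above can be sketched as simple arithmetic. This assumes the linear K8 mapping of 1550 mV at VID 0 minus 25 mV per step, which is how the powernow-k8 driver is understood to convert VIDs below 32; the function name is made up, and the actual VID values from the attached dmesg are not reproduced here.

```sh
#!/bin/sh
# Convert a K8 voltage ID (decimal or 0x-prefixed hex) to millivolts using the
# linear 25 mV-per-step mapping. Assumption: only valid for VIDs below 32.
vid_to_mv() {
    echo $((1550 - $1 * 25))
}
```

Under that mapping, the six voltages listed in this comment are spaced two VID steps (50 mV) apart.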
Comment 20 Jeff Williams 2008-03-08 01:38:40 EST
The beta BIOS looks to have helped. I first tested the 2.6.24.2 kernel and the
system was stable. It has also been stable on the 2.6.23.15-137.fc8-x86_64
kernel for 6+ hours.
Comment 21 Jeff Williams 2008-03-08 10:32:01 EST
I spoke too soon. It seems that after I loaded the beta BIOS, I set all of my
previous BIOS settings except for the memory speeds. Once I set my memory to the
speed and timings it is rated for, the system is unstable with powernow enabled.
I have been running memtest86 on the machine for a few hours now with no issues
and will continue to run it for a while longer. Does memtest86 use the memory
timings from the BIOS or use different ones based on a check of the memory? I
just want to make sure it is using the recommended speed and timings. I will
also try gradually adjusting the frequencies to see if there is a point where
the system becomes unstable. 
Comment 22 Jeff Williams 2008-03-10 13:35:20 EDT
memtest86 ran a number of passes (I think it was 15+) with no errors reported. 

In kernel 2.6.24.2 I could not select the powersave governor as the default,
unless I am doing something wrong. I tried setting conservative as the default,
but could not boot even to runlevel 3 without issues.
Comment 23 Joachim Deguara 2008-03-10 13:43:10 EDT
(In reply to comment #21)
> I spoke too soon. It seems that after I loaded the beta BIOS, I set all of my
> previous BIOS settings except for the memory speeds. 

If you use the default BIOS memory settings then is the system stable with PowerNow?
Comment 24 Jeff Williams 2008-03-10 15:03:41 EDT
(In reply to comment #23)
> (In reply to comment #21)
> > I spoke too soon. It seems that after I loaded the beta BIOS, I set all of my
> > previous BIOS settings except for the memory speeds. 
> 
> If you use the default BIOS memory settings then is the system stable with
> PowerNow?

The system is stable with PowerNow when the memory settings are set to auto, but
not when I set the memory to the manufacturer recommended settings for latency
and clock speed.
Comment 25 Joachim Deguara 2008-03-10 15:14:01 EDT
I'm sorry, but I am going to have to mark this as NOTABUG.  The manufacturer
stores the speed and latency the memory DIMM is rated for in a small ROM called
the SPD, which the BIOS reads out and uses to set the memory speed.  You are in
unsupported territory when you tweak the memory settings and get an unstable
system.
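
One way to see what memory speed the BIOS actually configured is to filter the memory-device records that `dmidecode` prints. This is a rough sketch: the field names differ between dmidecode versions, so the pattern is deliberately loose, and the function name is invented for this example. Reading the SPD contents themselves would need a separate tool such as `decode-dimms` from i2c-tools, which is not shown here.

```sh
#!/bin/sh
# Filter `dmidecode --type 17` output (read from stdin) down to the slot name
# and the lines that report module speed. "Bank Locator" lines are skipped.
memory_speed_lines() {
    grep -E '^[[:space:]]*(Locator|Speed|Configured [A-Za-z ]*Speed):'
}
```

As root, `dmidecode --type 17 | memory_speed_lines` would then list each DIMM slot next to the speed the firmware reports for it.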
Comment 26 Jeff Williams 2008-03-10 19:10:12 EDT
I agree with you that the SPD should have the correct values. However, it seems
from the manufacturer's PDF for this memory that the fastest tested latencies
were not what was programmed in the SPD. I did not know that until further
research at the manufacturer's website after your update today. It looks like
the reason my system was not stable was that the voltage also needed to be
changed. I'm not sure why they didn't just program those values into the SPD.
Here is a pdf for the specific memory I have:

http://www.corsair.com/_datasheets/TWIN2X2048-6400C4.pdf

My system has been stable so far after changing the voltage. Thank you for
looking into this. I just wanted to update to say that I was not using
unsupported settings or tweaks.
