Description of problem:
When I have powernow (cool n quiet) enabled in the BIOS, my machine will lock up either at boot or after the gdm login. Most often this happens while the gnome desktop is loading, but sometimes during the boot sequence when drivers are being loaded or the network interface is being brought up. I put this under the kernel component, because I assume that the kernel is partly responsible and also because a component needed to be selected.

Version-Release number of selected component (if applicable):
Fedora 8 x86_64 with all updates as of yesterday (2008/03/03). Kernel version 2.6.23.15-137.

How reproducible:
100%

Steps to Reproduce:
1. Enable AMD powernow in BIOS
2. Boot Fedora 8
3. If gdm login is presented without lock up, log in.

Actual results:
Usually the machine locks up, but I have seen it start dumping stack traces to the screen and sometimes it reboots during bootup.

Expected results:
I would expect powernow to be enabled and the system to be stable.

Additional info:
I have an AMD Athlon64 X2 5000+, 4 GB ram. When powernow is enabled, I see this message during boot:

pnpacpi: exceeded the max number of mem resources: 12

When powernow is disabled, I see a message during boot saying that powernow has been disabled in the BIOS. I have tried the 2.6.24.2 kernel from kernel.org and I get the same result. Let me know if I should provide any other information. Thanks.
I just pointed some folks from AMD at this bug; hopefully they'll have some ideas about what could be wrong. What motherboard do you have?
Asus M2N-SLI Deluxe
Nothing immediately springs to mind as a cause. Does changing PNP_MAX_RESOURCE (include/linux/pnp.h line 17) to something larger than 12 change anything?
I tried changing "PNP_MAX_MEM" from 12 to 15 with the 2.6.24.2 kernel and recompiling. I was not able to compile the 2.6.23.15-137.fc8-x86_64 kernel. Its default for "PNP_MAX_MEM" is 4, so I guess the error I had posted was from when I tried the 2.6.24.2 kernel.

[2.6.23.15-137.fc8-x86_64]# make && sudo make modules_install
  CHK     include/linux/version.h
  CHK     include/linux/utsrelease.h
make[1]: *** No rule to make target `missing-syscalls'.  Stop.
make: *** [prepare0] Error 2

When I booted, I no longer got the "pnpacpi: exceeded..." message, but my system still locked up. The first time I got to the gdm login before the lockup and the second time it locked up during the boot process.

Just to add more information, this is what is written to dmesg when I have powernow disabled in the BIOS:

powernow-k8: Found 1 AMD Athlon(tm) 64 X2 Dual Core Processor 5000+ processors (2 cpu cores) (version 2.00.00)
powernow-k8: MP systems not supported by PSB BIOS structure
powernow-k8: MP systems not supported by PSB BIOS structure
Can you turn off PowerNow from Linux, for example by setting the mode to performance instead of dynamic, and then turn on PowerNow in the BIOS? Then copy the dmesg output and post it to this bug. I just want to see what speeds and voltages the BIOS gives us for P-States.
Created attachment 296959 [details] dmesg with powernow enabled, ondemand governor as default in kernel
Created attachment 296960 [details] output of 'cat /proc/cpuinfo' with ondemand governor in use
I changed the default cpufreq governor from userspace to ondemand and it seems that I only get a lockup or reboot after logging in at the gdm login. If I switch to a virtual console after gdm has loaded and execute 'cpufreq-set -g performance', I can log in to my desktop and do not get a lockup or reboot. In fact, I'm currently running in that state while typing this. I am uploading the output of dmesg and the output of /proc/cpuinfo from before I ran cpufreq-set, so it was using the ondemand governor at that point.
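To confirm which governor each core is actually using before and after a change like the one above, the cpufreq sysfs files can be read directly. This is a minimal sketch, not something taken from the thread: the function name show_governors and its optional root-directory argument are made up for illustration, and it assumes the standard cpufreq layout under /sys/devices/system/cpu.

```shell
#!/bin/sh
# Print the active cpufreq governor and current frequency for every CPU.
# The optional first argument is a hypothetical override of the sysfs root,
# so the function can be tried against a fake directory tree.
show_governors() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    for dir in "$cpu_root"/cpu[0-9]*/cpufreq; do
        # Skip CPUs without cpufreq support (or an unmatched glob).
        [ -r "$dir/scaling_governor" ] || continue
        cpu=$(basename "$(dirname "$dir")")
        gov=$(cat "$dir/scaling_governor")
        freq=$(cat "$dir/scaling_cur_freq" 2>/dev/null || echo unknown)
        echo "$cpu governor=$gov freq_khz=$freq"
    done
}
```

Running show_governors with no argument on the affected machine should print one line per core, which makes it easy to check that both cores really switched governors.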
I should have specified that the above comment was with the 2.6.24.2 kernel. If there is a way I can get around the compile issue with the fedora 8 kernel, I will gladly try that kernel with the changes instead.
Looks like you have a newer RevG2 processor, and I am not seeing any PowerNow errata for this product. Did you check that you have the latest BIOS? The frequencies the BIOS gave us for P-States look good, but to see that the voltages line up I would need the OPN number printed on the CPU (which I admit is quite tedious to retrieve, so please check if there is a newer BIOS instead). Are you able to boot into runlevel 3 (edit /etc/inittab or add 3 to the kernel parameters at boot) and see if you can reproduce the lockup and get an error message from the console (best to get a serial console set up)?
I am using the latest stable BIOS for my motherboard. There is a newer beta BIOS available though. Would the OPN number be printed on any of the packaging or materials included with the processor or just on the processor itself? I did try switching to runlevel 3 and had more stability with the ondemand governor still the current default. I didn't test it for too long, but it was definitely stable for longer than runlevel 5 after logging in. Should I switch back to the userspace default (2.6.24.2 default) or leave it at ondemand? Should I create some load to see if that is what is causing the lockup? It seems like the lockup/reboot coincides with an increase in load or when the frequency has to be scaled up. I will leave my system in runlevel 3 for a longer period to see if I can reproduce the problem.
I left my system idle in runlevel 3 for half an hour or so with no problems a few different times. Once I created load, which I'm assuming would cause the cpu frequency to be dynamically raised, the machine rebooted or locked up shortly after. To create load, I recompiled the kernel with 'make -j3'. I found my OPN on my Certificate of Authenticity. It is ADO5000DSWOF.
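Since the lockups seem to follow a frequency change under load, one way to capture the last frequency before a hang is to log samples to disk while the load runs. The sketch below is only an illustration: log_freq_sample and its arguments are hypothetical names, and it assumes the driver exposes scaling_cur_freq under each CPU's cpufreq directory.

```shell
#!/bin/sh
# Append one timestamped frequency sample per CPU to a log file and flush it
# to disk, so the last sample has a chance of surviving a hard lockup.
# The second argument is a hypothetical override of the sysfs root.
log_freq_sample() {
    log_file="$1"
    cpu_root="${2:-/sys/devices/system/cpu}"
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue
        # Path is .../cpuN/cpufreq/scaling_cur_freq, so go up two levels.
        cpu=$(basename "$(dirname "$(dirname "$f")")")
        echo "$(date +%s) $cpu $(cat "$f")" >> "$log_file"
    done
    sync
}
```

A possible use is to run `while true; do log_freq_sample /var/tmp/freq.log; sleep 1; done &` in one console, start the 'make -j3' build, and inspect the tail of the log after the machine comes back up.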
Thanks for the OPN, this looks like an oddball (or at least I am not seeing it internally). Please check that these specs look right: http://products.amd.com/en-us/DesktopCPUDetail.aspx?id=41 I say it is odd because I am not finding it in the Thermal and Power Guide to verify that the BIOS is doing the right thing.
Yes those specs match my processor. Should I try the beta BIOS that is available or hold off on that for now?
Please try the beta BIOS. Additionally, could you test Windows and tell us whether it also has any instability problems? Thanks.
It sounds like your CPU might be experiencing undervoltage as it ramps up in frequency. If the beta BIOS doesn't resolve the issue, you might want to try setting the default governor to "powersave" and then enabling the userspace governor manually (as root): `echo "userspace" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor; echo "userspace" > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor` Then you can manually set the frequencies by running `echo -n $NEW_FREQ > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed` where $NEW_FREQ is the desired frequency in kHz (i.e., 1800000 for 1.8 GHz). It would be interesting to see at what frequency you begin to see instability. I suspect 2 GHz.
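The manual stepping described above can be scripted. The following is a rough sketch rather than a tested procedure: list_frequencies and set_userspace_speed are hypothetical helper names, the optional root-directory argument exists only so the functions can be exercised against a fake tree, and it assumes the governor files live under each CPU's cpufreq/ directory in sysfs and that the driver exposes scaling_available_frequencies.

```shell
#!/bin/sh
# List the frequencies (in kHz) that cpu0 advertises, lowest first.
list_frequencies() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    tr -s ' ' '\n' < "$cpu_root/cpu0/cpufreq/scaling_available_frequencies" \
        | grep -v '^$' | sort -n
}

# Switch every CPU to the userspace governor and pin it to one frequency.
# Needs root on a real system because it writes to sysfs.
set_userspace_speed() {
    freq_khz="$1"
    cpu_root="${2:-/sys/devices/system/cpu}"
    for dir in "$cpu_root"/cpu[0-9]*/cpufreq; do
        [ -w "$dir/scaling_governor" ] || continue
        echo userspace > "$dir/scaling_governor"
        echo "$freq_khz" > "$dir/scaling_setspeed"
    done
}
```

One way to use these is to loop over the output of list_frequencies, call set_userspace_speed for each value, and run a load such as a kernel build at each step; the first frequency at which the machine hangs or reboots is the one of interest.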
Joachim, I had to do a double take to make sure that you had just suggested in a Fedora Bugzilla bug that I use Windows to test stability. Thank you for the suggestions folks. I will give them a try.
Jeff, I'm just trying to figure out whether we (the OS) are doing something wrong or the BIOS is.
Mark, things check out with the voltages. Looking at parts that match this description (such as ADO5000IAA5DD), the voltage VIDs that come out of dmesg (which translate to 1.35, 1.3, 1.25, 1.2, 1.15 and 1.1 Volts) match up with the Thermal and Power Data Sheet.
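For reference, the VID-to-voltage translation mentioned above can be reproduced with simple arithmetic. This sketch assumes the K8 encoding used by the powernow-k8 driver, where VID 0 is 1.550 V and each step lowers the voltage by 25 mV. The VID codes in the loop are back-computed from the six voltages listed in the comment, not copied from the attached dmesg, so treat them as an assumption.

```shell
#!/bin/sh
# Convert a K8 VID code to millivolts: 1550 mV minus 25 mV per VID step.
vid_to_millivolts() {
    echo $((1550 - $1 * 25))
}

# VID codes here are inferred from the voltages quoted in the thread.
for vid in 8 10 12 14 16 18; do
    printf 'vid 0x%x -> %d mV\n' "$vid" "$(vid_to_millivolts "$vid")"
done
```

Comparing the driver's reported VIDs against the processor's data sheet this way is how a mismatch between BIOS-supplied P-state voltages and the rated values would show up.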
The beta BIOS looks to have helped. I first tested the 2.6.24.2 kernel and the system was stable and the system has been stable on the 2.6.23.15-137.fc8-x86_64 for 6+ hours.
I spoke too soon. It seems that after I loaded the beta BIOS, I set all of my previous BIOS settings except for the memory speeds. Once I set my memory to the speed and timings it is rated for, the system is unstable with powernow enabled. I have been running memtest86 on the machine for a few hours now with no issues and will continue to run it for a while longer. Does memtest86 use the memory timings from the BIOS or use different ones based on a check of the memory? I just want to make sure it is using the recommended speed and timings. I will also try gradually adjusting the frequencies to see if there is a point where the system becomes unstable.
memtest86 ran a number of passes (I think it was 15+) with no errors reported. In kernel 2.6.24.2 I could not select the powersave governor as the default, unless I am doing something wrong. I tried setting conservative as the default, but could not boot even to runlevel 3 without issues.
(In reply to comment #21)
> I spoke too soon. It seems that after I loaded the beta BIOS, I set all of my
> previous BIOS settings except for the memory speeds.

If you use the default BIOS memory settings then is the system stable with PowerNow?
(In reply to comment #23)
> (In reply to comment #21)
> > I spoke too soon. It seems that after I loaded the beta BIOS, I set all of my
> > previous BIOS settings except for the memory speeds.
>
> If you use the default BIOS memory settings then is the system stable with PowerNow?

The system is stable with PowerNow when the memory settings are set to auto, but not when I set the memory to the manufacturer recommended settings for latency and clock speed.
I'm sorry, but I am going to have to mark this as NOTABUG. The manufacturer stores the speed and latency the memory DIMM is rated for in a small ROM called the SPD, which the BIOS reads and uses to set the memory speed. You are in unsupported territory when you tweak the memory settings and end up with an unstable system.
I agree with you that the SPD should have the correct values. However, it seems from the manufacturer's PDF for this memory that the fastest tested latencies were not what was programmed in the SPD. I did not know that until I did further research on the manufacturer's website after your update today. It looks like the reason my system was not stable was that the voltage also needs to be changed. I'm not sure why they didn't just program those values into the SPD. Here is a PDF for the specific memory I have: http://www.corsair.com/_datasheets/TWIN2X2048-6400C4.pdf My system has been stable so far after changing the voltage. Thank you for looking into this. I just wanted to update to say that I was not using unsupported settings or tweaks.