Red Hat Bugzilla – Bug 739159
powernow-k8 transitions fail in xen dom0
Last modified: 2014-01-21 18:04:53 EST
Description of problem:
I performed a minimal install of Fedora 16 Beta RC1, added Xen and rebooted the system in Xen Domain 0. Following the install, I saw a constant stream of errors on the console. This made the console unusable.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Perform minimal install on Fedora 16 Beta RC1
powernow-k8: fid trans failed, fid 0x2, curr 0x0
powernow-k8: transition frequency failed
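To gauge how often the failure occurs, the "fid trans failed" lines can be pulled out of a saved log. This is only a rough sketch: the regular expression is built from the single message quoted above, and the helper names (`parse_fid_failure`, `count_fid_failures`) are made up for illustration.

```python
import re

# Pattern derived from the one message quoted in this report; both FIDs are hex.
FID_FAIL = re.compile(
    r"powernow-k8: fid trans failed, fid (0x[0-9a-fA-F]+), curr (0x[0-9a-fA-F]+)"
)


def parse_fid_failure(line):
    """Return (requested_fid, current_fid) as ints, or None if the line does not match."""
    match = FID_FAIL.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def count_fid_failures(lines):
    """Count how many lines are 'fid trans failed' messages."""
    return sum(1 for line in lines if parse_fid_failure(line) is not None)
```

Feeding it the lines of a saved console or syslog capture gives the number of failed transitions recorded.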
Installing the cpupowerutils package and running the cpupower service stopped the errors.
I don't see this package in comps. Seems like it should be added to either Core or Base.
The kernel shouldn't be so loud in the absence of this package.
There's no such thing as a cpupowerutils package. Did you mean cpufrequtils?
(In reply to comment #3)
> There's no such thing as a cpupowerutils package. Did you mean cpufrequtils?
cpupowerutils and cpufrequtils are both replaced by kernel-tools in f16 and newer. The kernel-tools package has Provides for both of the other packages.
editing summary to reflect the actual bug.
stopping the cpupower service is just hiding the real problem.
in f16, we switched powernow & ondemand to be built-in by default so we can do without some of the messy userspace setup. Unfortunately this means we enable it by default on xen, because we missed the one thing in the old cpuspeed init script that bailed out early if we're virtualised.
I'll come up with a kernel fix for this.
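For illustration, an early "are we virtualised?" bail-out of the kind described above might look roughly like the sketch below. It is not the check the old cpuspeed init script actually performed. It assumes the usual Xen interfaces are present: `/sys/hypervisor/type` containing `xen`, and `/proc/xen/capabilities` containing `control_d` in dom0. The function names are hypothetical.

```python
from pathlib import Path


def read_text_or_empty(path):
    """Read a small text file, returning '' if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return ""


def classify_xen(hypervisor_type, xen_capabilities):
    """Classify the environment from the contents of the two files.

    Assumption: 'xen' in /sys/hypervisor/type means we run under Xen, and
    'control_d' in /proc/xen/capabilities marks the control domain (dom0).
    """
    if hypervisor_type != "xen":
        return "none"
    return "dom0" if "control_d" in xen_capabilities else "domU"


if __name__ == "__main__":
    kind = classify_xen(
        read_text_or_empty("/sys/hypervisor/type"),
        read_text_or_empty("/proc/xen/capabilities"),
    )
    if kind != "none":
        print("running under Xen (%s); skipping cpufreq setup" % kind)
    else:
        print("no Xen detected; cpufreq setup may proceed")
```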
The kernel's doing nothing wrong here. This is a Xen bug. Either accessing this hardware in dom0 should work (in which case you wouldn't get an error), or Xen should be masking off the cpuid capability bits to indicate to the driver that it won't work.
I'm a little confused here, is this an issue to be fixed in the kernel or does it need to be reported upstream? (a quick search didn't find anything in xensource's bugzilla)
(In reply to comment #7)
> I'm a little confused here, is this an issue to be fixed in the kernel or does
> it need to be reported upstream? (a quick search didn't find anything in
> xensource's bugzilla)
NVM, I should have checked the history before making a comment. Reporting it upstream.
Filed upstream in the xensource bugzilla:
Thanks for posting it in the Xen BZ. The issue is .. well, it is a long explanation, so skip to the end for summary if you don't want to read technical jargon.
On AMD with Xen, the cpuidle is set to use the halt one (this is b/c the halt ends up doing a yield hypercall) - look in setup.c - and the hypervisor does the appropriate halt operation (MWAIT, halt, etc, or schedules another guest on the CPU). Anyhow, to not have the cpuidle trying to activate, the "boot_option_idle_override" is set. Therefore, the ACPI _PSS driver (processor.ko) ends up bailing out, b/c of that parameter. As such the "older" AMD pstate driver is invoked (powernow-k8), and the older driver attempts to use ACPI _PSD - but only if in UP mode, or it attempts to use the voltage tables - which are k8 or earlier. To detect that, it uses the MSR (sadly not CPUID values), which Xen traps and returns 00, which the powernow-k8 driver interprets as "buggy hardware - can't use". Which is exactly what you are seeing.
I believe (and sadly I don't have the hardware to check this - but I think I saw somebody using it) if you were running on K8 hardware - it ought to work.
Solution: Have the ACPI processor driver cooperate with Xen. Patches are in the queue for it (if you are really interested look in oss.oracle.com/kwilk/xen.git #devel/acpi-cpufreq.v2 - but they are not yet upstream-able material. Actually, they are quite ugly).
The "easy" option for right now would be to do what Dave suggested until the upstream patches are ready and reviewed.
Just to add, per Konrad's comment above.
This issue also occurs on K8 hardware (Athlon II 4850e) with powernow enabled in the BIOS (aka Cool'n'quiet).
Disabling Cool'n'quiet removes the messages.
I just performed the standard Desktop install of the F16 Beta, installed Xen, rebuilt the grub menu, and rebooted to dom0. Syslog is full of the same message per this bug.
It would be great to see Dave's solution (see comment 5 above), an interim kernel-patch solution to this bug - as it currently stands dom0 functionality is largely unusable.
So the patch to upload the power management data is in the upstream kernel (3.4-rc0): http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=d4c6fa73fe984e504d52f3d6bba291fd76fe49f7
and there are some other ones forthcoming so that the driver - xen-acpi-processor - will be the only one loading when booting under Xen. The kernel compile option is CONFIG_XEN_ACPI_PROCESSOR and by default it ought to be 'm'.
MA Young: this means that this patch (to load said driver) will have to be back-ported in Xen 4.1: http://lists.xen.org/archives/html/xen-devel/2012-03/msg02063.html
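To see whether a given kernel was built with the CONFIG_XEN_ACPI_PROCESSOR option mentioned above, its config file can be inspected. The following is a small sketch. It assumes the config is available as text, for example from `/boot/config-<version>` on Fedora, and the helper name `kconfig_value` is made up for illustration.

```python
def kconfig_value(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...).

    Returns None when the option is absent or appears only as a
    '# CONFIG_FOO is not set' comment.
    """
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


if __name__ == "__main__":
    sample = "CONFIG_XEN=y\nCONFIG_XEN_ACPI_PROCESSOR=m\n"
    print(kconfig_value(sample, "CONFIG_XEN_ACPI_PROCESSOR"))
```

A result of `m` means the driver was built as a module, in which case it still has to be loaded before it takes effect.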
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
With the most recent kernel and xen-4.1.2-7.fc16 or later this should be fixed.