Bug 739159

Summary: powernow-k8 transitions fail in xen dom0
Product: [Fedora] Fedora
Reporter: W. Michael Petullo <mike>
Component: xen
Assignee: Michael Young <m.a.young>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 16
CC: berrange, gansalmon, hilton.day, itamar, jfeeney, jforbes, joe, jonathan, kernel-maint, ketuzsezr, kraxel, madhu.chinakonda, martin, m.a.young, notting, rtc, tflink, vanmeeuwen+fedora, virt-maint, xen-maint
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-08-07 21:27:04 UTC Type: ---

Description W. Michael Petullo 2011-09-16 15:54:56 UTC
Description of problem:
I performed a minimal install of Fedora 16 Beta RC1, added Xen and rebooted the system in Xen Domain 0. Following the install, I saw a constant stream of errors on the console. This made the console unusable.

Version-Release number of selected component (if applicable):


How reproducible:
Every time

Steps to Reproduce:
1. Perform a minimal install of Fedora 16 Beta RC1
2. Install Xen and reboot into Xen Domain 0
  
Actual results:
powernow-k8: fid trans failed, fid 0x2, curr 0x0
powernow-k8: transition frequency failed

Expected results:


Additional info:
Installing the cpupowerutils package and running the cpupower service stopped the errors.

Comment 1 Chris Lumens 2011-09-16 16:20:02 UTC
I don't see this package in comps.  Seems like it should be added to either Core or Base.

Comment 2 Bill Nottingham 2011-09-19 20:02:55 UTC
The kernel shouldn't be so loud in the absence of this package.

Comment 3 Chuck Ebbert 2011-09-20 11:38:48 UTC
There's no such thing as a cpupowerutils package. Did you mean cpufrequtils?

Comment 4 Josh Boyer 2011-09-20 12:43:23 UTC
(In reply to comment #3)
> There's no such thing as a cpupowerutils package. Did you mean cpufrequtils?

cpupowerutils and cpufrequtils are both replaced by kernel-tools in f16 and newer.  The kernel-tools package has Provides for both of the other packages.

Comment 5 Dave Jones 2011-09-20 15:46:09 UTC
editing summary to reflect the actual bug.

stopping the cpupower service is just hiding the real problem.

in f16, we switched powernow & ondemand to be built-in by default so we can do without some of the messy userspace setup. Unfortunately this means we enable it by default on xen, because we missed the one thing in the old cpuspeed init script that bailed out early if we're virtualised.

I'll come up with a kernel fix for this.
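
The kind of early bail-out the old cpuspeed init script performed can be sketched roughly as follows. This is a minimal Python illustration, not the original script and not the eventual kernel fix: the function name `running_under_xen` and its `root` parameter are hypothetical, and it assumes the usual Xen indicator, `/sys/hypervisor/type` reading `xen`.

```python
import os


def running_under_xen(root="/"):
    """Return True if the system rooted at `root` reports Xen as its hypervisor.

    Assumption: Xen-aware kernels expose sys/hypervisor/type containing "xen".
    The `root` parameter exists only so the check can be tested on a fake tree.
    """
    hyp_type = os.path.join(root, "sys/hypervisor/type")
    try:
        with open(hyp_type) as f:
            return f.read().strip() == "xen"
    except OSError:
        # File missing or unreadable: treat as "not running under Xen".
        return False


if __name__ == "__main__":
    if running_under_xen():
        print("Xen detected: skipping CPU frequency scaling setup")
    else:
        print("No Xen detected: frequency scaling setup may proceed")
```

A userspace check like this only mirrors what the init script did; with the drivers built in, the equivalent test has to happen inside the kernel.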

Comment 6 Matthew Garrett 2011-09-20 15:55:06 UTC
The kernel's doing nothing wrong here. This is a Xen bug. Either accessing this hardware in dom0 should work (in which case you wouldn't get an error), or Xen should be masking off the cpuid capability bits to indicate to the driver that it won't work.
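
To illustrate the second option, here is a minimal Python sketch of masking capability bits. The bit positions are assumptions recalled from the AMD CPUID leaf 0x80000007 (EDX) layout, where FID and VID control are taken to be bits 1 and 2 and hardware P-state control bit 7. The function names are hypothetical, and this is neither Xen's nor the driver's actual code.

```python
# Assumed bit layout of CPUID leaf 0x80000007, EDX (AMD power management).
FID_CONTROL = 1 << 1   # frequency ID control
VID_CONTROL = 1 << 2   # voltage ID control
HW_PSTATE = 1 << 7     # hardware P-state control
PSTATE_BITS = FID_CONTROL | VID_CONTROL | HW_PSTATE


def mask_pstate_bits(edx):
    """Return EDX as a hypervisor might present it, with P-state bits hidden."""
    return edx & ~PSTATE_BITS


def driver_would_bind(edx):
    """Model a driver check: bind only if FID and VID control are both advertised."""
    needed = FID_CONTROL | VID_CONTROL
    return (edx & needed) == needed


if __name__ == "__main__":
    raw = 0x0000000F  # example value only, not read from real hardware
    print("unmasked:", driver_would_bind(raw))
    print("masked:  ", driver_would_bind(mask_pstate_bits(raw)))
```

With the bits masked, a driver that keys off them would decline to load instead of failing on every transition.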

Comment 7 Tim Flink 2011-10-07 13:35:48 UTC
I'm a little confused here, is this an issue to be fixed in the kernel or does it need to be reported upstream? (a quick search didn't find anything in xensource's bugzilla)

Comment 8 Tim Flink 2011-10-07 13:40:03 UTC
(In reply to comment #7)
> I'm a little confused here, is this an issue to be fixed in the kernel or does
> it need to be reported upstream? (a quick search didn't find anything in
> xensource's bugzilla)

NVM, I should have checked the history before making a comment. Reporting it upstream.

Comment 9 Tim Flink 2011-10-07 13:52:01 UTC
Filed upstream in the xensource bugzilla:

http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1789

Comment 10 Konrad Rzeszutek Wilk 2011-10-07 15:18:57 UTC
Tim,

Thanks for posting it in the Xen BZ. The issue is .. well, it is a long explanation, so skip to the end for summary if you don't want to read technical jargon.

On AMD with Xen, cpuidle is set to use the halt routine (this is b/c the halt ends up doing a yield hypercall) - look in setup.c - and the hypervisor does the appropriate halt operation (MWAIT, halt, etc., or schedules another guest on the CPU). Anyhow, to keep cpuidle from trying to activate, "boot_option_idle_override" is set. Because of that parameter, the ACPI _PSS driver (processor.ko) ends up bailing out. As such, the "older" AMD pstate driver (powernow-k8) is invoked, and the older driver attempts to use ACPI _PSD - but only if in UP mode - or it attempts to use the voltage tables, which are K8 or earlier. To detect that, it uses the MSR (sadly not CPUID values), which Xen traps and returns 00, which the powernow-k8 driver interprets as "buggy hardware - can't use". Which is exactly what you are seeing.
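
The failure mode described above can be modelled in a few lines. This Python sketch assumes, from memory of the powernow-k8 driver rather than from anything in this report, that the current FID is the low six bits of the status MSR; all names are hypothetical. If the trapped MSR read yields 0, the decoded FID is 0 and never matches a non-zero request, which is consistent with the first message in the description.

```python
CURRENT_FID_MASK = 0x3F  # assumed: low six bits of the status MSR hold the FID


def decode_current_fid(msr_low):
    """Extract the current frequency ID from the low word of the status MSR."""
    return msr_low & CURRENT_FID_MASK


def check_fid_transition(requested_fid, msr_low):
    """Return None on success, or driver-style error text on a mismatch."""
    curr = decode_current_fid(msr_low)
    if curr != requested_fid:
        return "fid trans failed, fid 0x%x, curr 0x%x" % (requested_fid, curr)
    return None


if __name__ == "__main__":
    # A trapped MSR read that yields 0 makes every non-zero request fail.
    print(check_fid_transition(0x2, 0x0))
```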

I believe (and sadly I don't have the hardware to check this, but I think I saw somebody using it) that if you were running on K8 hardware, it ought to work.

Solution: Have the ACPI processor driver cooperate with Xen. Patches are in the queue for it (if you are really interested look in oss.oracle.com/kwilk/xen.git #devel/acpi-cpufreq.v2 - but they are not yet upstream-able material. Actually, they are quite ugly).

Other solution:
The "easy" option for right now would be to do what Dave suggests in comment 5 until the upstream patches are ready and reviewed.

Comment 11 Hilton Day 2011-10-09 05:55:39 UTC
Just to add, per Konrad's comment above.

This issue also occurs on K8 hardware (Athlon II 4850e) with powernow enabled in the BIOS (aka Cool'n'quiet).

Disabling Cool'n'quiet removes the messages.

I just performed the standard Desktop install of the F16 Beta, installed Xen, rebuilt the grub menu, and rebooted to dom0.  Syslog is full of the same message per this bug.

It would be great to see Dave's solution (see comment 5 above) as an interim kernel patch for this bug. As it currently stands, dom0 functionality is largely unusable.

Comment 12 Konrad Rzeszutek Wilk 2012-03-23 21:10:33 UTC
So the patch to upload the power management data is in the upstream kernel (3.4-rc0): http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=d4c6fa73fe984e504d52f3d6bba291fd76fe49f7

and there are some other ones forthcoming so that the driver, xen-acpi-processor, will be the only one loading when booting under Xen. The kernel compile option is CONFIG_XEN_ACPI_PROCESSOR, and by default it ought to be 'm'.
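
To check how a given kernel was built, the option can be read from its config file. This is a minimal Python sketch, assuming the config is available as text (commonly under /boot as config-<version>, though the location varies by distribution); `config_value` is a hypothetical helper and the sample text is illustrative only.

```python
def config_value(config_text, option):
    """Return the value of `option` (e.g. 'y' or 'm'), or None if it is not set."""
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    # Absent, or present only as a "# CONFIG_... is not set" comment.
    return None


if __name__ == "__main__":
    sample = "CONFIG_XEN=y\nCONFIG_XEN_ACPI_PROCESSOR=m\n"  # illustrative text
    print(config_value(sample, "CONFIG_XEN_ACPI_PROCESSOR"))
```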

MA Young: this means that this patch (to load said driver) will have to be back-ported to Xen 4.1: http://lists.xen.org/archives/html/xen-devel/2012-03/msg02063.html

Comment 13 Fedora Admin XMLRPC Client 2012-05-15 19:37:18 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 14 Michael Young 2012-08-07 21:27:04 UTC
With the most recent kernel and xen-4.1.2-7.fc16 or later this should be fixed.