Description of problem:
On an Intel host, the intel_rapl module is loaded by default. When a RHEL 7.1 guest with the same kernel is booted via qemu-kvm-rhev, the guest has no such module, and dmesg shows:
...
intel_rapl: no valid rapl domains found in package 0
...

Version-Release number of selected component (if applicable):
kernel-3.10.0-220.el7.x86_64
qemu-kvm-rhev-2.1.2-17.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the guest with either of:
# /usr/libexec/qemu-kvm -cpu SandyBridge ...
# /usr/libexec/qemu-kvm -cpu host ....

Actual results:
dmesg contains this log line:
intel_rapl: no valid rapl domains found in package 0
The guest has no intel_rapl module; it seems the guest does not support this module.

Expected results:
The host has the module:
# lsmod |grep intel_rapl
intel_rapl 18773 0
# ls /sys/devices/virtual/powercap/intel-rapl
enabled intel-rapl:0 power subsystem uevent
# lscpu | egrep -i 'model|family|stepping'
CPU family: 6
Model: 58
Model name: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
Stepping: 9
It should be passed through to the guest.
Additional info:
host# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
stepping : 9
microcode : 0x1b
cpu MHz : 1885.007
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips : 6784.76
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
...
I can reproduce it with both qemu-kvm-1.5.3-84.el7.x86_64 and qemu-kvm-rhev-2.1.2-17.el7.x86_64. I am not sure whether this is a qemu-kvm/-rhev bug; feel free to change the component if needed.
RAPL is "Running Average Power Limit" and is related to power management. It makes sense that it is not available on a virtual machine. I suggest that either this message should not be printed on a virtual machine, or this routine, rapl_detect_domains(), should not be called when CPUs are not capable of power management.

Eduardo, is there a bit set for the CPU that should not be set? Why is the code even being executed?

Steve? I see that you backported this code to RHEL 7.1...
The RAPL MSRs are non-architectural and the probing is simply based on CPU model/families, so we can't explicitly tell the guest that the feature is unavailable. Skipping the warning on virtual machines makes sense to me, but do we know how/if other modules for non-architectural features avoid this kind of noise when running in VMs?
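To make the point about model-based probing concrete, here is a minimal sketch (not the driver's code) that separates the two questions a guest could ask: "does my family/model normally imply RAPL?" and "am I running under a hypervisor?". The model table is an illustrative subset containing only the two models that appear in this report (0x2a and 0x3a), and the function names are hypothetical. The `hypervisor` flag in /proc/cpuinfo reflects CPUID leaf 1, ECX bit 31.

```python
def parse_cpuinfo(text):
    """Parse the first processor entry of /proc/cpuinfo-style text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # setdefault keeps the first occurrence, i.e. processor 0.
        fields.setdefault(key.strip(), value.strip())
    return {
        "family": int(fields["cpu family"]),
        "model": int(fields["model"]),
        "stepping": int(fields["stepping"]),
        "flags": set(fields.get("flags", "").split()),
    }


# Assumption: illustrative subset only (Sandy Bridge 0x2a, Ivy Bridge 0x3a),
# not the real driver's model table.
ILLUSTRATIVE_RAPL_MODELS = {(6, 0x2A), (6, 0x3A)}


def model_suggests_rapl(info):
    """True if family/model alone would make a driver expect RAPL."""
    return (info["family"], info["model"]) in ILLUSTRATIVE_RAPL_MODELS


def under_hypervisor(info):
    """True if the CPU advertises the hypervisor bit (CPUID.1:ECX[31])."""
    return "hypervisor" in info["flags"]


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        info = parse_cpuinfo(f.read())
    print("model suggests RAPL:", model_suggests_rapl(info))
    print("under hypervisor:   ", under_hypervisor(info))
```

On the host in this report the first check would be true and the second false; in a guest started with -cpu SandyBridge both would typically be true, which is exactly the case where the probe finds no domains.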
(In reply to Eduardo Habkost from comment #4)
> Skipping the warning on virtual machines makes sense to me, but do we know
> how/if other modules for non-architectural features avoid this kind of noise
> when running in VMs?

This is not an intel_rapl driver bug. It is yet another case where a virtualized guest (which is supposed to behave the same as a physical system) is different from a physical system.

From a user-experience point of view I can see why you don't want to see this message. From a kernel engineer's point of view, however, I would argue the kernel is doing the right thing. The kernel is not making a decision based on the system being virtual.

If the virt team wants a change here they need to do it in a global manner so that platform features, such as intel_rapl, mce, etc., do not load on virtualized environments. That is something outside of the scope of intel_rapl, IMO.

P.
> It is yet another case of where a virtualized guest (which is supposed to
> behave the same as a physical system)

No, it is yet another case where drivers match generically on CPU family/model/stepping, despite the CPUID instruction existing for a reason.

If Intel doesn't want to add precious CPUID bits they should make these messages not errors, because it's perfectly okay for a virt environment to not support power measurement.

However...

> If the virt team wants a change here they need to do it in a global manner so
> that platform features, such as intel_rapl, mce, etc., do not load on
> virtualized environments.

... MCE is supported in a virt environment.
(In reply to Paolo Bonzini from comment #6)
> > It is yet another case of where a virtualized guest (which is supposed to
> > behave the same as a physical system)
>
> No, it is yet another case where drivers match generically on CPU
> family/model/stepping, despite the CPUID instruction existing for a reason.
>
> If Intel doesn't want to add precious CPUID bits they should make these
> messages not errors, because it's perfectly okay for a virt environment to
> not support power measurement.

virt is the thing that is obfuscating that functionality which, according to the HW manufacturer, is supposed to exist. That it doesn't exist is correctly flagged as a warning/error. The code expects the functionality to be there for that core.

P.
> virt is the thing that is obfuscating that functionality which, according to
> the HW manufacturer, is supposed to exist.

virt existed for several years before RAPL, so the manufacturer should have known better. There are CPUID bits for stuff like thermal monitoring or SpeedStep; RAPL shouldn't have been any different.

If Intel says we should be using core2 f/m/s and just tweak the features, I'm all for it, but I'd be surprised to learn about that 10 years after VT-x was introduced. (I'd also be surprised if nothing broke...)

We can ask what Intel suggests on virt-intel-list.
(In reply to Paolo Bonzini from comment #8)
> > virt is the thing that is obfuscating that functionality which, according to
> > the HW manufacturer, is supposed to exist.
>
> virt existed for several years before RAPL, so the manufacturer should have
> known better. There are CPUID bits for stuff like thermal monitoring or
> SpeedStep, RAPL shouldn't have been any different.

Your argument basically comes down to this: You're asking me (and upstream) to modify _every_ platform driver with a is_virt() check.

> If Intel says we should be using core2 f/m/s and just tweak the features,
> I'm all for it, but I'd be surprised to learn about that 10 years after VT-x
> was introduced. (I'd also be surprised if nothing broke...).
>
> We can ask what Intel suggests on virt-intel-list.

Please cc me. I'd like to see the discussion.

P.
> Your argument basically comes down to this: You're asking me (and upstream) to
> modify _every_ platform driver with a is_virt() check.

Only because Intel is being sloppy tying features to f/m/s, instead of introducing CPUID bits (or for that matter MSR bits like IA32_PERF_CAPABILITIES, IA32_MCG_CAP, IA32_MTRR_CAP and perhaps others I don't know about).

It's also stupid that we have to do patches like commit 64c7569c0655 or commit a97ac35b5d.
(In reply to Paolo Bonzini from comment #10)
> > Your argument basically comes down to this: You're asking me (and upstream) to
> > modify _every_ platform driver with a is_virt() check.
>
> Only because Intel is being sloppy tying features to f/m/s, instead of
> introducing CPUID bits (or for that matter MSR bits like
> IA32_PERF_CAPABILITIES, IA32_MCG_CAP, IA32_MTRR_CAP and perhaps others I
> don't know about).
>
> It's also stupid that we have to do patches like commit 64c7569c0655 or
> commit a97ac35b5d.

But a CPUID bit is for capabilities that may or may not exist on a processor (e.g. hyperthreading). In this case they are saying the processor _will_ support this feature. And if you emulate it without the feature then yes, you're going to see a warning.

P.
> CPUID bit is for capabilities that may or may not exist on a processor. ex)
> hyperthreading

No, I disagree. What about SSE? It certainly exists in any PPro+ family/model/stepping. RDRAND? It certainly exists in any Ivy Bridge or newer family/model/stepping. I'm fairly sure they wouldn't take out SSE or RDRAND in newer processors...
(In reply to Paolo Bonzini from comment #12)
> > CPUID bit is for capabilities that may or may not exist on a processor. ex)
> > hyperthreading
>
> No, I disagree. What about SSE? It certainly exists in any PPro+
> family/model/stepping. RDRAND? It certainly exists in any Ivy Bridge or
> newer family/model/stepping. I'm fairly sure they wouldn't take out SSE or
> RDRAND in newer processors...

RDRAND (and SSE for that matter) can be disabled on the processor via BIOS. I've had to do it before for RDRAND. The cpuflags still show RDRAND when disabled because the processor supports it. You're mistaking CPUID for functionality.

In the case of rapl the processor supports it and IIUC (it has been a while since I've looked at the power spec) cannot be disabled. The MSRs are always there.

P.
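For readers who want to check whether the RAPL MSRs are actually readable on a given system, the following is a rough sketch, assuming the msr kernel module is loaded and the script runs as root. It reads MSR_RAPL_POWER_UNIT (0x606 in the Intel SDM) through /dev/cpu/N/msr and decodes the unit fields. The helper names are hypothetical, and what a guest sees depends on the hypervisor: the read may fail or return a value with no usable domains behind it.

```python
import struct

MSR_RAPL_POWER_UNIT = 0x606  # register address as documented in the Intel SDM


def decode_rapl_power_unit(raw):
    """Decode the unit fields; each unit is 1 / 2**field."""
    return {
        "power_watts": 1.0 / (1 << (raw & 0xF)),             # bits 3:0
        "energy_joules": 1.0 / (1 << ((raw >> 8) & 0x1F)),   # bits 12:8
        "time_seconds": 1.0 / (1 << ((raw >> 16) & 0xF)),    # bits 19:16
    }


def read_msr(cpu, reg):
    """Read one 64-bit MSR via the msr driver (needs root and 'modprobe msr')."""
    with open(f"/dev/cpu/{cpu}/msr", "rb", buffering=0) as f:
        f.seek(reg)  # the file offset selects the MSR number
        return struct.unpack("<Q", f.read(8))[0]


if __name__ == "__main__":
    try:
        raw = read_msr(0, MSR_RAPL_POWER_UNIT)
    except OSError as exc:
        # Missing device node, no permission, or the MSR read was rejected.
        print("MSR_RAPL_POWER_UNIT not readable:", exc)
    else:
        print(hex(raw), decode_rapl_power_unit(raw))
```

As a worked example, a raw value of 0xA1003 decodes to a power unit of 1/8 W, an energy unit of 1/65536 J, and a time unit of 1/1024 s.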
Patch posted here: http://article.gmane.org/gmane.linux.kernel/2008522
Moving to 7.3
Hi Prarit,

There is a public holiday in China from 9.3 to 9.5. Qian will probably reply to the needinfo after the holiday.

Best Regards,
Juny
Hi, Prarit

I tried this scratch build; it does not fix the bug.

# uname -r
3.10.0-309.el7UNSUPPORTED_1238347.x86_64
# dmesg |egrep -i "intel|rapl"
[ 0.140308] smpboot: CPU0: Intel Xeon E312xx (Sandy Bridge) (fam: 06, model: 2a, stepping: 01)
[ 0.741864] intel_idle: does not run on family 6 model 42
[ 2.409776] snd_hda_intel 0000:00:04.0: irq 39 for MSI/MSI-X
[ 2.413155] intel_rapl: no valid rapl domains found in package 0
[ 2.428077] intel_rapl: no valid rapl domains found in package 0

Qian
Qian, try with "quiet" on the command line. The message is still printed, but it's informative only.
(In reply to Paolo Bonzini from comment #22)
> Qian,
>
> try with "quiet" on the command line. The message is still printed, but
> it's informative only.

Hi, Paolo

"quiet" is always on the kernel command line during my tests, and the message is still there, just like in comment 21.

# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.10.0-309.el7UNSUPPORTED_1238347.x86_64 root=/dev/mapper/rhel_dhcp--9--171-root ro crashkernel=auto rd.lvm.lv=rhel_dhcp-9-171/root rd.lvm.lv=rhel_dhcp-9-171/swap rhgb quiet LANG=en_US.UTF-8
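The exchange above mixes up two places a kernel message can appear: the console and the dmesg ring buffer. The sketch below models the console filter only, under the assumption (taken from Paolo's comment, not verified against the scratch build) that the message was demoted to an informational level. The level numbers are the standard printk levels, and "quiet" lowers the console loglevel to 4; the function name is hypothetical.

```python
# Standard printk severity levels (lower number = more severe).
KERN_LEVELS = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}

CONSOLE_LOGLEVEL_DEFAULT = 7  # kernel default when "quiet" is not given
CONSOLE_LOGLEVEL_QUIET = 4    # value selected by "quiet" on the command line


def shown_on_console(level, console_loglevel):
    """A message reaches the console only if its level is below the console loglevel."""
    return KERN_LEVELS[level] < console_loglevel


if __name__ == "__main__":
    for level in ("err", "info"):
        print(level,
              "default:", shown_on_console(level, CONSOLE_LOGLEVEL_DEFAULT),
              "quiet:", shown_on_console(level, CONSOLE_LOGLEVEL_QUIET))
```

With "quiet", an error-level message still reaches the console while an info-level one does not. The dmesg command reads the ring buffer, which is not subject to this filter, so the line shows up in dmesg output in both cases.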
Is this still an issue? P.
Not sure how much this is worth the effort but I'm still seeing this on a RHEL 7.3 Beta guest:

[ 1.373153] Error: Driver 'pcspkr' is already registered, aborting...
[ 1.456730] intel_rapl: no valid rapl domains found in package 0

Blacklisting the modules is a straightforward solution to get rid of these messages. If that should be the recommended / general solution, please feel free to close the BZ, or if you wish me to test a kernel build with a code fix just let me know.

Thanks.
It's already in the upstream: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=e1a27e87a35cd6bb1087bd8f95a4be5a11e95f76
Created attachment 1241315 [details] RHEL PATCH 1/1
Patch(es) committed on kernel repository and an interim kernel build is undergoing testing
Patch(es) available on kernel-3.10.0-561.el7
Verifying as SanityOnly.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:1842