Bug 477247
| Field | Value |
|---|---|
| Summary | KVM initialization function vmx_check_processor_compat() needs to print more diagnostics when CPUs not compatible |
| Product | [Fedora] Fedora |
| Component | kvm |
| Version | 10 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED WONTFIX |
| Severity | medium |
| Priority | low |
| Reporter | Nathan Watson <nfwatson> |
| Assignee | Glauber Costa <gcosta> |
| QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| CC | berrange, clalance, gcosta, markmc, nfwatson, virt-maint |
| Target Milestone | --- |
| Target Release | --- |
| Doc Type | Bug Fix |
| Last Closed | 2009-12-18 07:20:29 UTC |
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Nathan Watson
2008-12-19 20:46:41 UTC
By the way, Intel-VT is enabled in the BIOS ... the BIOS settings are identical now with 2 CPUs as they were when there was only 1 CPU.

Glauber: have a good day.

Created attachment 327500
output of "cat /proc/cpuinfo", demonstrates all 8 cores (2 CPUs x 4 cores) have vmx and otherwise should be compatible

Apparently I forgot to attach the output of "cat /proc/cpuinfo" ... here it is.
I pulled the source for kvm-74-6.fc10.src.rpm (the version I'm using). In tracing the source code that emits the kernel/kernel-module-level error messages (NOTE: I am not a kernel hacker, I know minimal stuff about this), I see something that's confusing:

* the error message "kvm: CPU %d feature inconsistency!" is generated in file .../x86/vmx.c in function vmx_check_processor_compat()
* the function vmx_check_processor_compat() compares a LOCAL in-function structure "struct vmcs_config vmcs_conf" against a GLOBAL structure "static struct vmcs_config"
* the comparison is done with a straightforward (0 == memcmp(..., ........))
* the GLOBAL "static struct vmcs_config" is initialized once by a call to hardware_setup() in the same file, probably in context of one of the two physical CPUs, probably long before function vmx_check_processor_compat() is called for each CPU (QUESTION: is it possible this function is called in context of each core and we only see the info for 2 of the 8 available cores before KVM gives up?)
* both hardware_setup() and vmx_check_processor_compat() use the same function setup_vmcs_config() to initialize their respective GLOBAL and/or LOCAL copies of the structure
* presumably the GLOBAL copy and ONE of the LOCAL copies of the "vmcs_config" structure will have been initialized in the context of the same CPU
* ... AND YET, THE ERROR OCCURS NOT ONLY FOR THE NON-IDENTICAL CPUs, BUT THE ERROR SHOWS UP FOR THE SAME CPU ALSO

This leads me to suspect that the function setup_vmcs_config() is not initializing the structure properly, in a way such that after relevant details are filled in, memcmp(...) returns 0.

... to further elaborate on my last comment, the KVM module initialization appears to do the following (with a little speculation):

* initialize GLOBAL_vmcs_config in context of CPU #0 in function hardware_setup()
* ... later, for CPU #0:
  * initialize LOCAL_vmcs_config in context of CPU #0
  * compare GLOBAL_vmcs_config with LOCAL_vmcs_config
  * leads to ERROR, "kvm: CPU 0 feature inconsistency!"
* ... and then (or else in parallel), for CPU #1:
  * initialize LOCAL_vmcs_config in context of CPU #1
  * compare GLOBAL_vmcs_config with LOCAL_vmcs_config
  * leads to ERROR, "kvm: CPU 1 feature inconsistency!"

I can see a situation where the second error message would come about, but I have a hard time seeing why the first one, about CPU #0, is printed ... they both happen in context of CPU #0 (or one of the CPUs, not sure which). I'd expect setup_vmcs_config() to generate identical structures for both of them.

I don't know how to build/deploy kernel modules; now I guess is the time to learn. I'm not sure whether all fields that make it into the "struct vmcs_config" are initialized correctly. Even if they are, I'm also not sure what the ALIGNMENT CONSIDERATIONS are on the x86_64 platform ... since there's a (0 == memcmp(...)) going on, it could be that some gaps exist in the structure between elements and that even though for all fields in ...

    static struct vmcs_config {
        int size;
        int order;
        u32 revision_id;
        u32 pin_based_exec_ctrl;
        u32 cpu_based_exec_ctrl;
        u32 cpu_based_2nd_exec_ctrl;
        u32 vmexit_ctrl;
        u32 vmentry_ctrl;
    } vmcs_config;

... a comparison would yield equality, the overall (0 == memcmp()) MIGHT FAIL!!! Just some suggestions for whoever's looking into this. I'll keep probing.

Of course, it could be that the two separate "Intel Xeon quad-core E5310 1.6GHz" chips are incompatible ... if that's so I should just shoot myself.

Well, I guess I need to shoot myself. I patched the in-kernel KVM code to list values for all fields of the "global" and "local" vmcs_config structures that failed equality within function vmx_check_processor_compat(). The result:

    kernel: kvm: CPU 0 feature inconsistency! _MODIFIED_BY_NFW_TO_SHOW_FIELDS_
    kernel: kvm: kvm_NFW_vmcs_config_version: GLOBAL
    kernel: kvm: size = 2048
    kernel: kvm: order = 0
    kernel: kvm: revision_id = 11
    kernel: kvm: pin_based_exec_ctrl = 0x3f
    kernel: kvm: cpu_based_exec_ctrl = 0x96a1e1fa
    kernel: kvm: cpu_based_2nd_exec_ctrl = 0x1
    kernel: kvm: vmexit_ctrl = 0x36fff
    kernel: kvm: vmentry_ctrl = 0x11ff
    kernel: kvm: kvm_NFW_vmcs_config_version: LOCAL_
    kernel: kvm: size = 1024
    kernel: kvm: order = 0
    kernel: kvm: revision_id = 7
    kernel: kvm: pin_based_exec_ctrl = 0x1f
    kernel: kvm: cpu_based_exec_ctrl = 0x16a1e1fa
    kernel: kvm: cpu_based_2nd_exec_ctrl = 0x0
    kernel: kvm: vmexit_ctrl = 0x36fff
    kernel: kvm: vmentry_ctrl = 0x11ff

Looks like my two Intel Xeon Quad-Core E5310 processors don't match up.

... from http://www.intel.com/support/motherboards/server/sb/CS-022346.htm#5300 it looks like there are SEVERAL VERSIONS of E5310, including:

    E5310 SLACB 1.60 GHz 1066 MHz 8 MB B3 80
    E5310 SL9XR 1.60 GHz 1066 MHz 8 MB B3 80
    E5310 SLAEM 1.60 GHz 1066 MHz 8 MB G0 80

Looks like I purchased an "older" processor a long time ago, either the "SLACB" or "SL9XR", and paired it with a newer "SLAEM" processor. Yeah! I put an ox and a donkey together in the yoke, it just don't work. The complaint from the KVM in-kernel initialization looks like it's valid.

It would be nice, when there's a "feature inconsistency" between processors, if the KVM in-kernel initialization printed out the full field content from the two "" compared in function "vmx_check_processor_compat()". I'm changing the summary from "2 Identical CPUs not recognized as Intel-VT & KVM-compatible" to "KVM initialization function vmx_check_processor_compat() needs to print more diagnostics when CPUs not compatible" and downgrading the severity.

Something along the lines of what's in comment #6 would be nice. Would you mind submitting a patch for it upstream? If it's debated and accepted, we'd be more than happy to include it in upcoming versions of Fedora.
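For illustration only, the kind of diagnostics requested here could look roughly like the sketch below. It is a userspace approximation, not the patch from comment #6 and not the actual KVM code: the structure is copied from the definition quoted in this report with u32 replaced by uint32_t, fprintf() stands in for the kernel's printk(), and the names report_field, REPORT and vmcs_config_report_diff, as well as the message format, are hypothetical. The static_assert also speaks to the alignment question raised earlier: where it compiles, the structure has no padding bytes, so memcmp() is comparing field contents only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Userspace copy of the structure quoted in this report (u32 -> uint32_t). */
struct vmcs_config {
    int size;
    int order;
    uint32_t revision_id;
    uint32_t pin_based_exec_ctrl;
    uint32_t cpu_based_exec_ctrl;
    uint32_t cpu_based_2nd_exec_ctrl;
    uint32_t vmexit_ctrl;
    uint32_t vmentry_ctrl;
};

/* Compiles only if the struct is exactly the sum of its members, i.e. has
 * no padding that memcmp() could trip over. */
static_assert(sizeof(struct vmcs_config) ==
                  2 * sizeof(int) + 6 * sizeof(uint32_t),
              "struct vmcs_config contains padding bytes");

/* Hypothetical helper: print one field if it differs; return 1 if it did. */
static int report_field(FILE *out, int cpu, const char *name,
                        unsigned long global_val, unsigned long local_val)
{
    if (global_val == local_val)
        return 0;
    fprintf(out, "kvm: CPU %d: %s differs: global=0x%lx local=0x%lx\n",
            cpu, name, global_val, local_val);
    return 1;
}

/* Hypothetical convenience macro: compare the same member in both structs. */
#define REPORT(field)                                              \
    (mismatches += report_field(out, cpu, #field,                  \
                                (unsigned long)global->field,      \
                                (unsigned long)local->field))

/* Compare field by field instead of with one memcmp(); print every
 * differing field and return the number of differing fields. */
int vmcs_config_report_diff(FILE *out, int cpu,
                            const struct vmcs_config *global,
                            const struct vmcs_config *local)
{
    int mismatches = 0;

    REPORT(size);
    REPORT(order);
    REPORT(revision_id);
    REPORT(pin_based_exec_ctrl);
    REPORT(cpu_based_exec_ctrl);
    REPORT(cpu_based_2nd_exec_ctrl);
    REPORT(vmexit_ctrl);
    REPORT(vmentry_ctrl);
    return mismatches;
}
```

Fed the GLOBAL and LOCAL values from the log above, this sketch would flag five fields (size, revision_id, pin_based_exec_ctrl, cpu_based_exec_ctrl and cpu_based_2nd_exec_ctrl) and stay silent about order, vmexit_ctrl and vmentry_ctrl, which match.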
This message is a reminder that Fedora 10 is nearing its end of life. Approximately 30 (thirty) days from now Fedora will stop maintaining and issuing updates for Fedora 10. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 10 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora please change the 'version' of this bug to the applicable version. If you are unable to change the version, please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete. The process we are following is described here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.