Bug 676197

Summary: libvirt incorrectly identifies Nehalem if Execute Disable is not enabled in BIOS
Product: Red Hat Enterprise Linux 6 Reporter: Mark Wagner <mwagner>
Component: libvirt Assignee: Scott Radvan <sradvan>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.1 CC: berrange, eblake, jcooper, jdenemar, john.cooper, mwagner, rlandman, tburke, xen-maint, yoyzhang
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 676631 Environment:
Last Closed: 2011-02-18 16:30:53 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 676631    

Description Mark Wagner 2011-02-09 03:29:38 UTC
Description of problem:
When using libvirt-based tools such as virsh or virt-manager, a Nehalem-based system is incorrectly identified as a Pentium 3.

Version-Release number of selected component (if applicable):
0.8.7-5.el6.1dan

How reproducible:
Every time

Steps to Reproduce:
1. Enter system BIOS and disable the "Execute Disable" setting (typically under processor)
2. Reboot to RHEL and run virsh capabilities
3. Examine output
4. Toggle flag in BIOS and repeat
  
Actual results:

System is shown and treated as a Pentium3. This is a bad thing.

Expected results:
System should be identified as a Nehalem

Additional info:

John Cooper helped track this down to libvirt using the "nx" flag as one of the identifiers for a Nehalem. When disabled in the BIOS, the nx flag does not get set at the OS layer (see flag dump below). As this flag is controlled via BIOS we may want to consider dropping it from the check unless it is crucial to identifying some supported Nehalem functionality. (But in that case we probably want to detect a Nehalem and say that the setting needs to be changed in the BIOS.)


(nestled between the syscall and rdtscp flags)

Disabled
flags        : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid

Enabled
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm tpr_shadow vnmi flexpriority ept vpid
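The difference between the two dumps can be confirmed mechanically. The following is a minimal Python sketch, assuming the input is text in the /proc/cpuinfo "flags" format shown above; the function names parse_cpu_flags and missing_flags are made up for this illustration, and the sample strings are shortened versions of the dumps, not full output.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(disabled_text, enabled_text):
    """Flags present with the BIOS setting enabled but absent when it is disabled."""
    return parse_cpu_flags(enabled_text) - parse_cpu_flags(disabled_text)


if __name__ == "__main__":
    # Shortened samples modeled on the two dumps above (assumption: same format).
    disabled = "flags        : fpu vme pbe syscall rdtscp lm sse4_2 popcnt"
    enabled = "flags\t\t: fpu vme pbe syscall nx rdtscp lm sse4_2 popcnt"
    print(sorted(missing_flags(disabled, enabled)))  # prints ['nx']
```

On a live system the same functions could be fed the contents of /proc/cpuinfo captured before and after toggling the BIOS setting.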

Comment 2 john cooper 2011-02-09 04:44:49 UTC
Taking this bug initially (even though it is filed against libvirt)
as I believe we first need to address rhel6.1 qemu "xd"
disposition from where this behavior inherits.

Comment 3 Daniel Berrangé 2011-02-10 14:54:00 UTC
libvirt has a database of CPU models that it allows to be used in guest XML. When libvirt reports what CPU a host has (in virsh capabilities) it does *not* pay any attention to the name of the host CPUs. It simply gets the list of CPUID flags in the host CPU and then tries to find a CPU in its database with the best matching set of CPUID flags.

The libvirt database for Nehalem includes 'nx' as a flag. So when you disable NX in the host BIOS, the Nehalem entry in libvirt's database will no longer match. The next best match in this case turned out to be Pentium3.
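The matching described above can be sketched roughly as follows. This is a simplified Python illustration, not libvirt's actual code: the feature sets are hypothetical and heavily abbreviated (they are not the contents of libvirt's database), and best_model is an invented name. It assumes a model is only a candidate when every one of its flags is present on the host.

```python
# Hypothetical, abbreviated feature sets for illustration only.
MODELS = {
    "pentium3": {"fpu", "pse", "tsc", "msr", "pae", "mce", "cx8", "sse"},
    "Nehalem": {
        "fpu", "pse", "tsc", "msr", "pae", "mce", "cx8", "sse", "sse2",
        "ssse3", "sse4_1", "sse4_2", "popcnt", "nx", "lm",
    },
}


def best_model(host_flags, models=MODELS):
    """Return (model name, leftover host flags) for the best-covering model.

    A model is a candidate only if all of its flags are on the host.
    Among candidates, the one with the most flags wins, because it leaves
    the fewest host flags unexplained. Returns (None, set()) if nothing fits.
    """
    candidates = {
        name: feats for name, feats in models.items() if feats <= host_flags
    }
    if not candidates:
        return None, set()
    name = max(candidates, key=lambda n: len(candidates[n]))
    return name, host_flags - candidates[name]


if __name__ == "__main__":
    host = MODELS["Nehalem"] | {"vmx"}
    print(best_model(host)[0])            # NX enabled in BIOS
    print(best_model(host - {"nx"})[0])   # NX disabled in BIOS
```

With these toy sets, removing the single nx flag disqualifies the Nehalem entry outright, so the much smaller pentium3 entry becomes the best remaining match.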

In the guest XML for CPU model, it should be possible to express

    <cpu>
      <arch>x86_64</arch>
      <model>Nehalem</model>
      <feature name='nx' policy='disable'/>
    </cpu>

to request the Nehalem model from libvirt's database, with the 'nx' flag blanked out.

virt-manager's 'Use host CPU' option would still end up setting Pentium3 though. To make this work, we would need the virsh capabilities to be able to report 'Nehalem', with the 'nx' flag removed. We don't have that ability yet.

I don't think any of this has a dependency on QEMU changes.

Comment 4 Jiri Denemark 2011-02-10 15:29:36 UTC
(In reply to comment #3)
> virt-manager's 'Use host CPU' option would still end up setting Pentium3
> though. To make this work, we would need the virsh capabilities to be able to
> report 'Nehalem', with the 'nx' flag removed. We don't have that ability yet.

This seems to be easily doable but I'm afraid it will fix some cases while
breaking others with current modeling.

The problem is that if we allow features to be disabled in host CPU
capabilities, the code will choose the best match between a real CPU and a
model according to a number of features needed to be added/removed to the
model to get all features of the real CPU. That is if we have ModelA and
ModelB and a real CPU which should be identified as ModelA with additional
features which make it closer to the more capable ModelB, it will be
mis-identified as ModelB. I'm afraid this would actually be quite common since
the CPU models from cpu_map.xml contain fewer features than their real-world
equivalents.
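
The mis-identification risk in the paragraph above can be illustrated with a toy example. This is a Python sketch under stated assumptions: ModelA and ModelB are the placeholder names from the comment, their feature sets are invented, and "best match" is taken to mean the smallest number of features that must be added to or removed from the model to reach the host's feature set.

```python
# Invented feature sets; ModelA and ModelB are placeholders, not real models.
MODELS = {
    "ModelA": {"sse", "sse2", "nx"},
    "ModelB": {"sse", "sse2", "nx", "ssse3", "sse4_1", "sse4_2", "aes"},
}


def edit_distance(model_feats, host_flags):
    """Count features to add to, plus features to disable in, the model."""
    to_add = host_flags - model_feats
    to_disable = model_feats - host_flags
    return len(to_add) + len(to_disable)


def closest_model(host_flags, models=MODELS):
    """Pick the model needing the fewest feature additions and removals."""
    return min(models, key=lambda name: edit_distance(models[name], host_flags))


if __name__ == "__main__":
    # A real "ModelA" CPU that has extra features the sparse model omits.
    host = {"sse", "sse2", "nx", "ssse3", "sse4_1", "sse4_2"}
    print(edit_distance(MODELS["ModelA"], host))  # 3 features to add
    print(edit_distance(MODELS["ModelB"], host))  # 1 feature to disable
    print(closest_model(host))                    # ModelB
```

Because the sparse ModelA definition needs three additions while ModelB needs only one removal, the host is reported as ModelB even though it lacks one of ModelB's features.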

To fix both cases, we would need to extend cpu_map.xml to contain more data
about each model which could be used for matching models with real CPUs.

Comment 5 john cooper 2011-02-14 17:40:41 UTC
(In reply to comment #0)

> When disabled in the BIOS, the nx flag does not get
> set at the OS layer (see flag dump below). As this flag is controlled via BIOS
> we may want to consider dropping it from the check unless it is crucial to
> identifying some supported Nehalem functionality. (but in that case we
> probably want to detect a Nehalem but say that setting needs to get
> changed in the BIOS)

Given the feedback from those with better market karma,
Microsoft requires the user to enable "xd" in the BIOS in
order for its Hyper-V to boot:

    http://www.microsoft.com/hyper-v-server/en/us/system-requirements.aspx
    http://social.technet.microsoft.com/Forums/en-US/winserverhyperv/thread/5cf641c6-9467-4bcd-8331-1d8372ed40bb/

So for good or ill this arguably sets customer tolerance on this issue
assuming the cause of failure to launch is clearly communicated.

At the qemu CLI we do have a general facility to report when a
requested feature flag isn't present on the host.  Unfortunately
by default this requires the user to ask for the check -- it is
not currently enabled by default but really should be:

    https://bugzilla.redhat.com/show_bug.cgi?id=676631

So for now I think this can be closed as a doc issue in the scope of
6.1.  In doing so, are any libvirt changes required?

Comment 6 Dor Laor 2011-02-17 13:11:38 UTC
(In reply to comment #5)
> So for now I think this can be closed as a doc issue in the scope of
> 6.1.  In doing so, are any libvirt changes required?

Right, please add the tech notes and close it

Comment 7 john cooper 2011-02-18 16:30:53 UTC
Documentation issue, reassigning.  See comment #5 above
and subsequent pointers for reference.  Should fold into
the verbiage around the more comprehensive explanation
for case 664722.