Bug 1474874 - libvirt qemu cache needs to be cleared to notice kvm module nested= setting change
Status: NEW
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 26
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Assigned To: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
Reported: 2017-07-25 10:30 EDT by jniederm
Modified: 2017-08-06 16:09 EDT
CC: 11 users
Attachments (Terms of Use):
log_and_xml_dump.tar.gz (4.33 KB, application/x-gzip), 2017-07-25 10:30 EDT, jniederm
Description jniederm 2017-07-25 10:30:36 EDT
Created attachment 1304262 [details]
log_and_xml_dump.tar.gz

Description of problem:
Nested virtualization doesn't work; the virtualized host doesn't get the 'vmx' CPU flag.

Version-Release number of selected component (if applicable):
libvirt-daemon-config-network-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-scsi-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-network-3.5.0-1.fc26.x86_64
libvirt-gconfig-1.0.0-2.fc26.x86_64
libvirt-libs-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-nodedev-3.5.0-1.fc26.x86_64
libvirt-glib-1.0.0-2.fc26.x86_64
libvirt-daemon-driver-storage-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-core-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-iscsi-3.5.0-1.fc26.x86_64
libvirt-daemon-kvm-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-rbd-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-nwfilter-3.5.0-1.fc26.x86_64
libvirt-gobject-1.0.0-2.fc26.x86_64
libvirt-daemon-driver-qemu-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-disk-3.5.0-1.fc26.x86_64
libvirt-daemon-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-logical-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-interface-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-secret-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-gluster-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-mpath-3.5.0-1.fc26.x86_64
libvirt-python-3.5.0-2.fc26.x86_64
libvirt-client-3.5.0-1.fc26.x86_64
libvirt-daemon-driver-storage-sheepdog-3.5.0-1.fc26.x86_64
qemu-img-2.9.0-7.fc26.x86_64
qemu-block-ssh-2.9.0-7.fc26.x86_64
qemu-system-x86-2.9.0-7.fc26.x86_64
qemu-common-2.9.0-7.fc26.x86_64
qemu-block-iscsi-2.9.0-7.fc26.x86_64
qemu-guest-agent-2.9.0-7.fc26.x86_64
qemu-block-rbd-2.9.0-7.fc26.x86_64
qemu-system-x86-core-2.9.0-7.fc26.x86_64
qemu-block-dmg-2.9.0-7.fc26.x86_64
ipxe-roms-qemu-20161108-2.gitb991c67.fc26.noarch
qemu-block-curl-2.9.0-7.fc26.x86_64
libvirt-daemon-driver-qemu-3.5.0-1.fc26.x86_64
qemu-kvm-2.9.0-7.fc26.x86_64
qemu-block-nfs-2.9.0-7.fc26.x86_64
qemu-block-gluster-2.9.0-7.fc26.x86_64


How reproducible:
100%

Steps to Reproduce:
1. Install virt-manager
2. Make sure virtualization is enabled (`grep vmx /proc/cpuinfo` finds something)
3. Make sure nested virtualization is enabled (`cat /sys/module/kvm_intel/parameters/nested` prints 'Y', https://fedoraproject.org/wiki/How_to_enable_nested_virtualization_in_KVM ) 
4. Create a VM in virt-manager, run it, install latest CentOS in it
5. Check that the CPU flag vmx is available inside the VM: `grep vmx /proc/cpuinfo`
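The checks in steps 2, 3 and 5 can be wrapped in a small helper. This is a sketch only; `has_flag` is a name introduced here, not part of any existing tool:

```shell
#!/bin/sh
# has_flag FILE FLAG: succeed if FLAG appears as a whole word
# in a cpuinfo-style file.
has_flag() {
    grep -qw "$2" "$1"
}

# On the host:
#   has_flag /proc/cpuinfo vmx && echo "VT-x present"
#   cat /sys/module/kvm_intel/parameters/nested   # 'Y' means nested is enabled
# Inside the guest (step 5):
#   has_flag /proc/cpuinfo vmx && echo "nested virt exposed"
```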

Actual results:
grep finds nothing

Expected results:
CPU of the VM has a 'vmx' flag

Additional info:
libvirt XML of stopped and running VM attached
Comment 1 Daniel Berrange 2017-07-25 10:35:15 EDT
The CPU model shown is

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Haswell-noTSX</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='f16c'/>
    <feature policy='require' name='rdrand'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='xsaveopt'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='abm'/>
  </cpu>


The Haswell-noTSX model does not include 'vmx', nor is it listed explicitly. So this is simply a configuration problem.

I don't know whether virt-manager explicitly leaves out vmx or not.
Comment 2 jniederm 2017-07-25 10:51:54 EDT
Is it really a problem of virt-manager when the cpu tag looks like

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

before the VM is started, and is changed to the block mentioned in comment 1 only while the VM is running? Or am I missing something?
Comment 3 jniederm 2017-07-25 11:05:05 EDT
It looks like replacing the block mentioned in comment 2 with

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Haswell-noTSX</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='f16c'/>
    <feature policy='require' name='rdrand'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='xsaveopt'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='abm'/>
  </cpu>

while the VM is down is a workaround (for Haswell-noTSX CPUs).
Comment 4 Pavel Hrdina 2017-07-25 11:11:24 EDT
Moving back to libvirt since this has nothing to do with virt-manager.  Libvirt is the one that replaces the

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

with

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Haswell-noTSX</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='f16c'/>
    <feature policy='require' name='rdrand'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='xsaveopt'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='require' name='abm'/>
  </cpu>
Comment 5 Pavel Hrdina 2017-07-25 11:19:23 EDT
What happens with libvirt is that you've probably started libvirtd before
nested virtualization was enabled.  To speed up libvirtd startup it caches
capabilities in "/var/cache/libvirt/qemu/capabilities/", and these cached
capabilities remain valid until you update/downgrade libvirt/qemu.

Even though you've enabled nested virtualization, the cached capabilities are
still considered valid because the libvirt/qemu binaries did not change, so
libvirt keeps using the cached capabilities in which the "vmx" feature is
marked as unavailable.

To fix this issue you can simply remove all files inside the cache folder
"/var/cache/libvirt/qemu/capabilities/" and restart libvirtd.  After this step
the:

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>
  </cpu>

should work as expected.
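The fix described above can be scripted. This is a sketch; `clear_qemu_caps_cache` is a helper name introduced here, and the systemd unit name `libvirtd` is an assumption:

```shell
#!/bin/sh
# clear_qemu_caps_cache [DIR]: remove cached QEMU capability files
# (but keep the directory itself). DIR defaults to the path from
# this comment.
clear_qemu_caps_cache() {
    dir="${1:-/var/cache/libvirt/qemu/capabilities}"
    rm -f "$dir"/*
}

# On a real host, as root:
#   clear_qemu_caps_cache
#   systemctl restart libvirtd   # assuming the unit is named libvirtd
```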
Comment 6 Petr Kotas 2017-07-25 11:43:06 EDT
(In reply to Pavel Hrdina from comment #5)
> To fix this issue you can simply remove all files inside the cache folder
> "/var/cache/libvirt/qemu/capabilities/" and restart libvirtd.

I had the same issue and can confirm this solution works; clearing the cache fixed it for me.
Comment 7 jniederm 2017-07-25 16:31:55 EDT
Deleting files in "/var/cache/libvirt/qemu/capabilities/" and restarting libvirtd as mentioned in comment 5 works for me as well.
Comment 8 Cole Robinson 2017-08-03 17:12:26 EDT
By 'enabling nested' are you talking about the /etc/modprobe.d/kvm.conf kvm_intel nested=1 bit? If so, this seems like a recurring problem that it would be nice to fix, but I'm not sure how we are going to know to recache CPU data when a module parameter changes...
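For reference, the kvm.conf bit mentioned above typically looks like this (standard modprobe.d syntax; it persists the option across reboots but only takes effect once the module is reloaded or the host is rebooted):

```
# /etc/modprobe.d/kvm.conf
options kvm_intel nested=1
```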
Comment 9 jniederm 2017-08-03 17:38:07 EDT
Hi Cole, yes, 'enabling nested' refers to re-inserting the kvm_intel module with the 'nested' bit set, provided the host CPU already has virtualization enabled in the BIOS. A solution could be to check '/sys/module/kvm_intel/parameters/nested' and invalidate the cache when it changes.

In any case it is worth documenting that one needs to delete the cache once the module is reinserted. I find that quite hard to anticipate.
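The invalidation idea suggested here could look roughly like the following. This is only a sketch of the logic (libvirt itself would do this internally in C); the function name and the state-file path are invented for illustration:

```shell
#!/bin/sh
# maybe_invalidate_cache [PARAM] [STATE] [CACHE]: remember the last-seen
# value of the nested parameter in STATE and clear CACHE when it changes.
maybe_invalidate_cache() {
    param="${1:-/sys/module/kvm_intel/parameters/nested}"
    state="${2:-/var/cache/libvirt/qemu/last-nested}"
    cache="${3:-/var/cache/libvirt/qemu/capabilities}"

    current=$(cat "$param" 2>/dev/null)
    previous=$(cat "$state" 2>/dev/null)

    if [ "$current" != "$previous" ]; then
        rm -f "$cache"/*                     # drop stale capability files
        printf '%s\n' "$current" > "$state"  # remember the new value
        echo "invalidated"
    fi
}
```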
