Bug 1658407 - mode="host-model" VMs include broken "arch-facilities" flag name [qemu-kvm]
Summary: mode="host-model" VMs include broken "arch-facilities" flag name [qemu-kvm]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Assignee: Eduardo Habkost
QA Contact: Yumei Huang
URL:
Whiteboard:
Depends On:
Blocks: 1664792
 
Reported: 2018-12-12 00:55 UTC by Eduardo Habkost
Modified: 2019-08-06 12:42 UTC (History)
11 users

Fixed In Version: qemu-kvm-1.5.3-162.el7
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1658406
: 1664792 (view as bug list)
Environment:
Last Closed: 2019-08-06 12:41:48 UTC
Target Upstream Version:


Attachments


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2019:2078 None None None 2019-08-06 12:42:46 UTC

Description Eduardo Habkost 2018-12-12 00:55:07 UTC
Cloning BZ for eventual qemu-kvm changes.

+++ This bug was initially created as a clone of Bug #1658406 +++

Description of problem:

When using mode="host-model" on a 7.6 host, libvirt may inadvertently enable the "arch-facilities" feature in the domain XML.

"arch-facilities" was a feature that was never included in the 7.5 kernel, but was added to the CPU feature table in qemu-kvm.  This was not a problem in RHEL-7.5 because KVM never reported the feature on GET_SUPPORTED_CPUID.

The upstream kernel later added support for ARCH_CAPABILITIES. QEMU called the feature "arch-capabilities", but live-migration support for the feature isn't available yet (neither upstream nor in the RHEL qemu-kvm package).

Then the 7.6 kernel was updated, and ARCH_CAPABILITIES was added to GET_SUPPORTED_CPUID.  Now qemu-kvm reports arch-facilities as a supported feature, mode="host-model" will enable arch-facilities in the domain XML, and we have a VM that can't be safely live-migrated.


Version-Release number of selected component (if applicable):

libvirt-4.5.0-10.el7
qemu-kvm-1.5.3-160.el7
kernel-3.10.0-957.el7


Steps to Reproduce:
Create a VM with mode="host-model" on a host that has arch_capabilities in /proc/cpuinfo.
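As a quick way to tell whether a host is affected, one can check for the flag in /proc/cpuinfo. A minimal sketch; the helper name is ours, not part of any tool mentioned in this bug:

```python
import os

def host_has_arch_capabilities(cpuinfo_text: str) -> bool:
    """Return True if any 'flags' line of /proc/cpuinfo-style text
    lists the arch_capabilities CPU flag."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            if "arch_capabilities" in value.split():
                return True
    return False

if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        print(host_has_arch_capabilities(f.read()))
```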

Actual results:
"arch-facilities" is in the domain XML after the VM is started.

Expected results:
"arch-facilities" should never be enabled in a VM.


Additional info:
It will be difficult to fix this without breaking existing VM configurations, but fixing it as soon as possible in a 7.6.z update will ensure fewer customers have new VMs created with the broken arch-facilities CPU feature.

The problem was detected by code analysis while reviewing the patch for bug 1633150; I haven't reproduced it yet.


Proposed solution:
On the libvirt side: make mode="host-model" never enable arch-facilities.


Additional actions that are (or were) being considered:

On the QEMU side: removing the "arch-facilities" feature unconditionally.  This would prevent existing VMs using the "arch-facilities" feature from running.

On the QEMU side: keep "arch-facilities" working, but always report it as unavailable.  I'm not sure we want that.  This could break libvirt feature checks in the same way as removing the feature unconditionally.

On the QEMU side: make "+arch-facilities" a no-op.  This would keep existing VMs working, but may confuse a running guest.  I'm not sure this would benefit users.

On the QEMU side: make "+arch-facilities" a migration blocker.  Probably a good idea, to make sure customers update their configurations before trying to live migrate.
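The last option amounts to letting such a VM start but refusing live migration while the flag is set. A toy model of that policy in Python (illustrative only, with invented names; the actual fix lives in QEMU's C code):

```python
# Features QEMU still accepts for compatibility but will not migrate.
DEPRECATED_UNMIGRATABLE = {"arch-facilities"}

def check_migration(enabled_features):
    """Return (allowed, warnings): migration is refused while any
    deprecated, unmigratable CPU feature is enabled (toy policy model,
    not QEMU code)."""
    blockers = sorted(DEPRECATED_UNMIGRATABLE & set(enabled_features))
    warnings = ["WARNING: the %s CPU feature is deprecated and does not "
                "support live migration" % f for f in blockers]
    return (not blockers, warnings)
```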

Comment 3 Daniel Berrangé 2018-12-13 11:00:01 UTC
(In reply to Eduardo Habkost from comment #0)
> Cloning BZ for eventual qemu-kvm changes.

[snip]

> Additional actions that are (or were) being considered:
> 
> On the QEMU side: removing the "arch-facilities" feature unconditionally. 
> This would prevent existing VMs using the "arch-facilities" feature from
> running.
> 
> On the QEMU side: keep "arch-facilities" working, but always report it as
> unavailable.  I'm not sure we want that.  This could break libvirt feature
> checks in the same way as removing the feature unconditionally.
> 
> On the QEMU side: make "+arch-facilities" a no-op.  This would keep existing
> VMs working, but may confuse a running guest.  I'm not sure this would
> benefit users.

I don't think any of these three options are viable.  

IIUC arch-facilities exists to solve real world problems. Even if there's no immediate need to expose it to a guest, sooner or later I expect we'll have valid reason to expose it. Blocking it now would mean we'll have to unblock it again later creating more work & pain. 

> On the QEMU side: make "+arch-facilities" a migration blocker.  Probably a
> good idea, to make sure customers update their configurations before trying
> to live migrate.

This looks sensible - we should always try to protect users from silently breaking their guests across migration.

Comment 4 Eduardo Habkost 2018-12-13 15:23:30 UTC
(In reply to Daniel Berrange from comment #3)
> IIUC arch-facilities exists to solve real world problems. Even if there's no
> immediate need to expose it to a guest, sooner or later I expect we'll have
> valid reason to expose it. Blocking it now would mean we'll have to unblock
> it again later creating more work & pain. 

"arch-facilities" is a name that doesn't exist upstream and we must stop using it.
In the future we'll want to enable the new MSR, but code must use "arch-capabilities"
for that.

Comment 5 Daniel Berrangé 2018-12-13 15:31:06 UTC
(In reply to Eduardo Habkost from comment #4)
> (In reply to Daniel Berrange from comment #3)
> > IIUC arch-facilities exists to solve real world problems. Even if there's no
> > immediate need to expose it to a guest, sooner or later I expect we'll have
> > valid reason to expose it. Blocking it now would mean we'll have to unblock
> > it again later creating more work & pain. 
> 
> "arch-facilities" is a name that doesn't exist upstream and we must stop
> using it.
> In the future we'll want to enable the new MSR, but code must use
> "arch-capabilities"
> for that.

I expect from libvirt's POV they'll be identical. We can only have one name for a given CPUID bit. Currently we use 'arch-facilities' as the name, but we'll have to rename it to 'arch-capabilities' and do an XML parsing hack to cope with reading old XML files.
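The parsing hack described above could look roughly like this. A sketch using plain ElementTree, not libvirt's actual parser; the function name is ours:

```python
import xml.etree.ElementTree as ET

OLD_NAME = "arch-facilities"    # legacy downstream spelling
NEW_NAME = "arch-capabilities"  # upstream spelling

def canonicalize_cpu_features(domain_xml: str) -> str:
    """Rewrite the legacy feature name to its upstream spelling while
    reading old domain XML (sketch of the compatibility hack)."""
    root = ET.fromstring(domain_xml)
    for feature in root.iter("feature"):
        if feature.get("name") == OLD_NAME:
            feature.set("name", NEW_NAME)
    return ET.tostring(root, encoding="unicode")
```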

Comment 12 Miroslav Rezanina 2019-01-14 06:47:25 UTC
Fix included in qemu-kvm-1.5.3-162.el7

Comment 14 Denis V. Lunev 2019-03-26 10:41:35 UTC
For the record, we ran into this problem while performing migration from Virtuozzo 7.0.8 (based on qemu-kvm-rhev-2.10.0-21.el7_5.7.src.rpm) to Virtuozzo 7.0.10 (not yet released, based on qemu-kvm-rhev-2.12.0-18.el7_6.3.src.rpm), in the following form:


2019-03-25 09:57:56.544+0000: starting up libvirt version: 4.5.0, package: 10.vz7.5 (Virtuozzo (http://www.virtuozzo.com/support/), 2019-03-22-14:43:36, builder11.eng.sw.ru), qemu version: 2.12.0qemu-kvm-vz-2.12.0-18.6.3.vz7.14, kernel: 3.10.0-957.10.1.vz7.85.2, hostname: s142.qa.sw.ru
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=VM_4892377a,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-VM_4892377a/master-key.aes -machine pc-i440fx-vz7.8.0,accel=kvm,usb=off,dump-guest-core=off -cpu SandyBridge-IBRS,vme=on,ss=on,pcid=on,hypervisor=on,arat=off,tsc_adjust=on,stibp=on,xsaveopt=on,arch-facilities=off,ssbd=off,vmx=off,+kvmclock -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/NVRAM.dat,if=pflash,format=qcow2,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=1,cores=2,threads=1 -uuid ac7eed12-87bb-45f5-820b-da7f3f445465 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=30,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2019-03-25T09:57:52 -global kvm-pit.lost_tick_policy=discard -no-shutdown -boot strict=on -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x5 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/harddisk.hdd,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,discard=unmap,aio=native -device 'scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,product=Vz HARDDISK0,write-cache=on,serial=8ae69a5cc50f4598b023' -drive if=none,id=drive-scsi0-0-2-0,readonly=on,cache=none,discard=unmap,aio=native -device 'scsi-cd,bus=scsi0.0,channel=0,scsi-id=2,lun=0,drive=drive-scsi0-0-2-0,id=scsi0-0-2-0,bootindex=2,product=Vz CD-ROM1,write-cache=on' -drive file=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/harddisk1.hdd,format=qcow2,if=none,id=drive-scsi0-0-1-0,cache=none,discard=unmap,aio=native -device 'scsi-hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,drive=drive-scsi0-0-1-0,id=scsi0-0-1-0,bootindex=4,product=Vz 
HARDDISK2,write-cache=on,serial=c18f88ef5b8d4440980c' -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=32 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1c:42:98:4b:59,bus=pci.0,addr=0x3,bootindex=3 -chardev file,id=charserial0,path=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/serial.txt,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev file,id=charserial1,path=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/serial1.txt,append=on -device isa-serial,chardev=charserial1,id=serial1 -chardev socket,id=charchannel0,fd=33,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev socket,id=charchannel1,fd=34,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc '[::1]:0,websocket=5700' -device VGA,id=video0,vgamem_mb=32,bus=pci.0,addr=0x2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7,deflate-on-oom=on -device vmcoreinfo -d guest_errors,unimp -global isa-debugcon.iobase=0x402 -debugcon file:/var/log/libvirt/qemu/VM_4892377a.qdbg.log -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -device pvpanic,ioport=1285 -msg timestamp=on
2019-03-25 09:57:56.544+0000: Domain id=1 is tainted: high-privileges
2019-03-25 09:57:56.544+0000: Domain id=1 is tainted: custom-argv
2019-03-25T09:57:56.674265Z qemu-kvm: can't apply global SandyBridge-IBRS-x86_64-cpu.arch-facilities=off: Property '.arch-facilities' not found
2019-03-25 09:57:56.676+0000: shutting down, reason=failed

Thus qemu-kvm is also definitely affected, as libvirt can force this feature off.

Comment 15 Denis V. Lunev 2019-03-26 10:43:33 UTC
The relevant domain.xml section looks like the following:

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'>SandyBridge-IBRS</model>
    <topology sockets='1' cores='2' threads='1'/>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='stibp'/>
    <feature policy='require' name='xsaveopt'/>
    <feature policy='disable' name='arch-facilities'/>
    <feature policy='disable' name='ssbd'/>
    <feature policy='disable' name='arat'/>
    <feature policy='disable' name='vmx'/>
  </cpu>

Though this was not reproduced directly on RHEL, the feature could appear in this form in domain.xml.
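Until libvirt stops emitting it, a stale entry like this can be stripped from an affected domain XML before redefining the domain. A minimal sketch with ElementTree (the helper is hypothetical; in practice one would edit the config via 'virsh edit'):

```python
import xml.etree.ElementTree as ET

def drop_feature(domain_xml: str, name: str = "arch-facilities") -> str:
    """Remove every <feature> element with the given name from each
    <cpu> block (sketch; real configs would be edited via 'virsh edit')."""
    root = ET.fromstring(domain_xml)
    for cpu in root.iter("cpu"):
        for feature in [f for f in cpu.findall("feature")
                        if f.get("name") == name]:
            cpu.remove(feature)
    return ET.tostring(root, encoding="unicode")
```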

Comment 16 Eduardo Habkost 2019-03-29 19:59:19 UTC
(In reply to Denis V. Lunev from comment #15)
>     <feature policy='disable' name='arch-facilities'/>

The root cause is the inclusion of arch-facilities in the domain XML, which should never happen on a RHEL-7.5 host.  arch-facilities was never supposed to be enabled anywhere, and the original libvirt bug (bug 1658406) is supposed to fix that on the libvirt side.

Comment 17 Eduardo Habkost 2019-03-29 20:03:07 UTC
Oh, I just noticed that the XML says policy='disable'.  I'm not sure if this is reproducible on a RHEL-7 host.  Do you know if it's possible to get this specific domain XML snippet generated by libvirt on a RHEL host?

Comment 18 Denis V. Lunev 2019-03-29 20:14:35 UTC
Yep, I am not sure either.

For us this happens because software on top of libvirt asked for all supported features and put them into the domain.xml. The interfaces for this are available in RHEL, so if other similar software exists, the problem will reappear. The question is whether it exists. :)

We are going to drop the 'arch-facilities' option in libvirt itself in our next release rather than kludge QEMU.

Comment 19 Yumei Huang 2019-04-23 02:44:09 UTC
Verify:
qemu-kvm-1.5.3-164.el7
Guest kernel: 3.10.0-1040.el7.x86_64
Host kernel: 3.10.0-1040.el7.x86_64

Using a host with the "arch_capabilities" flag:

# lscpu
...
Flags: ... arch_capabilities


1. Boot the guest with -cpu Skylake-Server,+arch-facilities; a warning is printed:

(qemu) qemu-kvm: WARNING: the arch-facilities CPU feature is deprecated and does not support live migration


2. Boot the guest with -cpu host and check the flags in the guest; "arch_capabilities" is not present.

Comment 20 Yumei Huang 2019-04-23 02:51:21 UTC
BTW, I can reproduce the issue in comment 14 with qemu-kvm-rhev-2.12.0-26.el7.

# /usr/libexec/qemu-kvm -cpu Skylake-Server-IBRS,arch-facilities=off
qemu-kvm: can't apply global Skylake-Server-IBRS-x86_64-cpu.arch-facilities=off: Property '.arch-facilities' not found

Comment 21 Denis V. Lunev 2019-04-23 14:43:53 UTC
Yes, but this is most likely an uninteresting case for Red Hat. The real question was whether it is possible to get a domain.xml like the one I received, with the feature disabled. Eduardo thinks the answer is 'no', so the rest is of purely academic interest.

Comment 23 errata-xmlrpc 2019-08-06 12:41:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2078

