Bug 1658407
| Summary: | mode="host-model" VMs include broken "arch-facilities" flag name [qemu-kvm] | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Eduardo Habkost <ehabkost> |
| Component: | qemu-kvm | Assignee: | Eduardo Habkost <ehabkost> |
| Status: | CLOSED ERRATA | QA Contact: | Yumei Huang <yuhuang> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | high | | |
| Version: | 7.7 | CC: | ailan, berrange, chayang, den, ehabkost, jinzhao, juzhang, libvirt-maint, mtessun, virt-bugs, virt-maint |
| Target Milestone: | rc | Keywords: | Regression, ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | qemu-kvm-1.5.3-162.el7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1658406 | | |
| : | 1664792 | Environment: | |
| Last Closed: | 2019-08-06 12:41:48 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1664792 | | |
Description
Eduardo Habkost
2018-12-12 00:55:07 UTC
(In reply to Eduardo Habkost from comment #0)
> Cloning BZ for eventual qemu-kvm changes.
[snip]
> Additional actions that are (or were) being considered:
>
> On the QEMU side: removing the "arch-facilities" feature unconditionally.
> This would prevent existing VMs using the "arch-facilities" feature from
> running.
>
> On the QEMU side: keep "arch-facilities" working, but always report it as
> unavailable. I'm not sure we want that. This could break libvirt feature
> checks in the same way as removing the feature unconditionally.
>
> On the QEMU side: make "+arch-facilities" a no-op. This would keep existing
> VMs working, but may confuse a running guest. I'm not sure this would
> benefit users.

I don't think any of these three options are viable. IIUC arch-facilities exists to solve real world problems. Even if there's no immediate need to expose it to a guest, sooner or later I expect we'll have valid reason to expose it. Blocking it now would mean we'll have to unblock it again later, creating more work & pain.

> On the QEMU side: make "+arch-facilities" a migration blocker. Probably a
> good idea, to make sure customers update their configurations before trying
> to live migrate.

This looks sensible - we should always try to protect users from silently breaking their guests across migration.

(In reply to Daniel Berrange from comment #3)
> IIUC arch-facilities exists to solve real world problems. Even if there's no
> immediate need to expose it to a guest, sooner or later I expect we'll have
> valid reason to expose it. Blocking it now would mean we'll have to unblock
> it again later creating more work & pain.

"arch-facilities" is a name that doesn't exist upstream and we must stop using it. In the future we'll want to enable the new MSR, but code must use "arch-capabilities" for that.

(In reply to Eduardo Habkost from comment #4)
> (In reply to Daniel Berrange from comment #3)
> > IIUC arch-facilities exists to solve real world problems. Even if there's no
> > immediate need to expose it to a guest, sooner or later I expect we'll have
> > valid reason to expose it. Blocking it now would mean we'll have to unblock
> > it again later creating more work & pain.
>
> "arch-facilities" is a name that doesn't exist upstream and we must stop
> using it.
> In the future we'll want to enable the new MSR, but code must use
> "arch-capabilities"
> for that.

I expect from libvirt's POV they'll be identical. We can only have one name for a given CPUID bit. Currently we use 'arch-facilities' as the name, but we'll have to rename it to 'arch-capabilities' and do an XML parsing hack to cope with reading old XML files.

Fix included in qemu-kvm-1.5.3-162.el7

For the record, we hit this problem while performing migration from Virtuozzo 7.0.8 (qemu-kvm-rhev-2.10.0-21.el7_5.7.src.rpm based) to Virtuozzo 7.0.10 (not released yet, qemu-kvm-rhev-2.12.0-18.el7_6.3.src.rpm based) in the following form:

2019-03-25 09:57:56.544+0000: starting up libvirt version: 4.5.0, package: 10.vz7.5 (Virtuozzo (http://www.virtuozzo.com/support/), 2019-03-22-14:43:36, builder11.eng.sw.ru), qemu version: 2.12.0qemu-kvm-vz-2.12.0-18.6.3.vz7.14, kernel: 3.10.0-957.10.1.vz7.85.2, hostname: s142.qa.sw.ru
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=VM_4892377a,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-VM_4892377a/master-key.aes -machine pc-i440fx-vz7.8.0,accel=kvm,usb=off,dump-guest-core=off -cpu SandyBridge-IBRS,vme=on,ss=on,pcid=on,hypervisor=on,arat=off,tsc_adjust=on,stibp=on,xsaveopt=on,arch-facilities=off,ssbd=off,vmx=off,+kvmclock -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/NVRAM.dat,if=pflash,format=qcow2,unit=1 -m 2048 -realtime mlock=off -smp 2,sockets=1,cores=2,threads=1 -uuid ac7eed12-87bb-45f5-820b-da7f3f445465 -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=30,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=2019-03-25T09:57:52 -global kvm-pit.lost_tick_policy=discard -no-shutdown -boot strict=on -device nec-usb-xhci,id=usb,bus=pci.0,addr=0x5 -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x4 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/harddisk.hdd,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,discard=unmap,aio=native -device 'scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1,product=Vz HARDDISK0,write-cache=on,serial=8ae69a5cc50f4598b023' -drive if=none,id=drive-scsi0-0-2-0,readonly=on,cache=none,discard=unmap,aio=native -device 'scsi-cd,bus=scsi0.0,channel=0,scsi-id=2,lun=0,drive=drive-scsi0-0-2-0,id=scsi0-0-2-0,bootindex=2,product=Vz CD-ROM1,write-cache=on' -drive file=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/harddisk1.hdd,format=qcow2,if=none,id=drive-scsi0-0-1-0,cache=none,discard=unmap,aio=native -device 'scsi-hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,drive=drive-scsi0-0-1-0,id=scsi0-0-1-0,bootindex=4,product=Vz HARDDISK2,write-cache=on,serial=c18f88ef5b8d4440980c' -netdev tap,fd=31,id=hostnet0,vhost=on,vhostfd=32 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1c:42:98:4b:59,bus=pci.0,addr=0x3,bootindex=3 -chardev file,id=charserial0,path=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/serial.txt,append=on -device isa-serial,chardev=charserial0,id=serial0 -chardev file,id=charserial1,path=/vz/vmprivate/ac7eed12-87bb-45f5-820b-da7f3f445465/serial1.txt,append=on -device isa-serial,chardev=charserial1,id=serial1 -chardev socket,id=charchannel0,fd=33,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev socket,id=charchannel1,fd=34,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=org.qemu.guest_agent.1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc '[::1]:0,websocket=5700' -device VGA,id=video0,vgamem_mb=32,bus=pci.0,addr=0x2 -incoming defer -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7,deflate-on-oom=on -device vmcoreinfo -d guest_errors,unimp -global isa-debugcon.iobase=0x402 -debugcon file:/var/log/libvirt/qemu/VM_4892377a.qdbg.log -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -device pvpanic,ioport=1285 -msg timestamp=on
2019-03-25 09:57:56.544+0000: Domain id=1 is tainted: high-privileges
2019-03-25 09:57:56.544+0000: Domain id=1 is tainted: custom-argv
2019-03-25T09:57:56.674265Z qemu-kvm: can't apply global SandyBridge-IBRS-x86_64-cpu.arch-facilities=off: Property '.arch-facilities' not found
2019-03-25 09:57:56.676+0000: shutting down, reason=failed

Thus qemu-kvm is also definitely affected, as libvirt could force this feature to off. The relevant domain.xml section looks like the following:
<cpu mode='host-model' check='partial'>
<model fallback='allow'>SandyBridge-IBRS</model>
<topology sockets='1' cores='2' threads='1'/>
<feature policy='require' name='vme'/>
<feature policy='require' name='ss'/>
<feature policy='require' name='pcid'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='tsc_adjust'/>
<feature policy='require' name='stibp'/>
<feature policy='require' name='xsaveopt'/>
<feature policy='disable' name='arch-facilities'/>
<feature policy='disable' name='ssbd'/>
<feature policy='disable' name='arat'/>
<feature policy='disable' name='vmx'/>
</cpu>
Though this has not been reproduced directly on RH, the feature could appear in this form in domain.xml.
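
To check whether a given domain.xml carries the flag in this form, something like the following could be used. This is only a minimal sketch using Python's standard XML parser, not anything shipped with libvirt or QEMU; `find_deprecated_features` is a hypothetical helper name, and the embedded XML is a shortened copy of the snippet above.

```python
import xml.etree.ElementTree as ET

# Feature name taken from this bug report.
DEPRECATED_FEATURE = "arch-facilities"


def find_deprecated_features(domain_xml: str, name: str = DEPRECATED_FEATURE):
    """Return the policy of every <feature> under <cpu> whose name matches.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(domain_xml)
    # Accept either a full <domain> document or a bare <cpu> element.
    cpus = [root] if root.tag == "cpu" else root.findall(".//cpu")
    policies = []
    for cpu in cpus:
        for feature in cpu.findall("feature"):
            if feature.get("name") == name:
                policies.append(feature.get("policy"))
    return policies


# Shortened copy of the <cpu> section quoted in this report.
cpu_xml = """
<cpu mode='host-model' check='partial'>
  <model fallback='allow'>SandyBridge-IBRS</model>
  <feature policy='require' name='vme'/>
  <feature policy='disable' name='arch-facilities'/>
  <feature policy='disable' name='ssbd'/>
</cpu>
"""

print(find_deprecated_features(cpu_xml))
```

An empty list means the name does not appear in the CPU definition; any returned policy, including 'disable', means the name will be passed on to QEMU.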
(In reply to Denis V. Lunev from comment #15)
> <feature policy='disable' name='arch-facilities'/>

The root cause is the inclusion of arch-facilities in the domain XML, which should never happen on a RHEL-7.5 host. arch-facilities was never supposed to be enabled anywhere, and the original libvirt bug (bug 1658406) is supposed to fix that on the libvirt side.

Oh, I just noticed that the XML says policy='disable'. I'm not sure if this is reproducible on a RHEL-7 host. Do you know if it's possible to get this specific domain XML snippet generated by libvirt on a RHEL host?

Yep, I am not sure either. For us this happens because software on top of libvirt queried all supported features and wrote them into the domain.xml. Interfaces for this are available in RH. Thus, if other similar software exists, the problem will reappear. The question is whether it exists :) We are going to drop the 'arch-facilities' option in libvirt itself in our next release rather than kludge QEMU.

Verify:
qemu-kvm-1.5.3-164.el7
Guest kernel: 3.10.0-1040.el7.x86_64
Host kernel: 3.10.0-1040.el7.x86_64

Using a host with the "arch_capabilities" flag:
# lscpu
...
Flags: ..arch_capabilities

1. Boot guest with -cpu Skylake-Server,+arch-facilities, get a warning.
(qemu) qemu-kvm: WARNING: the arch-facilities CPU feature is deprecated and does not support live migration

2. Boot guest with -cpu host, check flags in guest, no "arch_capabilities".

BTW, I can reproduce the issue in comment 14 with qemu-kvm-rhev-2.12.0-26.el7.
# /usr/libexec/qemu-kvm -cpu Skylake-Server-IBRS,arch-facilities=off
qemu-kvm: can't apply global Skylake-Server-IBRS-x86_64-cpu.arch-facilities=off: Property '.arch-facilities' not found

Yes, but this is most likely an uninteresting case for Red Hat. The real question was whether it is possible to get a domain.xml like the one I received, with the "disabled" feature, or not. Eduardo thinks that the answer is 'no', so the rest is of purely academic interest.
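
Both reproductions above differ only in how the name appears in the -cpu value ("+arch-facilities" versus "arch-facilities=off"). A rough sketch of spotting either spelling in a -cpu string follows. It assumes the simple comma-separated form seen in this report, and `deprecated_cpu_flags` is a hypothetical helper, not a QEMU or libvirt interface.

```python
# Feature name taken from this bug report.
DEPRECATED_FEATURE = "arch-facilities"


def deprecated_cpu_flags(cpu_option: str, name: str = DEPRECATED_FEATURE):
    """Return the feature tokens in a -cpu value that refer to `name`.

    Hypothetical helper; assumes "model,feat=on,+feat,..." syntax only.
    """
    # The first comma-separated field is the CPU model, the rest are features.
    _model, *features = cpu_option.split(",")
    hits = []
    for token in features:
        # Drop a legacy "+"/"-" prefix and any "=on"/"=off" suffix.
        bare = token.lstrip("+-").split("=", 1)[0]
        if bare == name:
            hits.append(token)
    return hits


print(deprecated_cpu_flags("Skylake-Server,+arch-facilities"))
print(deprecated_cpu_flags("Skylake-Server-IBRS,arch-facilities=off"))
print(deprecated_cpu_flags("host"))
```

Matching on the bare name rather than on the requested value reflects the point made in this thread: a build that no longer knows the property rejects it even when the request is to turn it off.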
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2078