Bug 1485022
| Summary: | Guest CPU in an offline snapshot changes from host-model to custom | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Strahil Nikolov <hunter86_bg> | ||||
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Luyao Huang <lhuang> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 7.4 | CC: | bugzilla, dhill, dyuan, hhan, hunter86_bg, jsuchane, juzhou, kchamart, kuwei, lmen, mxie, rbalakri, tzheng, xiaodwan, xuzhang, yalzhang | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | x86_64 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | libvirt-3.8.0-1.el7 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2018-04-10 10:55:31 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Bug Depends On: | |||||||
| Bug Blocks: | 1199452 | ||||||
| Attachments: |
Moving to libvirt since this isn't a virt-manager issue.

Update: Tried again with CPU model "Opteron_G5" and the issue happened again (the VM is newly built).

How to reproduce:
1. Gracefully shut down the VM
2. Create a snapshot
3. Run the selected snapshot

Version of libvirt and accompanying software:
fence-virtd-libvirt-0.3.2-12.el7.x86_64
libvirt-3.2.0-14.el7_4.3.x86_64
libvirt-client-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-config-network-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-config-nwfilter-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-interface-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-lxc-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-network-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-nodedev-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-nwfilter-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-qemu-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-secret-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-core-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-disk-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-gluster-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-iscsi-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-logical-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-mpath-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-rbd-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-driver-storage-scsi-3.2.0-14.el7_4.3.x86_64
libvirt-daemon-kvm-3.2.0-14.el7_4.3.x86_64
libvirt-gconfig-1.0.0-1.el7.x86_64
libvirt-glib-1.0.0-1.el7.x86_64
libvirt-gobject-1.0.0-1.el7.x86_64
libvirt-libs-3.2.0-14.el7_4.3.x86_64
libvirt-python-3.2.0-3.el7.x86_64

Workaround: select another CPU model, apply, and then return to the original one.

A simple reproducer where the guest CPU model is changed from 'host-model' to 'custom', after an offline, internal snapshot is created:
Check the current CPU mode on the offline guest:
$ virsh dumpxml cvm1 | grep host-model
<cpu mode='host-model' check='partial'>
Create an offline, internal snapshot:
$ virsh snapshot-create-as cvm1 offline-int2
Domain snapshot offline-int2 created
Check the snapshot metadata for what CPU mode it has (it now has
'custom'):
$ virsh snapshot-dumpxml cvm1 offline-int2 | grep custom
<cpu mode='custom' match='exact' check='partial'>
Start the guest:
$ virsh start cvm1
Domain cvm1 started
Now check for the 'host-model' CPU mode, it is no longer present:
$ virsh dumpxml cvm1 | grep host-model
$ echo $?
1
Instead, you see the 'custom' CPU mode:
$ virsh dumpxml cvm1 | grep custom -A14
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>Haswell-noTSX</model>
<vendor>Intel</vendor>
<feature policy='require' name='vme'/>
<feature policy='require' name='ss'/>
<feature policy='require' name='vmx'/>
<feature policy='require' name='f16c'/>
<feature policy='require' name='rdrand'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='arat'/>
<feature policy='require' name='tsc_adjust'/>
<feature policy='require' name='xsaveopt'/>
<feature policy='require' name='pdpe1gb'/>
<feature policy='require' name='abm'/>
</cpu>
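
The grep checks in the reproducer above can be folded into one small helper. This is only a sketch: cpu_mode is a hypothetical name, and it assumes the XML uses single-quoted attributes with the <cpu> opening tag on one line, as in the output pasted in this comment.

```shell
#!/bin/sh
# Hypothetical helper (not part of virsh): print the value of the mode
# attribute of the first <cpu ...> element in the XML read from stdin.
# Prints nothing when no such element is present.
cpu_mode() {
    sed -n "s/.*<cpu[^>]* mode='\([^']*\)'.*/\1/p" | head -n 1
}

# Possible usage with the guest and snapshot names from the reproducer:
#   virsh snapshot-dumpxml cvm1 offline-int2 | cpu_mode
```

On an affected libvirt the usage line would be expected to print custom for the offline snapshot, where host-model is what the inactive guest was defined with.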
(In reply to Kashyap Chamarthy from comment #6)
> A simple reproducer where the guest CPU model is changed from 'host-model'
> to 'custom', after an offline, internal snapshot is created:
>
> Check the current CPU mode on the offline guest:
>
> $ virsh dumpxml cvm1 | grep host-model
> <cpu mode='host-model' check='partial'>
>
> Create an offline, internal snapshot:
>
> $ virsh snapshot-create-as cvm1 offline-int2
> Domain snapshot offline-int2 created
>
> Check the snapshot metadata for what CPU mode it has (it now has
> 'custom'):
>
> $ virsh snapshot-dumpxml cvm1 offline-int2 | grep custom
> <cpu mode='custom' match='exact' check='partial'>

This is enough to reproduce the issue.

> Start the guest:
>
> $ virsh start cvm1
> Domain cvm1 started
>
> Now check for the 'host-model' CPU mode, it is no longer present:
>
> $ virsh dumpxml cvm1 | grep host-model
> $ echo $?
> 1

This is useless. A running domain will never have a host-model in its live XML.

This is similar to bug 1473516, but as there are two issues which need to be fixed, I'll keep both bugs open.

The first issue is the addition of unsupported or unknown CPU features when updating inactive guest CPU definition (virsh dumpxml --inactive --update-cpu). This will be covered by bug 1473516.

The second issue causes libvirt to update guest CPU when creating an offline snapshot, which is not expected. The domain is not running and thus we don't need to keep exact ABI of the guest CPU. Thus CPUs in offline snapshots should not be updated at all. And this issue is covered by this BZ.

Comment 6 shows an easy reproducer. To sum up:
1. define a domain with <cpu mode='host-model'/>
2. while the domain is NOT running, take its snapshot
   virsh snapshot-create-as $DOM snap
3. virsh snapshot-dumpxml $DOM snap

Once this bug is fixed, the XML returned in step 3 should contain a CPU with mode='host-model'.

Oops, there is a wrong bug number in comment 8. It should have referenced bug 1481309.

Patches sent upstream for review: https://www.redhat.com/archives/libvir-list/2017-September/msg00517.html

Both issues mentioned in comment 8 are fixed upstream by

commit 7e874326a3eca1233017ab91774d845b99869af1
Refs: v3.7.0-150-g7e874326a3
Author:     Jiri Denemark <jdenemar>
AuthorDate: Fri Jun 30 17:05:22 2017 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Thu Sep 21 15:27:39 2017 +0200

    qemu: Use correct host model for updating guest cpu

    When a user requested a domain XML description with
    VIR_DOMAIN_XML_UPDATE_CPU flag, libvirt would use the host CPU
    definition from host capabilities rather than the one which will
    actually be used once the domain is started.

    https://bugzilla.redhat.com/show_bug.cgi?id=1481309

    Signed-off-by: Jiri Denemark <jdenemar>

commit 06f75ff2cb292e2658b4f2f6949c700550006272
Refs: v3.7.0-151-g06f75ff2cb
Author:     Jiri Denemark <jdenemar>
AuthorDate: Fri Jun 30 16:55:20 2017 +0200
Commit:     Jiri Denemark <jdenemar>
CommitDate: Thu Sep 21 15:27:39 2017 +0200

    qemu: Don't update CPU when formatting live def

    Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU
    in domain's live definition and there's no need to update it every time
    we want to format the definition. The commit itself tried to address
    this in qemuDomainFormatXML, but forgot to fix qemuDomainDefFormatLive.
    Not to mention that masking a previously set flag is only acceptable if
    the flag was set by a public API user. Internally, libvirt should have
    never set the flag in the first place.

    https://bugzilla.redhat.com/show_bug.cgi?id=1485022

    Signed-off-by: Jiri Denemark <jdenemar>

*** Bug 1501341 has been marked as a duplicate of this bug. ***

Verify this bug with libvirt-3.9.0-6.el7.x86_64:
1. prepare a guest with host-model cpu mode:
# virsh dumpxml r7-mig
<cpu mode='host-model' check='partial'>
<model fallback='allow'/>
<numa>
<cell id='0' cpus='0-2' memory='524288' unit='KiB'/>
<cell id='1' cpus='3-5' memory='524288' unit='KiB'/>
</numa>
</cpu>
2. create snapshot:
# virsh snapshot-create-as r7-mig s1
Domain snapshot s1 created
# virsh snapshot-list r7-mig
Name Creation Time State
------------------------------------------------------------
s1 2017-12-20 03:42:58 -0500 shutoff
3. check the snapshot xml:
# virsh snapshot-dumpxml r7-mig s1 | grep -A5 "cpu mode"
<cpu mode='host-model' check='partial'>
<model fallback='allow'/>
<numa>
<cell id='0' cpus='0-2' memory='524288' unit='KiB'/>
<cell id='1' cpus='3-5' memory='524288' unit='KiB'/>
</numa>
4. revert snapshot and recheck the guest xml:
# virsh snapshot-revert r7-mig --current
# virsh dumpxml r7-mig |grep -A5 "cpu mode"
<cpu mode='host-model' check='partial'>
<model fallback='allow'/>
<numa>
<cell id='0' cpus='0-2' memory='524288' unit='KiB'/>
<cell id='1' cpus='3-5' memory='524288' unit='KiB'/>
</numa>
5. start guest and recheck xml:
# virsh start r7-mig
Domain r7-mig started
# virsh dumpxml r7-mig |grep -A20 "cpu mode"
<cpu mode='custom' match='exact' check='full'>
<model fallback='forbid'>Opteron_G5</model>
<vendor>AMD</vendor>
<feature policy='require' name='vme'/>
<feature policy='require' name='x2apic'/>
<feature policy='require' name='tsc-deadline'/>
<feature policy='require' name='hypervisor'/>
<feature policy='require' name='arat'/>
<feature policy='require' name='tsc_adjust'/>
<feature policy='require' name='bmi1'/>
<feature policy='require' name='mmxext'/>
<feature policy='require' name='fxsr_opt'/>
<feature policy='require' name='cmp_legacy'/>
<feature policy='require' name='cr8legacy'/>
<feature policy='require' name='osvw'/>
<feature policy='disable' name='svm'/>
<feature policy='disable' name='rdtscp'/>
<numa>
<cell id='0' cpus='0-2' memory='524288' unit='KiB'/>
<cell id='1' cpus='3-5' memory='524288' unit='KiB'/>
</numa>
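
Steps 3 and 4 above compare the <cpu> element by eye. A rough way to script that comparison is sketched below. The function names cpu_block and same_cpu are made up for this example, and the sketch assumes each XML dump contains a single <cpu> element; it is not something libvirt or virsh provides.

```shell
#!/bin/sh
# Hypothetical helper: print the <cpu> element of the XML on stdin with
# leading indentation stripped, so dumps nested at different depths compare
# equal. Handles both <cpu ...>...</cpu> and a self-closing <cpu .../>.
cpu_block() {
    awk '
        /<cpu[ >]/                  { inside = 1 }
        inside                      { sub(/^[ \t]+/, ""); print }
        /<\/cpu>/ || /<cpu[^>]*\/>/ { inside = 0 }
    '
}

# Succeed only when both files contain a <cpu> element and the two elements
# are textually identical after indentation is removed.
same_cpu() {
    a=$(cpu_block < "$1")
    b=$(cpu_block < "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Possible usage with the guest and snapshot from the steps above:
#   virsh dumpxml --inactive r7-mig > dom.xml
#   virsh snapshot-dumpxml r7-mig s1 > snap.xml
#   same_cpu dom.xml snap.xml && echo "snapshot kept the guest CPU definition"
```

The comparison is textual on purpose. For an offline snapshot the expectation from this bug is that the CPU definition is not updated at all, so any difference is worth looking at.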
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:0704
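
To check whether a host already carries the fix, the installed libvirt version can be compared with the "Fixed In Version" field (libvirt-3.8.0-1.el7). The sketch below assumes GNU sort with -V support. version_ge is a hypothetical helper name, and it compares only the upstream version, not the RPM release.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version $1 is greater than or equal to
# version $2. Relies on GNU sort -V (version sort).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Possible usage on an RPM-based host:
#   version_ge "$(rpm -q --qf '%{VERSION}' libvirt)" 3.8.0 && echo "fix should be present"
```

Because the release part is ignored, this check cannot tell apart two builds of the same upstream version.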
Created attachment 1317911
Virt-Manager screenshot

Description of problem:
After taking a snapshot, the VM cannot be started because 'invtsc' is automatically added to the CPU section of the configuration.

Version-Release number of selected component (if applicable):
libvirt-3.2.0-14.el7_4.2.x86_64
virt-manager-1.4.1-7.el7.noarch
virt-manager-common-1.4.1-7.el7.noarch

How reproducible:
Always

Steps to Reproduce:
1. Set CPU to "Copy host CPU configuration" (FX-8350 in my case)
2. Run and shut down the VM
3. Take a snapshot
4. Restore from snapshot
5. Start the VM

Actual results:
Cannot start the VM due to:

Error starting domain: unsupported configuration: host doesn't support invariant TSC

Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 88, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 124, in tmpcb
    callback(*args, **kwargs)
  File "/usr/share/virt-manager/virtManager/libvirtobject.py", line 83, in newfn
    ret = fn(self, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/domain.py", line 1489, in startup
    self._backend.create()
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1039, in create
    if ret == -1: raise libvirtError ('virDomainCreate() failed', dom=self)
libvirtError: unsupported configuration: host doesn't support invariant TSC

Expected results:
VM to start with the configuration before the snapshot. The following should not be added to the VM's XML:
<feature policy='require' name='invtsc'/>

Additional info:
Workaround: select another CPU type, save, and then restore to "Copy host CPU configuration"
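
A quick way to see whether the unwanted feature line has been added to a configuration is sketched below. requires_feature is a hypothetical helper name, and the match assumes the exact attribute order and single quotes shown in the expected results above.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the XML on stdin contains a
# <feature policy='require' name='NAME'/> line for the feature named in $1.
requires_feature() {
    grep -qF "<feature policy='require' name='$1'/>"
}

# Possible usage ($DOM stands for the guest name):
#   virsh dumpxml --inactive "$DOM" | requires_feature invtsc \
#       && echo "invtsc is required by the inactive configuration"
```

The helper uses a fixed-string match (grep -F), so a feature with a different policy, for example policy='disable', does not count as a hit.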