Bug 1508549
| Summary: | Post libvirt upgrade to 3.2.0-14, migration fails with -- "can't apply global Haswell-noTSX-x86_64-cpu.cmt=off: Property '.cmt' not found" [rhel-7.4.z] | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 7 | Reporter: | Oneata Mircea Teodor <toneata> |
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
| Status: | CLOSED ERRATA | QA Contact: | zhe peng <zpeng> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 7.2 | CC: | aglotov, anrussel, berrange, chhu, dasmith, dhill, dvd, dyuan, eglynn, jbryant, jdenemar, jherrman, jishao, kchamart, lhuang, mkalinin, pneedle, rbalakri, rbryant, rhodain, sbauza, sferdjao, sgordon, smykhail, srevivo, thomas.oulevey, toneata, vromanso, xuzhang, zpeng |
| Target Milestone: | rc | Keywords: | ZStream |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-3.2.0-14.el7_4.4 | Doc Type: | Bug Fix |
| Doc Text: | Previously, the libvirt service in some cases added the "cmt" CPU feature incompatible with the QEMU emulator to KVM guest virtual machines with CPU set to "host-model". As a consequence, migrating or restoring these guests failed. With this update, libvirt no longer adds "cmt" to domain features and automatically removes "cmt" from guest configuration if present. As a result, the affected guests can be migrated and restored correctly. | Story Points: | --- |
| Clone Of: | 1495171 | Environment: | |
| Last Closed: | 2017-11-30 16:06:34 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1495171 | ||
| Bug Blocks: | |||
|
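The Doc Text above says that libvirt now removes "cmt" from the guest configuration if it is present. The following is only a minimal sketch of that idea applied to a `<cpu>` element, not libvirt's actual implementation. The function name `drop_cpu_feature` and the sample XML (including its `cmt` line) are hypothetical and chosen for illustration.

```python
import xml.etree.ElementTree as ET


def drop_cpu_feature(cpu_xml: str, name: str = "cmt") -> str:
    """Return the <cpu> XML with every <feature name=NAME/> child removed.

    Hypothetical helper for illustration; libvirt does this internally in C.
    """
    cpu = ET.fromstring(cpu_xml)
    # findall() returns a list, so removing children while looping is safe.
    for feature in cpu.findall("feature"):
        if feature.get("name") == name:
            cpu.remove(feature)
    return ET.tostring(cpu, encoding="unicode")


# Made-up sample: a custom CPU definition that still carries "cmt".
SAMPLE = """
<cpu mode='custom' match='exact'>
  <model fallback='allow'>Haswell-noTSX</model>
  <feature policy='require' name='vme'/>
  <feature policy='require' name='cmt'/>
</cpu>
"""

cleaned = drop_cpu_feature(SAMPLE)
print(cleaned)
```

All other features and the model are left untouched, which mirrors the intent described in the Doc Text: only the feature QEMU does not know about is dropped.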
Description
Oneata Mircea Teodor
2017-11-01 16:14:29 UTC
Verified with builds:
libvirt-3.2.0-14.el7_4.4.x86_64
qemu-kvm-rhev-2.9.0-16.el7_4.10.x86_64
Steps:
1. Prepare a host that has the cmt CPU feature, with builds:
libvirt-1.2.17-13.el7_2.6.x86_64
qemu-kvm-rhev-2.3.0-31.el7_2.19.x86_64
2. Create a guest with CPU mode 'host-model':
....
<cpu mode='host-model'>
<model fallback='allow'>Haswell-noTSX</model>
<vendor>Intel</vendor>
<topology sockets='2' cores='1' threads='1'/>
</cpu>
....
3. Check the QEMU command line:
....
/usr/libexec/qemu-kvm -name rhel7.2 -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Haswell-noTSX,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+dca,+pdcm,+xtpr,+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1
....
4. Upgrade libvirt to libvirt-3.2.0-14.el7_4.4.x86_64.
5. Check the guest XML file:
....
<cpu mode='custom' match='exact' check='full'>
<model fallback='allow'>Haswell-noTSX</model>
<vendor>Intel</vendor>
<topology sockets='2' cores='1' threads='1'/>
<feature policy='require' name='vme'/>
<feature policy='disable' name='ds'/>
<feature policy='disable' name='acpi'/>
<feature policy='require' name='ss'/>
<feature policy='disable' name='ht'/>
<feature policy='disable' name='tm'/>
<feature policy='disable' name='pbe'/>
<feature policy='disable' name='dtes64'/>
<feature policy='disable' name='monitor'/>
<feature policy='disable' name='ds_cpl'/>
<feature policy='disable' name='vmx'/>
<feature policy='disable' name='smx'/>
<feature policy='disable' name='est'/>
<feature policy='disable' name='tm2'/>
<feature policy='disable' name='xtpr'/>
<feature policy='disable' name='pdcm'/>
<feature policy='disable' name='dca'/>
<feature policy='disable' name='osxsave'/>
<feature policy='require' name='f16c'/>
<feature policy='require' name='rdrand'/>
<feature policy='disable' name='arat'/>
<feature policy='disable' name='tsc_adjust'/>
<feature policy='require' name='xsaveopt'/>
<feature policy='require' name='pdpe1gb'/>
<feature policy='require' name='abm'/>
<feature policy='require' name='hypervisor'/>
</cpu>
....
6. Migrate to another host (libvirt-3.2.0-14.el7_4.4.x86_64):
# virsh migrate --live rhel7.2 qemu+ssh://$target_host/system --verbose
Migration: [100 %]
We still need to do some regression testing for this; after that, I will change the status to verified.
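The check in step 3 can also be scripted. The sketch below splits a QEMU `-cpu` argument into the model name and the `+feature` / `-feature` flags, so that the presence of `cmt` can be tested directly. It assumes only the `+`/`-` flag syntax seen in the command line above (not the `feature=on|off` form), and `parse_cpu_arg` is a hypothetical helper name.

```python
def parse_cpu_arg(arg: str):
    """Split a QEMU -cpu argument into (model, enabled, disabled).

    Only handles the "+feat" / "-feat" syntax; other options are ignored.
    """
    model, *flags = arg.split(",")
    enabled = [f[1:] for f in flags if f.startswith("+")]
    disabled = [f[1:] for f in flags if f.startswith("-")]
    return model, enabled, disabled


# The -cpu value from the command line in step 3.
CPU_ARG = (
    "Haswell-noTSX,+abm,+pdpe1gb,+rdrand,+f16c,+osxsave,+dca,+pdcm,+xtpr,"
    "+tm2,+est,+smx,+vmx,+ds_cpl,+monitor,+dtes64,+pbe,+tm,+ht,+ss,+acpi,"
    "+ds,+vme"
)

model, enabled, disabled = parse_cpu_arg(CPU_ARG)
print(model, "cmt" in enabled)
```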
Hi Jirka,
I hit a problem when testing this bug with a custom CPU model:
1. Prepare a guest with a custom CPU model on a machine that has the cmt flag:
# virsh dumpxml mig1
<cpu mode='custom' match='minimum'>
<model fallback='allow'>Haswell-noTSX</model>
<feature policy='disable' name='invtsc'/>
</cpu>
2. Start the guest:
# virsh start mig1
Domain mig1 started
3. Update libvirt from libvirt-1.2.17-13.el7_2.6.x86_64 to libvirt-3.2.0-14.el7_4.4.x86_64.
4. Dump the migratable XML:
# virsh dumpxml mig1 --migratable > /tmp/mig1.xml
5. Migrate to the target host (libvirt-3.2.0-14.el7_4.4.x86_64):
# virsh migrate mig1 qemu+ssh://$target_host/system --live --xml /tmp/mig1.xml
error: internal error: qemu unexpectedly closed the monitor: 2017-11-13T08:32:42.966170Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/3 (label charserial0)
2017-11-13T08:32:42.968141Z qemu-kvm: can't apply global Haswell-noTSX-x86_64-cpu.cmt=on: Property '.cmt' not found
Could you please check whether this problem is worth fixing in 7.4.z?
Thanks a lot for your reply.
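When triaging failures like the one above, it can help to pull the CPU type, property, and value out of the QEMU error line automatically. The following is a rough sketch that assumes the message format shown in this report; the pattern and the helper name `missing_cpu_property` are illustrative and are not part of libvirt or QEMU.

```python
import re

# Matches lines such as:
#   can't apply global Haswell-noTSX-x86_64-cpu.cmt=on: Property '.cmt' not found
PROP_ERROR = re.compile(
    r"can't apply global (?P<cpu_type>[\w-]+)\.(?P<prop>\w+)=(?P<value>\w+): "
    r"Property '\.(?P=prop)' not found"
)


def missing_cpu_property(log: str):
    """Return (cpu_type, property, value) for the first match, else None."""
    match = PROP_ERROR.search(log)
    if match is None:
        return None
    return match.group("cpu_type"), match.group("prop"), match.group("value")


LOG_LINE = (
    "2017-11-13T08:32:42.968141Z qemu-kvm: can't apply global "
    "Haswell-noTSX-x86_64-cpu.cmt=on: Property '.cmt' not found"
)

result = missing_cpu_property(LOG_LINE)
print(result)
```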
(In reply to Luyao Huang from comment #7)
> 3. update libvirt from libvirt-1.2.17-13.el7_2.6.x86_64 to
> libvirt-3.2.0-14.el7_4.4.x86_64
>
> 4. # virsh dumpxml mig1 --migratable > /tmp/mig1.xml

Are you sure you did this after upgrading libvirt and when the new libvirt was actually running? I can reproduce the issue only if I run this command *before* upgrading libvirt, which is expected.

(In reply to Luyao Huang from comment #9)
> In function qemuDomainFixupCPUs() the variable ret was init as 0, and there
> is no other place to change this value to -1, so this function will always
> return 0.

Yeah, this is clearly a bug. Although pretty harmless, since it would just cause qemuDomainFixupCPUs to report success in an unlikely OOM condition, but some other function later will likely report the error anyway. And even if no OOM was reported, the domain would just be left with the cmt feature and fixed next time.

(In reply to Jiri Denemark from comment #10)
> (In reply to Luyao Huang from comment #7)
> > 3. update libvirt from libvirt-1.2.17-13.el7_2.6.x86_64 to
> > libvirt-3.2.0-14.el7_4.4.x86_64
> >
> > 4. # virsh dumpxml mig1 --migratable > /tmp/mig1.xml
>
> Are you sure you did this after upgrading libvirt and when the new libvirt
> was actually running? I can reproduce the issue only if I run this command
> *before* upgrading libvirt, which is expected.

Yes, I did this after updating libvirt. I also found that this problem is not related to the --xml option; migration still fails after dropping it:

# virsh migrate mig1 qemu+ssh://$target_host/system --live
error: internal error: qemu unexpectedly closed the monitor: 2017-11-14T01:23:00.168327Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/3 (label charserial0)
2017-11-14T01:23:00.170506Z qemu-kvm: can't apply global Haswell-noTSX-x86_64-cpu.cmt=on: Property '.cmt' not found

> (In reply to Luyao Huang from comment #9)
> > In function qemuDomainFixupCPUs() the variable ret was init as 0, and there
> > is no other place to change this value to -1, so this function will always
> > return 0.
>
> Yeah, this is clearly a bug. Although pretty harmless, since it would just
> cause qemuDomainFixupCPUs to report success in an unlikely OOM condition, but
> some other function later will likely report the error anyway. And even if no
> OOM was reported, the domain would just be left with the cmt feature and
> fixed next time.

Okay, got it, thanks.

OK, I see the problem with <cpu mode='custom' match='minimum'> now. It should already be fixed upstream and it would require backporting at least 10 more patches. Since this kind of CPU configuration is likely rare (it shares the same issues we had with host-model, but we got no bug reports), I think it's better to not backport the fixes.

However, there seems to be an additional small issue with <cpu mode='custom' match='minimum'> when libvirt connects to a running domain started by old libvirt. So please, file a separate bug for this issue.

(In reply to Jiri Denemark from comment #14)
> OK, I see the problem with <cpu mode='custom' match='minimum'> now. It should
> already be fixed upstream and it would require backporting at least 10 more
> patches. Since this kind of CPU configuration is likely rare (it shares the
> same issues we had with host-model, but we got no bug reports), I think it's
> better to not backport the fixes.
>
> However, there seems to be an additional small issue with <cpu mode='custom'
> match='minimum'> when libvirt connects to a running domain started by old
> libvirt. So please, file a separate bug for this issue.

Okay, got it. I have retested this with libvirt-3.9.0-2.el7.x86_64 and didn't hit the issue.

Per comments 6, 14, and 16, moving this bug to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:3324