Bug 994364
Summary: | VIR_DOMAIN_XML_MIGRATABLE generates unmigratable XML | |||
---|---|---|---|---|
Product: | [Community] Virtualization Tools | Reporter: | Tiziano Müller <tm> | |
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> | |
Status: | CLOSED UPSTREAM | QA Contact: | ||
Severity: | unspecified | Docs Contact: | ||
Priority: | unspecified | |||
Version: | unspecified | CC: | acathrow, chaochin, crobinso, dallan, hannsj_uhl, jdenemar, mprivozn, sross, tm | |
Target Milestone: | --- | |||
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1076503 | Environment: | ||
Last Closed: | 2013-10-11 15:43:50 UTC | Type: | Bug | |
Bug Blocks: | 1076503, 1141838 | |||
Attachments: |
Description
Tiziano Müller
2013-08-07 06:42:20 UTC
Small update: if the CPU-model is indirectly specified by using "<cpu mode='host-model' match='exact'>", the error changes to the following:

error: unsupported configuration: Target CPU model SandyBridge does not match source (null)

If the error message is correct, it seems that the source definition used to compare the XML to be used in the migration is not the same as generated by "dumpxml --migratable" (or the corresponding API call).

I was initially able to work around this bug by using --security-info instead of --migratable, but this workaround stopped working as soon as I added a second virtio-serial device for the qemu-guest-agent. In that case I got the following when using an XML generated using --security-info:

Oct 3 17:07:18 foss-cloud-node-01 libvirtd: 6242: error : virDomainDeviceInfoCheckABIStability:12718 : unsupported configuration: Target device address type none does not match source pci

This was also reproduced with version 1.1.2.

Tiziano, is there a particular need you have that requires you to dump the XML and then specify it rather than just letting the migration pass the XML unchanged?

Michal, this isn't functionality I use, so I can't say how it's intended to work, but I think it should be added to the CI tests.

(In reply to Dave Allan from comment #4)
> Tiziano, is there a particular need you have that requires you to dump the
> XML and then specify it rather than just letting the migration pass the XML
> unchanged?

Yes, the spice port and the IP to bind spice to must be changed on migration. And it's our management interface which assigns the ports rather than libvirt auto-selecting it.

Tiziano, can you please attach the XML of the domain you're trying to migrate? I mean without the --migratable switch.

Dave, right. I'll update our test suite once I figure out where the problem is.

Created attachment 807462
XML of Test-VM, generated with "virsh dumpxml" without flags (thus "default")
This is the corresponding error I get when migrating:
Oct 4 09:06:35 foss-cloud-node-02 libvirtd: 31425: error : virDomainDeviceInfoCheckABIStability:12718 : unsupported configuration: Target device address type none does not match source pci
Created attachment 807463
XML of Test-VM, generated with "virsh dumpxml" with --migratable flag
Attaching the migratable XML as well for comparison
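For readers following along, one way to see where two such dumps differ is to compare their device elements. The following is a minimal Python sketch using only the standard library. The two XML snippets are made-up stand-ins, not the contents of attachments 807462 and 807463, and `device_signatures` and `only_in_first` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Made-up stand-ins for a default dump and a migratable dump.
# These are NOT the XML files attached to this bug.
DEFAULT_XML = """
<domain type='kvm'>
  <name>test-vm</name>
  <devices>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='usb' index='0'/>
    <controller type='virtio-serial' index='0'/>
  </devices>
</domain>
"""

MIGRATABLE_XML = """
<domain type='kvm'>
  <name>test-vm</name>
  <devices>
    <controller type='virtio-serial' index='0'/>
  </devices>
</domain>
"""


def device_signatures(domain_xml):
    """Return one (tag, sorted attribute pairs) tuple per direct child of <devices>."""
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        return set()
    return {(dev.tag, tuple(sorted(dev.attrib.items()))) for dev in devices}


def only_in_first(first_xml, second_xml):
    """Devices present in the first definition but missing from the second."""
    return device_signatures(first_xml) - device_signatures(second_xml)


if __name__ == "__main__":
    for tag, attrs in sorted(only_in_first(DEFAULT_XML, MIGRATABLE_XML)):
        print(tag, dict(attrs))
```

The comparison only looks at top-level device attributes, so it is a rough first pass; child elements such as addresses would need a deeper walk.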
Could you also turn on debug logs for destination libvirtd and try migrating without changing the XML? The thing is, the XML generated with virsh dumpxml --migratable is supposed to be exactly the same as the XML sent by libvirtd during migration. So by seeing that XML in the debug logs, we can check where the two XMLs differ.

Just a small hint on how to turn on and gather debug logs: http://wiki.libvirt.org/page/DebugLogs

Oops, I guess I know what it is. While the --migratable XML is supposed to be the same as what we send normally during migration, the xmlin definition is not checked against the migratable XML. It's checked against normal XML and thus the ABI check fails.

Created attachment 807477
libvirtd.log obtained as described on the DebugLogs page
Concerning the migration without providing an XML:
* I had to change the XML definition a bit: the IP for the spice server has to be changed to 0.0.0.0 for the migration to work. Apart from that, the definition given before was used
* The command used is: virsh migrate --live --p2p --tunnelled --persistent --undefinesource --compressed 59a2135b-d134-4cf1-b188-3cd72bc503dd qemu+tcp://10.1.130.13/system
* and the migration works perfectly
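The kind of XML change discussed in this thread (setting the spice listen address and port before migration) could be scripted roughly as follows. This is a sketch using Python's standard library only. The domain XML is a made-up fragment, `set_spice_listen` is a hypothetical helper, and the port value is purely illustrative.

```python
import xml.etree.ElementTree as ET

# Made-up domain fragment for illustration; not the attached Test-VM XML.
DOMAIN_XML = """
<domain type='kvm'>
  <name>test-vm</name>
  <devices>
    <graphics type='spice' port='5900' autoport='no' listen='10.1.130.12'>
      <listen type='address' address='10.1.130.12'/>
    </graphics>
  </devices>
</domain>
"""


def set_spice_listen(domain_xml, address, port):
    """Return a copy of the XML with every spice <graphics> bound to address:port."""
    root = ET.fromstring(domain_xml)
    for graphics in root.iterfind("./devices/graphics[@type='spice']"):
        graphics.set("port", str(port))
        graphics.set("autoport", "no")  # the port is assigned by the caller
        graphics.set("listen", address)
        # Keep the nested <listen> element consistent with the attribute.
        for listen in graphics.iterfind("listen[@type='address']"):
            listen.set("address", address)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_spice_listen(DOMAIN_XML, "0.0.0.0", 5901))
```

The returned string is what a management layer would then supply as the XML for the migration, which is the step that triggered the ABI check failure reported here.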
(In reply to Jiri Denemark from comment #12)
> Oops, I guess I know what it is. While the --migratable XML is supposed to
> be the same as what we sent normally during migration, the xmlin definition
> is not checked against the migratable XML. It's checked against normal XML
> and thus the ABI check fails.

This also explains why the _source_ CPU model is (null), see my comment 1 for some VMs.

Created attachment 807479
libvirtd.log obtained as described on the DebugLogs page
Sorry, I made a mistake when generating the first log: our code was still using the VIR_DOMAIN_XML_SECURE flag (which I used as a workaround).
Attaching the log for the migration with an XML generated using VIR_DOMAIN_XML_MIGRATABLE which shows the earlier failure (when checking the CPU type).
Just to make it clear what our cases are/were:
* Initially we had a VM config as attached, but without virtio-rng, without the second virtio-serial qemu-ga channel, and without CPU host-model. Migration then failed with the error given in the initial report when using an XML generated with VIR_DOMAIN_XML_MIGRATABLE, but worked with VIR_DOMAIN_XML_SECURE
* Then we added the virtio-rng, virtio-serial qemu-ga channel, and CPU host-model definitions. After that, the error we had seen earlier with VIR_DOMAIN_XML_MIGRATABLE also appeared with VIR_DOMAIN_XML_SECURE (which we had been using as a workaround)
* When migrating now with an XML generated using VIR_DOMAIN_XML_MIGRATABLE, we get the error about the non-matching CPU model
Patch proposed upstream: https://www.redhat.com/archives/libvir-list/2013-October/msg00322.html

Even though the previous patch got ACKed, after some thinking it seems we can do better: https://www.redhat.com/archives/libvir-list/2013-October/msg00477.html

So I've just pushed the patch upstream:

commit 7d704812b9c50cd3804dd1e7f9e2ea3e75fdc847
Author: Michal Privoznik <mprivozn>
AuthorDate: Thu Oct 10 10:53:56 2013 +0200
Commit: Michal Privoznik <mprivozn>
CommitDate: Fri Oct 11 10:31:35 2013 +0200

    qemu: Introduce qemuDomainDefCheckABIStability

    https://bugzilla.redhat.com/show_bug.cgi?id=994364

    Whenever we check for ABI stability, we have new xml (e.g. provided by
    user, or obtained from snapshot, whatever) which we compare to old xml
    and see if ABI won't break. However, if the new xml was produced via
    virDomainGetXMLDesc(..., VIR_DOMAIN_XML_MIGRATABLE) it lacks some
    devices, e.g. 'pci-root' controller. Hence, the ABI stability check
    fails even though it is stable. Moreover, we can't simply fix
    virDomainDefCheckABIStability because removing the correct devices is
    task for the driver. For instance, qemu driver wants to remove the usb
    controller too, while LXC driver doesn't. That's why we need special
    qemu wrapper over virDomainDefCheckABIStability which removes the
    correct devices from domain XML, produces MIGRATABLE xml and calls the
    check ABI stability function.

    Signed-off-by: Michal Privoznik <mprivozn>

Cole, do you think this is worth backporting onto maint branches and into Fedora? If not, then this bug can be moved to CLOSED NEXTRELEASE, right?

If it's a trivial and safe backport to maint we might as well do it (git checkout v1.1.3-maint && git cherry-pick -x <commit> && git push origin v1.1.3-maint) but if not I say we wait for someone in Fedora to actually complain. Either way this bug is CLOSED->UPSTREAM since it's in the upstream bug tracker

*** Bug 1075174 has been marked as a duplicate of this bug. ***

The bug may break OpenStack instance live migration on RHEL6.5/7.0. See https://bugs.launchpad.net/nova/+bug/1362929 |
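The idea in the commit message above (normalise both definitions to their migratable form in a driver-specific wrapper, then run the generic check) can be modelled in a few lines. The sketch below is a toy in Python, not libvirt code. All function names are hypothetical stand-ins, the device lists are invented, and the real check compares full domain definitions rather than lists of labels.

```python
# Toy model only. Device kinds that, per the commit message, the qemu driver
# leaves out of migratable XML; used here purely as labels.
QEMU_IMPLICIT = {"pci-root", "usb"}


def generic_check_abi_stability(src, dst):
    """Stand-in for the generic check: both sides must list the same devices."""
    return sorted(src) == sorted(dst)


def to_migratable(devices, implicit=QEMU_IMPLICIT):
    """Drop the devices this (toy) driver omits from migratable XML."""
    return [d for d in devices if d not in implicit]


def qemu_check_abi_stability(src, dst):
    """Stand-in for the driver wrapper: normalise both sides, then compare."""
    return generic_check_abi_stability(to_migratable(src), to_migratable(dst))


# Invented example: the running domain and the XML a user dumped as migratable.
running_domain = ["pci-root", "usb", "virtio-serial", "virtio-rng"]
user_supplied = to_migratable(running_domain)

if __name__ == "__main__":
    # Comparing the full definition with the migratable one reports a mismatch.
    print(generic_check_abi_stability(running_domain, user_supplied))  # False
    # Comparing like with like succeeds.
    print(qemu_check_abi_stability(running_domain, user_supplied))     # True
```

Keeping the set of omitted devices in the driver-level wrapper mirrors the reasoning in the commit message: which devices are implicit differs between drivers, so the generic check cannot decide that on its own.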