Bug 1546971
Summary: | Cannot modify vNIC profile of running VM | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Michael Burman <mburman> | ||||
Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> | ||||
Status: | CLOSED ERRATA | QA Contact: | chhu | ||||
Severity: | high | Docs Contact: | |||||
Priority: | high | ||||||
Version: | 7.5 | CC: | bugs, chhu, danken, dyuan, jdenemar, jferlan, jherrman, knoel, lhuang, mburman, michal.skrivanek, mkalinin, mtessun, qguo, sacpatil, salmy, yafu, ylavi | ||||
Target Milestone: | rc | Keywords: | Regression, ZStream | ||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | libvirt-4.3.0-1.el7 | Doc Type: | Bug Fix | ||||
Doc Text: |
Previously, the check for a non-unique device boot order did not properly handle updates of existing devices when a new device was attached to a guest. Consequently, updating any device with a specified boot order failed. With this update, the duplicity check detects correctly handles updates and ignores the original device, which avoids reporting false conflicts. As a result, updating a device with a boot order succeeds.
|
Story Points: | --- | ||||
Clone Of: | |||||||
: | 1557922 (view as bug list) | Environment: | |||||
Last Closed: | 2018-10-30 09:52:39 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1557922 | ||||||
Attachments: |
|
please provider libvirt version and try with an older one. It smells like a libvirt bug. This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP. (In reply to Dan Kenigsberg from comment #1) > please provider libvirt version and try with an older one. It smells like a > libvirt bug. - libvirt versions in which it is working properly - libvirt-client-3.2.0-14.el7_4.9.x86_64 (rhel 7.4) libvirt-client-2.0.0-10.el7_3.11.x86_64 (rhel 7.3) On vdsm 3.6 - vdsm-4.17.44-2.el7ev.noarch (both in engine 3.6 and RHV 4.2!!!) - libvirt version in which it doesn't works - libvirt-client-3.9.0-13.el7.x86_64 (rhel 7.5) On vdsm 4.1/4.2 (in engine 4.1/4.2!) So yes, looks like a regression on libvirt side indeed. Dan, can you moce it to the correct component please? Michael, can you share your domxml before the attempt to modify the <interface>? Indeed, broken by v3.7.0-14-gc57f3fd2f8: commit c57f3fd2f8999d17e01272246b26d31179ab7b6e Author: Michal Privoznik <mprivozn> AuthorDate: Tue Sep 5 16:24:14 2017 +0200 Commit: Michal Privoznik <mprivozn> CommitDate: Wed Sep 6 11:19:30 2017 +0200 conf: Validate device on update-device https://bugzilla.redhat.com/show_bug.cgi?id=1439991 Whenever a device is being updated via virDomainUpdateDeviceFlags() API, we parse the device XML and ideally run some generic checks to validate the configuration (e.g. if device defines per-device boot order but the domain has os/boot element already). Well, that's the theory - due to a missing check we've jumped early from that check function. Signed-off-by: Michal Privoznik <mprivozn> Reviewed-by: Erik Skultety <eskultet> (In reply to Dan Kenigsberg from comment #6) > Michael, can you share your domxml before the attempt to modify the > <interface>? <interface type='bridge'> <mac address='00:00:00:00:00:50'/> <source bridge='net-2'/> <target dev='vnet0'/> <model type='virtio'/> <filterref filter='vdsm-no-mac-spoofing'/> <boot order='2'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> Patches sent upstream for review: https://www.redhat.com/archives/libvir-list/2018-February/msg01089.html An easy reproducer: 1. start a domain with an interface device with <boot order='...'/> 2. take the <interface> element from domain XML and change it (e.g., add <link state="up"/> into it) 3. try to update the device using the modified interface XML from step 3 BTW, don't forget to verify bug 1439991 is still fixed after applying the patches for this bug. This is now fixed upstream by: commit e3497f3fc895c427ce35188f3ddb2bb47a853be9 Refs: v4.0.0-318-ge3497f3fc8 Author: Jiri Denemark <jdenemar> AuthorDate: Thu Feb 22 13:22:39 2018 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Feb 23 11:52:44 2018 +0100 lxc: Drop useless check in live device update Checking the new device definition makes little sense when lxc driver does not support live device update at all. Signed-off-by: Jiri Denemark <jdenemar> commit b6a264e8550d4add3946ec2fd9ae31a76fbf16fe Refs: v4.0.0-319-gb6a264e855 Author: Jiri Denemark <jdenemar> AuthorDate: Thu Feb 22 13:30:27 2018 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Feb 23 11:52:44 2018 +0100 Pass oldDev to virDomainDefCompatibleDevice on device update When calling virDomainDefCompatibleDevice to check a new device during device update, we need to pass the original device which is going to be updated in addition to the new device. Otherwise, the function can report false conflicts. The new argument is currently ignored by virDomainDefCompatibleDevice, but this will change in the following patch. https://bugzilla.redhat.com/show_bug.cgi?id=1546971 Signed-off-by: Jiri Denemark <jdenemar> commit edae027cfe02be4863dcef1e7f0ea0564766e312 Refs: v4.0.0-320-gedae027cfe Author: Jiri Denemark <jdenemar> AuthorDate: Thu Feb 22 13:51:36 2018 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Feb 23 11:52:44 2018 +0100 qemu: Fix updating device with boot order Commit v3.7.0-14-gc57f3fd2f8 prevented adding a <boot order='x'/> element to an inactive domain with global <boot dev='...'/> element. However, as a result of that change updating any device with boot order would fail with 'boot order X is already used by another device', where "another device" is in fact the device which is being updated. To fix this we have to ignore the device which we're about to update when checking for boot order conflicts. https://bugzilla.redhat.com/show_bug.cgi?id=1546971 Signed-off-by: Jiri Denemark <jdenemar> Test on upstream libvirt, hotplug interface will cause libvirtd crash, please help to check, Thank you! Reproduce this bug on libvirt-3.9.0-13.el7.x86_64 as comment 10: 1. start a domain with an interface device with <boot order='2'/> 2. take the <interface> element from domain XML and change it (e.g., add <link state="up"/> into it) 3. try to update the device using the modified interface XML from step 3 # virsh update-device rhel interface.xml error: Failed to update device from interface.xml error: unsupported configuration: boot order 2 is already used by another device Test on upstream libvirt-4.1.0-1.el7.x86_64 as comment 10, update succeed in step 3: # virsh update-device rhel interface.xml Device updated successfully # virsh dumpxml rhel | grep /interface -B9 <interface type='network'> <mac address='52:54:00:1b:a5:b9'/> <source network='default' bridge='virbr0'/> <target dev='vnet0'/> <model type='virtio'/> <link state='up'/> <boot order='2'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> But with libvirt-4.1.0-1.el7.x86_64, hotplug interface/device will cause libvirtd crash. # virsh attach-interface rhel network default error: Disconnected from qemu:///system due to end of file error: Failed to attach interface error: End of file while reading data: Input/output error Backtrace: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fe6b5031700 (LWP 4892)] virDomainDeviceGetInfo (device=device@entry=0x0) at conf/domain_conf.c:3565 3565 switch ((virDomainDeviceType) device->type) { (gdb) bt #0 virDomainDeviceGetInfo (device=device@entry=0x0) at conf/domain_conf.c:3565 #1 0x00007fe6c63691e6 in virDomainDefCompatibleDevice (def=0x7fe6a017d320, dev=dev@entry=0x7fe68c002890, oldDev=oldDev@entry=0x0) at conf/domain_conf.c:27420 #2 0x00007fe6a831b920 in qemuDomainAttachDeviceLiveAndConfig (flags=<optimized out>, xml=<optimized out>, driver=0x7fe6a0100f60, vm=0x7fe6a017bbd0) at qemu/qemu_driver.c:8499 #3 qemuDomainAttachDeviceFlags (dom=<optimized out>, xml=<optimized out>, flags=1) at qemu/qemu_driver.c:8558 ... And this issue is introduced by the patch which mentioned in comment 12. I think libvirt need check if &oldDev is NULL when call virDomainDeviceGetInfo() in function virDomainDefCompatibleDevice(). I tripped across the issues from Comment 15 & 16 doing something similar... See: https://www.redhat.com/archives/libvir-list/2018-March/msg00002.html This was just pushed upstream as: $ git describe 5535856f0e31aa6abf4cda11b5c53c0f164680f0 v4.1.0-rc2-8-g5535856f0 $ git show 5535856f0e31aa6abf4cda11b5c53c0f164680f0 Author: John Ferlan <jferlan> Date: Thu Mar 1 07:08:32 2018 -0500 conf: Fix crash in virDomainDefCompatibleDevice Commit id 'edae027c' blindly assumed that the passed @oldDev parameter would not be NULL when calling virDomainDeviceGetInfo; however, commit id 'b6a264e8' passed NULL for AttachDevice callers under the premise that there wouldn't be a device to check/update against. Signed-off-by: John Ferlan <jferlan> Reproduced on packages: libvirt-3.9.0-14.el7.x86_64 qemu-kvm-rhev-2.10.0-21.el7.x86_64 Verified on packages: libvirt-4.5.0-1.el7.x86_64 qemu-kvm-rhev-2.12.0-7.el7.x86_64 Test steps: 1. Update "link state" of interface - PASS(can update) 2. Update "boot order" of interface when vm is active - PASS(not support) 3. update "boot order" of interface when vm is inactive - PASS(can update) 4. Hotplug with "boot order" with os/boot elements(negative) - PASS(unsupported) 5. Hotplug with duplicate "boot order"(negative) - PASS(report error) 6. Hotplug with "boot order" - PASS 7. Hotplug without "boot order" - PASS 8. Coldplug for 4,5,6,7 - PASS 9. Update elements or attributes other than "boot order" when vm is inactive - PASS(can update) 10. Invalid "boot order"(negative) when vm is active or inactive - PASS(report error) 11. Test cover bug1439991 - PASS Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:3113 |
Created attachment 1398112 [details] Logs Description of problem: Can't set new profile for VM's vNIC - no unplug involved(working properly on 3.6). If trying to choose a new profile for the VM's vNIC we failing with: 2018-02-20 10:36:16,922+0200 WARN (jsonrpc/6) [virt.vm] (vmId='f8abc451-ad7d-4916-ae0b-fca5cfa643bb') Request failed: <?xml version='1.0' encoding='utf-8'?> <interface type="bridge"> <address bus="0x00" domain="0x0000" function="0x0" slot="0x03" type="pci" /> <mac address="00:00:00:00:00:50" /> <model type="virtio" /> <source bridge="ovirtmgmt" /> <filterref filter="vdsm-no-mac-spoofing" /> <link state="up" /> <boot order="2" /> <bandwidth /> </interface> (vm:3106) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3101, in setLinkAndNetwork libvirt.VIR_DOMAIN_AFFECT_LIVE) File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f ret = attr(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2739, in updateDeviceFlags if ret == -1: raise libvirtError ('virDomainUpdateDeviceFlags() failed', dom=self) libvirtError: unsupported configuration: boot order 2 is already used by another device 2018-02-20 10:36:16,923+0200 WARN (jsonrpc/6) [virt.vm] (vmId='f8abc451-ad7d-4916-ae0b-fca5cfa643bb') Rolling back link and net for: net0 (vm:3115) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3110, in setLinkAndNetwork raise SetLinkAndNetworkError(str(e)) SetLinkAndNetworkError: unsupported configuration: boot order 2 is already used by another device 2018-02-20 10:36:16,927+0200 ERROR (jsonrpc/6) [api] FINISH updateDevice error=unsupported configuration: boot order 2 is already used by another device (api:127) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 117, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 372, in updateDevice return self.vm.updateDevice(params) File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3174, in updateDevice return self._updateInterfaceDevice(params) File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3064, in _updateInterfaceDevice custom, specParams): File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__ return self.gen.next() File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 3117, in setLinkAndNetwork libvirt.VIR_DOMAIN_AFFECT_LIVE) File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f ret = attr(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2739, in updateDeviceFlags if ret == -1: raise libvirtError ('virDomainUpdateDeviceFlags() failed', dom=self) libvirtError: unsupported configuration: boot order 2 is already used by another device Version-Release number of selected component (if applicable): vdsm-4.20.18-1.el7ev.x86_64 vdsm-4.19.46-1.el7ev.x86_64 How reproducible: 100% Steps to Reproduce: 1. Create network n1 and attach to host 1. Run VM with 1 vNIC(ovirtmgmt) 2. Edit VM's vNIC and choose n1 profile(don't unplug prior this) 3. Now first unplug and then choose n1 profile and plug back 4. Edit the VM's vNIC again and do the same thing as on step 2, choose ovirtmgmt profile without unplugging first Actual results: 2. Failed with General Exception and libvirt error on vdsm side 4. Succeeded! Expected results: 2. Should succeed like on step 4, this operation working as expected on 3.6 for example, but not on 4.1/4.2 and it's why this is a regression on vdsm side. Additional info: on 3.6 working fine with the same flow