Bug 1134256
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | Vdsm returns successfully even though hotunplugNic does not complete in libvirt | | |
| Product | Red Hat Enterprise Virtualization Manager | Reporter | akotov |
| Component | vdsm | Assignee | Marcin Mirecki <mmirecki> |
| Status | CLOSED ERRATA | QA Contact | Michael Burman <mburman> |
| Severity | high | Docs Contact | |
| Priority | high | | |
| Version | 3.4.1-1 | CC | alkaplan, bazulay, danken, iheim, jbuchta, knoel, lpeer, lsurette, mburman, myakove, rbalakri, Rhev-m-bugs, sbonazzo, ycui, yeylon, ykaul, ylavi |
| Target Milestone | ovirt-3.6.2 | | |
| Target Release | 3.6.2 | | |
| Hardware | Unspecified | | |
| OS | Unspecified | | |
| Whiteboard | | | |
| Fixed In Version | | Doc Type | Bug Fix |
| Doc Text | | Story Points | --- |
| Clone Of | | Environment | |
| Last Closed | 2016-03-09 19:24:48 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | Network | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
**Description** (akotov, 2014-08-27 07:58:47 UTC)
(In reply to akotov from comment #0)

> I think it is technically possible without an OS booted, and we should do it. Example:
> - power off the same VM with the NIC unplugged
> - power on the VM, still no OS booted
> - plug in the NIC. It works.

Hot(un)plug requires guest support and participation. How can you tell the plug-in worked? Most probably, qemu is still waiting (in vain) for guest cooperation. A following attempt to unplug/plug the device would show the error.

I do think that Engine should suggest a possible reason for plug/unplug failure; something along the lines of: "An error occurred, but the cause is unknown. (Do you have a guest operating system that supports hot-plugging?)"

Another idea (a bit more complex, and harmful to backward compatibility) is to limit hot(un)plug to VMs with a running guest OS.

---

I don't mind extending the error message with a warning that the guest must have a compatible OS installed. Meni, can you please try and reproduce this issue (with a guest OS, and without)?

---

mburman, please try to reproduce this issue.

---

When a VM runs without an OS installed, it is possible to add and hotplug NICs to the VM without a problem, and it is possible to hotunplug the NICs as well. But trying to hotplug back a NIC that has just been unplugged fails with the error "Error while executing action Edit VM Interface properties: Failed to activate VM Network Interface." because of the attempt to use a PCI slot twice.

engine.log:

```
2015-05-04 14:06:35,908 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-11) [5a07daa1] Correlation ID: 5a07daa1, Call Stack: null, Custom Event ID: -1, Message: Failed to plug Network Interface nic1 (VirtIO) to VM hotplug_test_no_OS. (User: admin@internal)
2015-05-04 14:06:35,984 INFO [org.ovirt.engine.core.bll.network.vm.UpdateVmInterfaceCommand] (ajp--127.0.0.1-8702-11) [5a07daa1] Command [id=4819fa4e-5db9-4406-8011-8c2f6e287acf]: Compensating DELETED_OR_UPDATED_ENTITY of org.ovirt.engine.core.common.businessentities.network.VmNic; snapshot: id=4125c295-3b26-4e29-a9f6-8e8b15ca7151.
2015-05-04 14:06:36,035 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-11) [5a07daa1] Correlation ID: 14705b01, Call Stack: null, Custom Event ID: -1, Message: Failed to update Interface nic1 (VirtIO) for VM hotplug_test_no_OS. (User: admin@internal)
```

vdsm.log:

```
Thread-153315::INFO::2015-05-04 14:06:39,021::vm::2297::vm.Vm::(hotplugNic) vmId=`1c5cf02a-b32c-4679-a4fe-7fba119086a7`::Hotplug NIC xml: <interface type="bridge">
  <address bus="0x00" domain="0x0000" function="0x0" slot="0x03" type="pci"/>
  <mac address="00:00:00:01:00:02"/>
  <model type="virtio"/>
  <source bridge="ovirtmgmt"/>
  <link state="up"/>
  <bandwidth/>
</interface>
JsonRpc (StompReactor)::DEBUG::2015-05-04 14:06:39,090::stompReactor::94::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2015-05-04 14:06:39,093::__init__::482::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-153316::DEBUG::2015-05-04 14:06:39,094::stompReactor::158::yajsonrpc.StompServer::(send) Sending response
Thread-153315::ERROR::2015-05-04 14:06:39,120::vm::2302::vm.Vm::(hotplugNic) vmId=`1c5cf02a-b32c-4679-a4fe-7fba119086a7`::Hotplug failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2300, in hotplugNic
    self._dom.attachDevice(nicXml)
  File "/usr/share/vdsm/virt/vm.py", line 617, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 126, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 500, in attachDevice
    if ret == -1: raise libvirtError ('virDomainAttachDevice() failed', dom=self)
libvirtError: internal error: Attempted double use of PCI slot 0000:00:03.0 (may need "multifunction='on'" for device on function 0)
```

See also https://bugzilla.redhat.com/show_bug.cgi?id=1159729 (an RFE I opened regarding this issue) and https://bugzilla.redhat.com/show_bug.cgi?id=1168288.

---

When hot-plugging fails, we should warn the end user that the failure may have happened due to a former failure to unplug. Maybe something like "... hot(un)plug may have failed due to a non-cooperating guest operating system".

---

Tested on 3.6.2-0.1.el6 and failedQA.

- Failing to hotunplug a vNIC while the VM has no OS yields the error "Error while executing action Edit VM Interface properties: Failed to deactivate VM Network Interface." This scenario is covered, but things start to break when this operation fails and the VM is then migrated to a second host.

- The next scenario is 100% reproducible and should be covered:

a. Run a VM with one vNIC, no OS installed, and Network part of the boot order.

```
2015-12-28 08:44:43,813 ERROR [org.ovirt.engine.core.vdsbroker.VmsMonitoring] (DefaultQuartzScheduler_Worker-79) [77a1e10f] VM '5a4f6180-cc65-48fe-94d4-f4e7e86a2406' managed non pluggable device was removed unexpectedly from libvirt: 'VmDevice:{id='VmDeviceId:{deviceId='6dc1bd95-c938-4beb-99f1-fb1883c84156', vmId='5a4f6180-cc65-48fe-94d4-f4e7e86a2406'}', device='spice', type='GRAPHICS', bootOrder='0', specParams='[]', address='', managed='true', plugged='false', readOnly='false', deviceAlias='', customProperties='[]', snapshotId='null', logicalName='null', usingScsiReservation='false'}'
```

b. Fail to hotunplug the vNIC with the error "Error while executing action Edit VM Interface properties: Failed to deactivate VM Network Interface."
c. Migrate the VM to the second host; this succeeds.
d. Once the migration finishes, the vNIC is reported as unplugged.
e. Hotplug the vNIC back (this succeeds) and migrate the VM back to the first host.
f. Try to migrate the VM again to the second host.

Result: the migration fails and the VM is suddenly shut down.

```
2015-12-28 08:50:11,986 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-26) [b09b28e] VM '5a4f6180-cc65-48fe-94d4-f4e7e86a2406(no_os_vm) is running in db and not running in VDS 'silver-vdsb.qa.lab.tlv.redhat.com'
2015-12-28 08:50:11,986 INFO [org.ovirt.engine.core.vdsbroker.VmAnalyzer] (DefaultQuartzScheduler_Worker-26) [b09b28e] add VM 'no_os_vm' to rerun treatment
2015-12-28 08:50:12,243 ERROR [org.ovirt.engine.core.vdsbroker.VmsMonitoring] (DefaultQuartzScheduler_Worker-26) [b09b28e] Rerun VM '5a4f6180-cc65-48fe-94d4-f4e7e86a2406'. Called from VDS 'silver-vdsb.qa.lab.tlv.redhat.com'
2015-12-28 08:50:12,292 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-22) [b09b28e] Correlation ID: 2b48a5e1, Job ID: 67b12f07-4bdc-4ea4-8516-54d1e1f79bab, Call Stack: null, Custom Event ID: -1, Message: Migration failed (VM: no_os_vm, Source: silver-vdsa.qa.lab.tlv.redhat.com).
2015-12-28 08:50:12,303 INFO [org.ovirt.engine.core.bll.ProcessDownVmCommand] (org.ovirt.thread.pool-7-thread-17) [1639ca7c] Running command: ProcessDownVmCommand internal: true.
2015-12-28 08:50:12,334 INFO [org.ovirt.engine.core.bll.MigrateVmCommand] (org.ovirt.thread.pool-7-thread-22) [b09b28e] Lock freed to object 'EngineLock:{exclusiveLocks='[5a4f6180-cc65-48fe-94d4-f4e7e86a2406=<VM, ACTION_TYPE_FAILED_VM_IS_BEING_MIGRATED$VmName no_os_vm>]', sharedLocks='null'}'
```

First host:

```
Thread-33482::ERROR::2015-12-28 08:49:56,427::vm::758::virt.vm:_startUnderlyingVm) vmId=`5a4f6180-cc65-48fe-94d4-f4e7e86a2406`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 721, in _startUnderlyingVm
    self._completeIncomingMigration()
  File "/usr/share/vdsm/virt/vm.py", line 2817, in _completeIncomingMigration
    self._domDependentInit()
  File "/usr/share/vdsm/virt/vm.py", line 1787, in _domDependentInit
    self._getUnderlyingVmDevicesInfo()
  File "/usr/share/vdsm/virt/vm.py", line 1717, in _getUnderlyingVmDevicesInfo
    self._getUnderlyingNetworkInterfaceInfo()
  File "/usr/share/vdsm/virt/vm.py", line 4459, in _getUnderlyingNetworkInterfaceInfo
    if nic.macAddr.lower() == mac.lower():
AttributeError: macAddr
```

Second host:

```
periodic/5::ERROR::2015-12-28 08:50:06,937::executor::188::Executor:_execute_task) Unhandled exception in <virt.periodic.NumaInfoMonitor object at 0x2b7cd10>
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/executor.py", line 186, in _execute_task
    callable()
  File "/usr/share/vdsm/virt/periodic.py", line 271, in __call__
    self._execute()
  File "/usr/share/vdsm/virt/periodic.py", line 311, in _execute
    self._vm.updateNumaInfo()
  File "/usr/share/vdsm/virt/vm.py", line 5056, in updateNumaInfo
    self._numaInfo = numaUtils.getVmNumaNodeRuntimeInfo(self)
  File "/usr/share/vdsm/numaUtils.py", line 106, in getVmNumaNodeRuntimeInfo
    _get_vcpu_positioning(vm))
  File "/usr/share/vdsm/numaUtils.py", line 129, in _get_vcpu_positioning
    return vm._dom.vcpus()[0]
  File "/usr/share/vdsm/virt/virdomain.py", line 68, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 124, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 2751, in vcpus
    if ret == -1: raise libvirtError ('virDomainGetVcpus() failed', dom=self)
libvirtError: Domain not found: no domain with matching uuid '5a4f6180-cc65-48fe-94d4-f4e7e86a2406' (no_os_vm)
```

---

(In reply to Michael Burman from comment #10)

One correction to the scenario above: the "managed non pluggable device was removed unexpectedly" error belongs below step d, not step a.

---

Verified on 3.6.2.5-0.1.el6

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html
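The failure mode at the heart of this bug is that `hotplugNic`/`hotunplugNic` returned success before the guest actually attached or released the device. A minimal sketch of the confirmation the comments call for: re-read the domain XML after the detach and only report success once the NIC's MAC is gone. This is not Vdsm's actual code; `fetch_xml` is a hypothetical stand-in for fetching the live libvirt domain XML (e.g. `dom.XMLDesc()`).

```python
import time
import xml.etree.ElementTree as ET


def nic_present(dom_xml, mac):
    """Return True if an <interface> with the given MAC is still in the domain XML."""
    root = ET.fromstring(dom_xml)
    for iface in root.iter("interface"):
        mac_el = iface.find("mac")
        if mac_el is not None and mac_el.get("address", "").lower() == mac.lower():
            return True
    return False


def confirm_unplug(fetch_xml, mac, retries=10, delay=1.0):
    """Poll the domain XML until the NIC disappears, or give up.

    With a cooperating guest OS the device vanishes quickly; with no guest
    OS it may never vanish, in which case we report failure instead of the
    blind success this bug describes.
    """
    for _ in range(retries):
        if not nic_present(fetch_xml(), mac):
            return True
        time.sleep(delay)
    return False
```

With a non-cooperating guest (no OS booted), `confirm_unplug` times out and returns `False`, which is exactly the signal Engine needs in order to keep the vNIC marked as plugged.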
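The "Attempted double use of PCI slot 0000:00:03.0" error arises because the replug XML pins the NIC to slot `0x03`, which libvirt still considers occupied after the incomplete unplug. A hedged sketch of how a caller could scan the domain XML for occupied slots and pick a free one instead; the helper names are illustrative, not a real Vdsm or libvirt API.

```python
import xml.etree.ElementTree as ET


def used_pci_slots(dom_xml):
    """Collect the (bus, slot) pairs already claimed by devices in the domain XML."""
    root = ET.fromstring(dom_xml)
    used = set()
    for addr in root.iter("address"):
        if addr.get("type") == "pci":
            used.add((addr.get("bus"), addr.get("slot")))
    return used


def free_pci_slot(dom_xml, bus="0x00", max_slot=0x1f):
    """Return the first slot on `bus` not present in the domain XML, as hex."""
    used = used_pci_slots(dom_xml)
    for n in range(1, max_slot + 1):
        slot = "0x%02x" % n
        if (bus, slot) not in used:
            return slot
    raise RuntimeError("no free PCI slot on bus %s" % bus)
```

Equivalently, the caller could simply omit the `<address>` element from the hotplug XML and let libvirt assign a free slot itself; pinning the address is what turns a stale slot into a hard failure.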