Created attachment 1339322
logs from engine and hypervisor

Description of problem:
Powering off an HA VM with a lease on NFS storage fails with the following exception. The failure occurs after the VM has been paused following a connection loss and restore between the host and the storage where the lease and the VM disk reside.

2017-10-16 14:44:31,471+0300 ERROR (jsonrpc/7) [api] FINISH destroy error=VM '12f923ff-7a3e-480a-b67c-dbc559770ddb' was not defined yet or was undefined (api:127)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 117, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 312, in destroy
    res = self.vm.destroy(gracefulAttempts)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4944, in destroy
    self._deleteVm()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4933, in _deleteVm
    self._undefine_domain()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2259, in _undefine_domain
    self._dom.undefine()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 47, in __getattr__
    % self.vmid)
NotConnectedError: VM '12f923ff-7a3e-480a-b67c-dbc559770ddb' was not defined yet or was undefined
2017-10-16 14:44:31,478+0300 INFO (jsonrpc/7) [api.virt] FINISH destroy return={'status': {'message': 'General Exception: ("VM \'12f923ff-7a3e-480a-b67c-dbc559770ddb\' was not defined yet or was undefined",)', 'code': 100}} from=::ffff:10.35.161.182,35806, flow_id=cf15a7af-c06d-4d7b-a007-3e68cd57b4a6 (api:52)

Version-Release number of selected component (if applicable):
vdsm-4.20.3-175.git76c0aff.el7.centos.x86_64
libvirt-daemon-3.2.0-14.el7_4.3.x86_64
sanlock-3.5.0-1.el7.x86_64
ovirt-engine-4.2.0-0.0.master.20171013142622.git15e767c.el7.centos.noarch

How reproducible:
Always, on RHV automation test case https://polarion.engineering.redhat.com/polarion/redirect/project/RHEVM3/workitem?id=RHEVM-17621

Steps to Reproduce:
1. Create a new HA VM with a storage lease
2. Start the VM
3. Block the connection from all hosts in the DC to the storage domain
4. Block the connection from the engine to the host -> the VM becomes UNKNOWN and does not fail over to another host
5. Power off the VM

Actual results:
Powering off the VM fails with the exception shown above.

Expected results:
VM power off should succeed.

Additional info:
engine.log:

b4a6] Command 'DestroyVDSCommand(HostName = host_mixed_2, DestroyVmVDSCommandParameters:{hostId='3ebf2639-0e23-4d8f-85da-cdb4d427d30d', vmId='12f923ff-7a3e-480a-b67c-dbc559770ddb', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'})' execution failed: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = General Exception: ("VM '12f923ff-7a3e-480a-b67c-dbc559770ddb' was not defined yet or was undefined",), code = 100

libvirtd.log from around the same time:

2017-10-16 11:44:22.789+0000: 1476: error : qemuOpenFileAs:3176 : Failed to open file '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge2__nfs__2/e5525ed6-0970-49da-b022-5d9c342d928b/images/97bfb9f6-850a-4bc7-97b6-2ed78e88a54d/c5c20431-d79d-4b95-b8b2-6685211716bb': No such file or directory
2017-10-16 11:44:22.789+0000: 1476: error : qemuDomainStorageOpenStat:11478 : cannot stat file '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge2__nfs__2/e5525ed6-0970-49da-b022-5d9c342d928b/images/97bfb9f6-850a-4bc7-97b6-2ed78e88a54d/c5c20431-d79d-4b95-b8b2-6685211716bb': Bad file descriptor
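
For readers who do not know the vdsm code, the failure mode in the traceback can be sketched roughly as follows. This is a minimal illustration, not vdsm's actual implementation and not the change made in any fix: the traceback only shows that the object behind self._dom raises NotConnectedError from __getattr__ when undefine() is looked up. The class DisconnectedDomain, the function undefine_domain, and the choice to tolerate the error are all assumptions made up for this example.

```python
class NotConnectedError(Exception):
    """Stand-in for the error type named in the traceback (hypothetical definition)."""


class DisconnectedDomain:
    """Hypothetical placeholder used while no libvirt domain is attached to the VM."""

    def __init__(self, vmid):
        self.vmid = vmid

    def __getattr__(self, name):
        # __getattr__ is only consulted for attributes that are not found
        # normally, so any libvirt call such as undefine() ends up here.
        raise NotConnectedError(
            "VM %r was not defined yet or was undefined" % self.vmid)


def undefine_domain(dom):
    """Try to undefine the domain.

    Returns True if undefine() ran, False if there was no domain to undefine.
    Treating the missing domain as "nothing to do" is an assumption of this
    sketch, chosen so that a destroy flow could continue its cleanup.
    """
    try:
        dom.undefine()
    except NotConnectedError:
        return False
    return True
```

With a placeholder object in place of a real domain, calling dom.undefine() directly raises the same message seen in the log, while undefine_domain(dom) returns False and lets the caller proceed.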
Powering off VMs is not a storage flow; moving to virt. Francesco, can you take a look?
Should be fixed by https://bugzilla.redhat.com/show_bug.cgi?id=1524119, more specifically by commit 764aa7a15d576652388c0d98639b3d8b8ec9005c

*** This bug has been marked as a duplicate of bug 1524119 ***