Bug 1502768 - HA VM (lease) power off fails while the VM is Paused
Summary: HA VM (lease) power off fails while the VM is Paused
Keywords:
Status: CLOSED DUPLICATE of bug 1524119
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.20.3
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.2
Target Release: ---
Assignee: Dan Kenigsberg
QA Contact: Raz Tamir
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-16 16:04 UTC by Elad
Modified: 2018-01-08 10:10 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-01-08 10:10:01 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-4.2+


Attachments
logs from engine and hypervisor (4.77 MB, application/x-gzip)
2017-10-16 16:04 UTC, Elad

Description Elad 2017-10-16 16:04:01 UTC
Created attachment 1339322
logs from engine and hypervisor

Description of problem:
Powering off an HA VM with a storage lease on NFS fails with the exception below. The VM was paused after the connection between the host and the storage holding the lease and the VM disk was lost and then restored.

2017-10-16 14:44:31,471+0300 ERROR (jsonrpc/7) [api] FINISH destroy error=VM '12f923ff-7a3e-480a-b67c-dbc559770ddb' was not defined yet or was undefined (api:127)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 117, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 312, in destroy
    res = self.vm.destroy(gracefulAttempts)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4944, in destroy
    self._deleteVm()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 4933, in _deleteVm
    self._undefine_domain()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2259, in _undefine_domain
    self._dom.undefine()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 47, in __getattr__
    % self.vmid)
NotConnectedError: VM '12f923ff-7a3e-480a-b67c-dbc559770ddb' was not defined yet or was undefined
2017-10-16 14:44:31,478+0300 INFO  (jsonrpc/7) [api.virt] FINISH destroy return={'status': {'message': 'General Exception: ("VM \'12f923ff-7a3e-480a-b67c-dbc559770ddb\' was not defined yet or was undefined",)', 'code': 100}} from=::ffff:10.35.161.182,35806, flow_id=cf15a7af-c06d-4d7b-a007-3e68cd57b4a6 (api:52)
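
For readers following the traceback above, here is a minimal Python sketch of the failure mode. It assumes, based only on the virdomain.py and vm.py frames, that the VM object holds a placeholder domain whose attribute access raises NotConnectedError when no libvirt domain is attached. The class bodies and the undefine_if_defined helper are hypothetical illustrations, not vdsm code, and not necessarily what the fix for this bug does.

```python
class NotConnectedError(Exception):
    """Stand-in for the exception type named in the traceback (assumed shape)."""


class DisconnectedDomain(object):
    """Hypothetical placeholder used while no libvirt domain is attached.

    Every attribute lookup that is not a real attribute fails, which is the
    behaviour suggested by the __getattr__ frame in virdomain.py.
    """

    def __init__(self, vmid):
        self.vmid = vmid

    def __getattr__(self, name):
        # Only reached for names not found normally (e.g. "undefine").
        raise NotConnectedError(
            "VM %r was not defined yet or was undefined" % self.vmid)


def undefine_if_defined(dom):
    """Hypothetical helper: undefine the domain, treating 'no domain' as done.

    Returns True if undefine() was called, False if there was nothing to undefine.
    """
    try:
        dom.undefine()
    except NotConnectedError:
        # Nothing is defined, so there is nothing left to clean up.
        return False
    return True
```

With a guard of this kind, a destroy request for a VM whose domain is already gone could finish its cleanup instead of returning code 100 to the engine. Whether swallowing the error is safe in every flow is a design question this sketch does not answer.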



Version-Release number of selected component (if applicable):
vdsm-4.20.3-175.git76c0aff.el7.centos.x86_64
libvirt-daemon-3.2.0-14.el7_4.3.x86_64
sanlock-3.5.0-1.el7.x86_64
ovirt-engine-4.2.0-0.0.master.20171013142622.git15e767c.el7.centos.noarch

How reproducible:
Always on RHV automation test case https://polarion.engineering.redhat.com/polarion/redirect/project/RHEVM3/workitem?id=RHEVM-17621

Steps to Reproduce:
1. Create new HA VM with storage lease
2. Start the VM
3. Block connection from all hosts in the DC to the storage domain.
4. Block connection from engine to the host -> VM will become UNKNOWN and won't failover to another host
5. Power off the VM 

Actual results:
Powering off the VM fails with the exception shown in the description.

Expected results:
VM power off should succeed

Additional info:

engine.log:

b4a6] Command 'DestroyVDSCommand(HostName = host_mixed_2, DestroyVmVDSCommandParameters:{hostId='3ebf2639-0e23-4d8f-85da-cdb4d427d30d', vmId='12f923ff-7a3e-480a-b67c-dbc559770ddb', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='false'})' execution failed: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = General Exception: ("VM '12f923ff-7a3e-480a-b67c-dbc559770ddb' was not defined yet or was undefined",), code = 100


libvirtd.log from around the same time:

2017-10-16 11:44:22.789+0000: 1476: error : qemuOpenFileAs:3176 : Failed to open file '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge2__nfs__2/e5525ed6-0970-49da-b022-5d9c342d928b/images/97bfb9f6-850a-4bc7-97b6-2ed78e88a54d/c5c20431-d79d-4b95-b8b2-6685211716bb': No such file or directory
2017-10-16 11:44:22.789+0000: 1476: error : qemuDomainStorageOpenStat:11478 : cannot stat file '/rhev/data-center/mnt/yellow-vdsb.qa.lab.tlv.redhat.com:_Storage__NFS_storage__local__ge2__nfs__2/e5525ed6-0970-49da-b022-5d9c342d928b/images/97bfb9f6-850a-4bc7-97b6-2ed78e88a54d/c5c20431-d79d-4b95-b8b2-6685211716bb': Bad file descriptor

Comment 1 Nir Soffer 2018-01-07 17:10:44 UTC
Powering off VMs is not a storage flow, moving to virt.

Francesco, can you take a look?

Comment 2 Francesco Romani 2018-01-08 10:10:01 UTC
Should be fixed by https://bugzilla.redhat.com/show_bug.cgi?id=1524119, more specifically by commit 764aa7a15d576652388c0d98639b3d8b8ec9005c

*** This bug has been marked as a duplicate of bug 1524119 ***

