Bug 1625591

Summary: After importing a KVM VM, removing the VM, and re-importing it, the re-import fails
Product: Red Hat Enterprise Virtualization Manager
Reporter: Steven Rosenberg <srosenbe>
Component: vdsm
Assignee: Steven Rosenberg <srosenbe>
Status: CLOSED CURRENTRELEASE
QA Contact: meital avital <mavital>
Severity: medium
Docs Contact: Rolfe Dlugy-Hegwer <rdlugyhe>
Priority: medium
Version: 4.2.0
CC: gveitmic, lsurette, mavital, mtessun, rbarry, rdlugyhe, smaudet, srevivo, ycui
Target Milestone: ovirt-4.3.0
Flags: lsvaty: testing_plan_complete-
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ovirt-engine-4.3.0_rc
Doc Type: Rebase: Bug Fixes Only
Doc Text:
Previously, after importing and removing a Kernel-based Virtual Machine (KVM), trying to re-import the same virtual machine failed with a "Job ID already exists" error. The current release deletes completed import jobs from the VDSM. You can re-import a virtual machine without encountering the same error.
Story Points: ---
Clone Of:
: 1639742 (view as bug list)
Environment:
Last Closed: 2019-03-22 14:47:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1639742
Attachments:
vdsm.log

Description Steven Rosenberg 2018-09-05 09:45:12 UTC
Created attachment 1481038 [details]
vdsm.log

Description of problem:

After importing a KVM VM successfully and then removing the VM, attempting to re-import the same VM fails.

Version-Release number of selected component (if applicable):


How reproducible:

Always


Steps to Reproduce:
1. Import a KVM Virtual Machine to oVirt.
2. Remove the Virtual Machine from oVirt.
3. Attempt to re-import the same Virtual Machine

Actual results:

The second import fails with an error that the Job ID already exists. The Job ID is the VM ID, which the engine sends to the vdsm both when instantiating the import and when deleting the VM from the list that the vdsm's _add_job routine creates.

The error in the vdsm log is here:

2018-09-05 12:12:44,496+0300 ERROR (jsonrpc/1) [jsonrpc.JsonRpcServer] Internal server error (__init__:611)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 606, in _handle_request
    res = method(**params)
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 201, in _dynamicMethod
    result = fn(*methodArgs)
  File "<string>", line 2, in convertExternalVm
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 48, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1485, in convertExternalVm
    self._irs)
  File "/usr/lib/python2.7/site-packages/vdsm/v2v.py", line 222, in convert_external_vm
    _add_job(job_id, job)
  File "/usr/lib/python2.7/site-packages/vdsm/v2v.py", line 304, in _add_job
    raise JobExistsError("Job %r exists" % job_id)
JobExistsError: Job u'dc5c81ff-35a8-4b90-82aa-b316ebeac65e' exists
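
For context, here is a minimal sketch of the duplicate-job check implied by this traceback. It is a simplified reconstruction, not the actual vdsm v2v code; only the names _add_job, deleteV2VJob, and JobExistsError come from the logs, everything else is illustrative.

# Simplified reconstruction of the v2v job registry behavior; illustrative only.
class JobExistsError(Exception):
    pass

_jobs = {}  # job_id -> job; entries persist until explicitly deleted


def _add_job(job_id, job):
    # The engine reuses the VM id as the job id, so if the entry from the
    # first import was never removed, the re-import is rejected here.
    if job_id in _jobs:
        raise JobExistsError("Job %r exists" % job_id)
    _jobs[job_id] = job


def delete_v2v_job(job_id):
    # Corresponds to the deleteV2VJob verb; in the flow reported here it is
    # only invoked after the second convertExternalVm has already failed.
    _jobs.pop(job_id, None)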


The following error also occurs later in the log:

2018-09-05 12:12:47,910+0300 ERROR (tasks/2) [storage.TaskManager.Task] (Task='fc8b7bf8-8732-4558-b34a-5fdf1ad12cda') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 336, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 1986, in purgeImage
    domain.purgeImage(sdUUID, imgUUID, volsByImg, discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 710, in purgeImage
    self._manifest.purgeImage(sdUUID, imgUUID, volsImgs, discard)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/fileSD.py", line 261, in purgeImage
    raise se.ImageDeleteError("%s %s" % (imgUUID, str(e)))
ImageDeleteError: Could not remove all image's volumes: (u'502e0a88-0918-42ba-aad7-94228bebd39e [Errno 39] Directory not empty',)

Note that the deleteV2VJob function is called after the "Job exists" error:

2018-09-05 12:12:45,584+0300 INFO  (jsonrpc/0) [api.host] START deleteV2VJob(jobid=u'dc5c81ff-35a8-4b90-82aa-b316ebeac65e') from=::ffff:10.35.1.148,40530, flow_id=0852c0ce-26bc-4a57-bcc2-112f1f30f9d9 (api:46)



Expected results:

The import should succeed, as it did the first time.


Additional info:

Attached: vdsm.log

Note: the problem may be in the engine code that initiates the call to the deleteV2VJob function.
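
Whether the fix lands in the engine or in vdsm, the Doc Text above describes the resolution as deleting completed import jobs from the VDSM. Purely for illustration, one way that cleanup could look is sketched below; FINISHED_STATES and the job.status attribute are assumptions, not names from the vdsm source, and the actual patch may instead remove the job when the engine calls deleteV2VJob after the import completes.

# Hypothetical sketch: let a finished job be replaced instead of raising
# JobExistsError. FINISHED_STATES and job.status are assumed names.
class JobExistsError(Exception):
    pass

_jobs = {}
FINISHED_STATES = ("done", "failed", "aborted")


def _add_job(job_id, job):
    existing = _jobs.get(job_id)
    if existing is not None:
        if getattr(existing, "status", None) in FINISHED_STATES:
            # A completed import no longer blocks re-importing the same VM.
            del _jobs[job_id]
        else:
            raise JobExistsError("Job %r exists" % job_id)
    _jobs[job_id] = job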

Comment 1 Germano Veit Michel 2018-09-27 01:48:59 UTC
It looks like the engine is not issuing DeleteV2VJobVDSCommand in some situations. See, for example, flow_id=f2de7633-5cd4-4751-8650-cdd1aa1d3df2 in the attached logs.

It did not reproduce in my environment with the exact same versions, though. Initially I thought it would only happen with 2+ disks, but that does not seem to be the case.

Comment 4 Germano Veit Michel 2018-09-27 02:22:53 UTC
There seem to be two more problems:

1) The engine only deletes the job when the second convertExternalVm is rejected by VDSM (JobExistsError).
2) VDSM returns an error to the engine, but the second convertExternalVm() job actually runs and completes.

Comment 5 meital avital 2018-11-28 15:58:50 UTC
Verified on:
ovirt-engine-4.3.0-0.2.master.20181121071050.gita8fcd23.el7.noarch
qemu-kvm-ev-2.10.0-21.el7_5.7.1.x86_64
vdsm-4.30.3-10.git284ece2.el7.x86_64
libvirt-client-4.5.0-10.el7.x86_64

Verification steps:
1. Import a VM from KVM.
2. Remove the VM.
3. Import the same VM from KVM again.
4. Verify that the VM imported and is running successfully.
5. Repeat steps 1-4 a few times.

Comment 6 RHV bug bot 2018-12-10 15:13:46 UTC
WARN: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[Found non-acked flags: '{'rhevm-4.3-ga': '?'}', ]

For more info please contact: rhv-devops

Comment 7 RHV bug bot 2019-01-15 23:36:21 UTC
WARN: Bug status (VERIFIED) wasn't changed but the following should be fixed:

[Found non-acked flags: '{'rhevm-4.3-ga': '?'}', ]

For more info please contact: rhv-devops

Comment 10 Daniel Gur 2019-08-28 13:14:55 UTC
sync2jira

Comment 11 Daniel Gur 2019-08-28 13:19:58 UTC
sync2jira