Bug 1459448 - Live storage migration failed due to 'Cannot get parent volume'
Summary: Live storage migration failed due to 'Cannot get parent volume'
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.1.6
Assignee: Benny Zlotnik
QA Contact: Lilach Zitnitski
URL:
Whiteboard:
Depends On: 1422508
Blocks:
 
Reported: 2017-06-07 08:03 UTC by Eyal Shenitzky
Modified: 2017-08-16 14:58 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-16 14:58:43 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: blocker+


Attachments
engine and vdsm logs (2.59 MB, application/x-gzip), 2017-06-07 08:03 UTC, Eyal Shenitzky

Description Eyal Shenitzky 2017-06-07 08:03:36 UTC
Created attachment 1285703 [details]
engine and vdsm logs

Description of problem:
Live storage migration failed when migrating a VM's bootable disk
due to an error in VDSM:

2017-06-07 00:50:11,233 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler9) [52d89b71] SPMAsyncTask::PollTask: Polling task '7ea3330c-a4d1-4a22-839b-ecc3adfdda5b' (Parent Command 'CreateImagePlaceholder', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2017-06-07 00:50:11,236 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler9) [52d89b71] BaseAsyncTask::logEndTaskFailure: Task '7ea3330c-a4d1-4a22-839b-ecc3adfdda5b' (Parent Command 'CreateImagePlaceholder', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed in vdscommand to HSMGetAllTasksStatusesVDS, error = Cannot get parent volume',
-- Exception: 'VDSGenericException: VDSErrorException: Failed in vdscommand to HSMGetAllTasksStatusesVDS, error = Cannot get parent volume'
2017-06-07 00:50:11,237 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler9) [52d89b71] CommandAsyncTask::endActionIfNecessary: All tasks of command 'ab219753-e1e6-4b98-a1ef-a8f31cc38f71' has ended -> executing 'endAction'
2017-06-07 00:50:11,237 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (DefaultQuartzScheduler9) [52d89b71] CommandAsyncTask::endAction: Ending action for '1' tasks (command ID: 'ab219753-e1e6-4b98-a1ef-a8f31cc38f71'): calling endAction '.
2017-06-07 00:50:11,238 INFO  [org.ovirt.engine.core.bll.tasks.CommandAsyncTask] (org.ovirt.thread.pool-6-thread-6) [52d89b71] CommandAsyncTask::endCommandAction [within thread] context: Attempting to endAction 'CreateImagePlaceholder', executionIndex: '0'
2017-06-07 00:50:11,244 ERROR [org.ovirt.engine.core.bll.storage.lsm.CreateImagePlaceholderCommand] (org.ovirt.thread.pool-6-thread-6) [5b9e22e1] Ending command 'org.ovirt.engine.core.bll.storage.lsm.CreateImagePlaceholderCommand' with failure.

Version-Release number of selected component (if applicable):
Engine - 4.0.7.5-0.1.el7ev
VDSM - 

How reproducible:
50%

Steps to Reproduce:
1. Create a VM with a bootable disk, created from a template as a thin copy
2. Start the VM
3. Move the disk to a different storage domain

Actual results:
Migration failed with the above error.

Expected results:
Migration should end successfully.

Additional info:
Engine and VDSM logs attached

Comment 1 Eyal Shenitzky 2017-06-07 08:06:38 UTC
VDSM version - 4.19.15-1.el7ev.x86_64

Error on vdsm:

2017-06-07 00:49:01,496+0300 INFO  (tasks/7) [storage.ThreadPool.WorkerThread] START task 7ea3330c-a4d1-4a22-839b-ecc3adfdda5b (cmd=<bound method Task.commit of <storage.task.Task instance at 0x3509b00>>, args=None) (threadPool:208)
2017-06-07 00:49:01,610+0300 INFO  (tasks/7) [storage.Image] sdUUID=378bfb14-598d-4310-865b-4d53b37be51e imgUUID=8563aeec-bd5c-4fd3-88be-5af1082f229a chain=[<storage.blockVolume.BlockVolume object at 0x3642a90>, <storage.blockVolume.BlockVolume object at 0x36424d0>]  (image:285)
2017-06-07 00:49:01,705+0300 INFO  (tasks/7) [storage.Image] sdUUID=378bfb14-598d-4310-865b-4d53b37be51e imgUUID=8563aeec-bd5c-4fd3-88be-5af1082f229a chain=[<storage.blockVolume.BlockVolume object at 0x36b70d0>, <storage.blockVolume.BlockVolume object at 0x36b7090>]  (image:285)
2017-06-07 00:49:01,803+0300 INFO  (tasks/7) [storage.Image] sdUUID=16c793ca-cb18-4ba8-8172-a4a58b2b03f2 imgUUID=a49946d3-f012-4b26-a1a2-7be4b7db1d0a chain=[<storage.glusterVolume.GlusterVolume object at 0x37551d0>]  (image:285)
2017-06-07 00:49:01,876+0300 INFO  (tasks/7) [storage.Image] Create placeholder /rhev/data-center/92859452-7cca-4347-87b5-300474ef6ed5/16c793ca-cb18-4ba8-8172-a4a58b2b03f2/images/8563aeec-bd5c-4fd3-88be-5af1082f229a for image's volumes (image:152)
2017-06-07 00:49:01,905+0300 ERROR (tasks/7) [storage.Volume] Unexpected error (volume:1040)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/volume.py", line 1030, in create
    volParent.share(imgPath)
  File "/usr/share/vdsm/storage/volume.py", line 916, in share
    raise se.VolumeNonShareable(self)
VolumeNonShareable: Volume cannot be shared, it's not Shared/Template volume: [u'sdUUID: 16c793ca-cb18-4ba8-8172-a4a58b2b03f2', 'imgUUID: a49946d3-f012-4b26-a1a2-7be4b7db1d0a', 'volUUID: 51c7cf5a-4d24-4726-8828-a321006f298a']
2017-06-07 00:49:01,906+0300 ERROR (tasks/7) [storage.Image] Unexpected error (image:461)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 444, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/share/vdsm/storage/sd.py", line 763, in createVolume
    initialSize=initialSize)
  File "/usr/share/vdsm/storage/volume.py", line 1043, in create
    (srcVolUUID, volUUID, e))
VolumeCannotGetParent: Cannot get parent volume: ("Couldn't get parent 51c7cf5a-4d24-4726-8828-a321006f298a for volume 0331bf69-3328-4164-b691-590f183c04e3: Volume cannot be shared, it's not Shared/Template volume: [u'sdUUID: 16c793ca-cb18-4ba8-8172-a4a58b2b03f2', 'imgUUID: a49946d3-f012-4b26-a1a2-7be4b7db1d0a', 'volUUID: 51c7cf5a-4d24-4726-8828-a321006f298a']",)
2017-06-07 00:49:01,907+0300 ERROR (tasks/7) [storage.TaskManager.Task] (Task='7ea3330c-a4d1-4a22-839b-ecc3adfdda5b') Unexpected error (task:870)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 877, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 333, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/securable.py", line 79, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1712, in cloneImageStructure
    img.cloneStructure(sdUUID, imgUUID, dstSdUUID)
  File "/usr/share/vdsm/storage/image.py", line 670, in cloneStructure
    self._createTargetImage(sdCache.produce(dstSdUUID), sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 444, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/share/vdsm/storage/sd.py", line 763, in createVolume
    initialSize=initialSize)
  File "/usr/share/vdsm/storage/volume.py", line 1043, in create
    (srcVolUUID, volUUID, e))
VolumeCannotGetParent: Cannot get parent volume: ("Couldn't get parent 51c7cf5a-4d24-4726-8828-a321006f298a for volume 0331bf69-3328-4164-b691-590f183c04e3: Volume cannot be shared, it's not Shared/Template volume: [u'sdUUID: 16c793ca-cb18-4ba8-8172-a4a58b2b03f2', 'imgUUID: a49946d3-f012-4b26-a1a2-7be4b7db1d0a', 'volUUID: 51c7cf5a-4d24-4726-8828-a321006f298a']",)
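The failure path in the traceback above can be summarized with a minimal Python sketch. This is illustrative only, not the actual VDSM source: the classes and the create_volume function below are simplified stand-ins for the guard in volume.py's share() and the error wrapping in create().

```python
# Illustrative sketch of the VDSM failure path seen above (not real VDSM code).
# share() refuses to share a volume unless it is a Shared/Template volume;
# during LSM target-image creation, the resulting VolumeNonShareable error
# is wrapped into VolumeCannotGetParent, which is what the engine logs.

SHARED = "SHARED"
LEAF = "LEAF"


class VolumeNonShareable(Exception):
    pass


class VolumeCannotGetParent(Exception):
    pass


class Volume(object):
    def __init__(self, volUUID, volType):
        self.volUUID = volUUID
        self.volType = volType

    def share(self, imgPath):
        # Only Shared/Template volumes may be shared into another image
        # (mirrors the check that raised VolumeNonShareable at volume.py:916).
        if self.volType != SHARED:
            raise VolumeNonShareable(
                "Volume cannot be shared, it's not Shared/Template volume: %s"
                % self.volUUID)


def create_volume(parent, volUUID, imgPath):
    # Mirrors the wrapping seen at volume.py:1030-1043: the share() failure
    # surfaces as 'Cannot get parent volume' in the engine log.
    try:
        parent.share(imgPath)
    except VolumeNonShareable as e:
        raise VolumeCannotGetParent(
            "Couldn't get parent %s for volume %s: %s"
            % (parent.volUUID, volUUID, e))
```

In the logged failure, the chosen parent (51c7cf5a-...) was a LEAF volume on the destination domain rather than a Shared/Template volume, so the share step failed and the CreateImagePlaceholder task ended with 'cleanSuccess' plus the wrapped error.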

Comment 2 Allon Mureinik 2017-06-08 10:33:53 UTC
(In reply to Eyal Shenitzky from comment #0)
> Version-Release number of selected component (if applicable):
> Engine - 4.0.7.5-0.1.el7ev

(In reply to Eyal Shenitzky from comment #1)
> VDSM version - 4.19.15-1.el7ev.x86_64

These are quite old versions...
Can you reproduce this with a modern oVirt 4.1.z version?

Comment 3 Red Hat Bugzilla Rules Engine 2017-06-08 10:34:00 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 4 Eyal Shenitzky 2017-06-11 05:29:18 UTC
The engine version in the bug description is wrong;
the correct version is 4.1.3.1-0.1.el7

Comment 7 Allon Mureinik 2017-07-02 20:38:17 UTC
4.1.4 is planned as a minimal, fast, z-stream version to fix any open issues we may have in supporting the upcoming EL 7.4.

Pushing out anything unrelated, although if there's a minimal/trivial, SAFE fix that's ready on time, we can consider introducing it in 4.1.4.

Comment 11 Allon Mureinik 2017-08-07 12:26:31 UTC
Moving out to 4.1.6 so this doesn't block 4.1.5.

QE - can you advise how to reproduce this?
Does this happen consistently/frequently in your envs?

Comment 12 Raz Tamir 2017-08-07 12:34:29 UTC
No,

We are currently seeing only bug #1422508 in LSM operations, so it might hide other issues

Comment 13 Allon Mureinik 2017-08-07 12:42:55 UTC
(In reply to Raz Tamir from comment #12)
> No,
> 
> We are currently seeing only bug #1422508 in LSM operations, so it might
> hide other issues

Let's retest once that one is fixed.
Marking as blocked.

Comment 14 Allon Mureinik 2017-08-09 07:49:41 UTC
(In reply to Allon Mureinik from comment #13)
> (In reply to Raz Tamir from comment #12)
> > No,
> > 
> > We are currently seeing only bug #1422508 in LSM operations, so it might
> > hide other issues
> 
> Let's retest once that one is fixed.
> Marking as blocked.

Raz - bug #1422508 should be fixed in the upcoming 4.1.5 build.
Once you get such a build, can someone from your team re-run the automation suite and attach updated logs to this BZ (or close it if we can't reproduce it)?

Thanks!

Comment 15 Raz Tamir 2017-08-14 07:56:50 UTC
Sure,

Lilach, can you please check in our last execution if we see such error?

Comment 16 Lilach Zitnitski 2017-08-16 06:12:26 UTC
(In reply to Raz Tamir from comment #15)
> Sure,
> 
> Lilach, can you please check in our last execution if we see such error?

I haven't seen it in the last few executions.

Comment 17 Allon Mureinik 2017-08-16 14:58:43 UTC
(In reply to Lilach Zitnitski from comment #16)
> (In reply to Raz Tamir from comment #15)
> > Sure,
> > 
> > Lilach, can you please check in our last execution if we see such error?
> 
> I haven't seen it in the last few executions.

Thanks Lilach.
Based on the comments above, I'm closing this bug as unreproducible. If anyone is able to reproduce it, please reopen and attach the engine and vdsm logs.
