+++ This bug was initially created as a clone of Bug #966618 +++

Description of problem:
After we fail to move a VM in paused state, the cloneImageStructure task cannot be cleaned without manual intervention by GSS (i.e., stopTask/clearTask on the VDS and a restart of the engine). If the user then starts the VM and tries to LSM the disk again, we get a "volume already exists" error.

Version-Release number of selected component (if applicable):
sf17.1

How reproducible:
100%

Steps to Reproduce:
1. On iSCSI storage with two hosts, create a VM and run it with run-once as paused on the HSM.
2. Try to live-migrate the VM.
3. Manually clean the task from the SPM and restart the engine (a hedged sketch of this cleanup follows the logs below).
4. Start the VM.
5. Try to migrate the disk.

Actual results:
The engine fails to clean the task, which leaves LVs behind in the target domain, so we cannot move the images to the new domain without GSS involvement.

Expected results:
Even if LSM fails, we should still be able to clear the task and roll back so that the user can migrate again.

Additional info:
logs

ab5d4053-b881-4503-9ba0-7427b2514801::ERROR::2013-05-23 17:36:09,641::task::850::TaskManager.Task::(_setError) Task=`ab5d4053-b881-4503-9ba0-7427b2514801`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1773, in cloneImageStructure
    image.Image(repoPath).cloneStructure(sdUUID, imgUUID, dstSdUUID)
  File "/usr/share/vdsm/storage/image.py", line 649, in cloneStructure
    self._createTargetImage(sdCache.produce(dstSdUUID), sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 517, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/share/vdsm/storage/blockSD.py", line 610, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 418, in create
    raise se.VolumeAlreadyExists(volUUID)
VolumeAlreadyExists: Volume already exists: ('30e4d88e-e807-4fb9-9b41-39c988c338ad',)
ab5d4053-b881-4503-9ba0-7427b2514801::DEBUG::2013-05-23 17:36:09,642::task::869::TaskManager.Task::(_run) Task=`ab5d4053-b881-4503-9ba0-7427b2514801`::Task._run: ab5d4053-b881-4503-9ba0-7427b2514801 () {} failed - stopping task

engine:

2013-05-23 17:28:32,019 WARN  [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (pool-4-thread-38) Unable to get value of property: vds for class org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand
2013-05-23 17:28:32,033 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-4-thread-38) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateDisk threw an exception: javax.ejb.EJBException: java.lang.ArrayIndexOutOfBoundsException: -1
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx(CMTTxInterceptor.java:191) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:237) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:374) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
    at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:218) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:165) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
    at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:182) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
    at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
    at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:72) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
    at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view8.endAction(Unknown Source) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.EntityAsyncTask.EndCommandAction(EntityAsyncTask.java:147) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.EntityAsyncTask.access$000(EntityAsyncTask.java:26) [engine-bll.jar:]
    at org.ovirt.engine.core.bll.EntityAsyncTask$1.run(EntityAsyncTask.java:107) [engine-bll.jar:]
    at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:71) [engine-utils.jar:]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_19]
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_19]
    at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_19]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_19]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_19]
    at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_19]
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1
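For reference, the manual cleanup from step 3 above, sketched as a small Python script. This is only a hedged illustration: it assumes the vdsClient CLI is available on the SPM host and that stopTask/clearTask are the verbs used; the task UUID is the one from the vdsm log above. Restart the engine afterwards, as described in the steps.

import subprocess

# Task UUID taken from the vdsm log above; substitute the stuck task's UUID.
TASK_ID = "ab5d4053-b881-4503-9ba0-7427b2514801"

# "-s 0" talks to the local vdsm over SSL; stopTask aborts the task,
# clearTask removes it from the task list so the engine can recover.
for verb in ("stopTask", "clearTask"):
    subprocess.check_call(["vdsClient", "-s", "0", verb, TASK_ID])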
Created attachment 766168 [details] vdsm log
Created attachment 766169 [details] engine log

Reproduced using sf18.2; logs attached.

From the logs:

468f6937-14cd-47b7-892b-406c46f167c1::ERROR::2013-06-27 17:23:53,557::task::850::TaskManager.Task::(_setError) Task=`468f6937-14cd-47b7-892b-406c46f167c1`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1773, in cloneImageStructure
    image.Image(repoPath).cloneStructure(sdUUID, imgUUID, dstSdUUID)
  File "/usr/share/vdsm/storage/image.py", line 649, in cloneStructure
    self._createTargetImage(sdCache.produce(dstSdUUID), sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 517, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/share/vdsm/storage/blockSD.py", line 609, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 418, in create
    raise se.VolumeAlreadyExists(volUUID)
VolumeAlreadyExists: Volume already exists: ('3a31b139-b457-4989-941e-0fd99d5021e7',)
468f6937-14cd-47b7-892b-406c46f167c1::DEBUG::2013-06-27 17:23:53,558::task::869::TaskManager.Task::(_run) Task=`468f6937-14cd-47b7-892b-406c46f167c1`::Task._run: 468f6937-14cd-47b7-892b-406c46f167c1 () {} failed - stopping task
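The retry fails for the same reason as the first attempt: the volume create path refuses to overwrite an existing volume, and the leftover LV from the failed LSM is still there. A simplified illustration of that guard (not vdsm's exact code; volume_exists and allocate_lv are hypothetical helpers):

class VolumeAlreadyExists(Exception):
    pass

def create_volume(domain, img_uuid, vol_uuid):
    # Simplified guard mirroring the check in volume.py's create(): as long
    # as the leftover LV from the earlier failed LSM exists on the target
    # domain, every retry fails with the same VolumeAlreadyExists error.
    if domain.volume_exists(img_uuid, vol_uuid):  # hypothetical helper
        raise VolumeAlreadyExists(vol_uuid)
    domain.allocate_lv(vol_uuid)  # hypothetical helper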
The original issue was fixed; the remaining issue is that vdsm does not clean up the leftover volumes of the disk on the target domain. We still need to reproduce this and understand under what conditions it happens.
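To make the gap concrete, a minimal sketch (not vdsm's actual code) of the kind of best-effort rollback that is missing: after cloneImageStructure fails partway, the volumes already created for the image on the destination domain should be removed so a later LSM can retry. dest_domain and its getVolumes/deleteVolume helpers are hypothetical stand-ins.

import logging

log = logging.getLogger("lsm.rollback")

def rollback_failed_clone(dest_domain, img_uuid):
    """Best-effort removal of leftover volumes after a failed image clone."""
    # The cleanup must tolerate partial state and never raise, so the task
    # can still be cleared and the user can retry the migration.
    try:
        leftovers = dest_domain.getVolumes(img_uuid)  # hypothetical helper
    except Exception:
        log.exception("cannot list volumes of image %s", img_uuid)
        return
    for vol_uuid in leftovers:
        try:
            dest_domain.deleteVolume(img_uuid, vol_uuid)  # hypothetical helper
        except Exception:
            log.exception("cannot delete leftover volume %s", vol_uuid)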
*** Bug 970974 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1048.html