engine: after a failed LSM of a disk for a VM in paused state, engine fails to clean the cloneImageStructure task with ArrayIndexOutOfBoundsException: -1 (can't migrate disks because of orphan images on target domain)
Created attachment 752256 (logs)
Description of problem:
After a failed attempt to move the disk of a VM in paused state, the cloneImageStructure task cannot be cleaned without manual intervention by GSS (stopTask/clearTask in vds, followed by a restart of the engine).
If the user then starts the VM and tries to LSM the disk again, the operation fails with a "Volume already exists" error.
Version-Release number of selected component (if applicable):
sf17.1
How reproducible:
100%
Steps to Reproduce:
1. On iSCSI storage with two hosts, create a VM and start it with Run Once as paused on the HSM host
2. Try to live migrate the VM's disk
3. Manually clean the task from the SPM and restart the engine
4. Start the VM
5. Try to migrate the disk
Actual results:
The engine fails to clean the task, which leaves LVs in the target domain, so the images cannot be moved to the new domain without GSS involvement.
Expected results:
Even if LSM fails, the engine should still be able to clear the task and roll back so that the user can migrate again.
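To illustrate the expected roll-back behaviour, here is a minimal sketch, not vdsm or engine code. `FakeDomain`, `clone_structure`, and the `fail_after` parameter are hypothetical names invented for this example; the only thing taken from the logs is the `VolumeAlreadyExists` error name. The idea is that a failed clone removes whatever it already created on the target, so a retry does not hit "Volume already exists".

```python
class VolumeAlreadyExists(Exception):
    """Stand-in for the storage error seen in the vdsm traceback."""


class FakeDomain:
    """Hypothetical in-memory target domain; not a vdsm class."""

    def __init__(self):
        self.volumes = set()

    def create_volume(self, vol_id):
        if vol_id in self.volumes:
            raise VolumeAlreadyExists(vol_id)
        self.volumes.add(vol_id)

    def delete_volume(self, vol_id):
        self.volumes.discard(vol_id)


def clone_structure(domain, vol_ids, fail_after=None):
    """Create every volume, or remove the ones created if any step fails.

    fail_after is an assumption of this sketch: it simulates a failure
    before creating the volume at that index.
    """
    created = []
    try:
        for i, vol_id in enumerate(vol_ids):
            if fail_after is not None and i == fail_after:
                raise RuntimeError("simulated failure during clone")
            domain.create_volume(vol_id)
            created.append(vol_id)
    except Exception:
        # Roll back in reverse order so no orphan volumes stay behind.
        for vol_id in reversed(created):
            domain.delete_volume(vol_id)
        raise
```

With this pattern, a second call to `clone_structure` on the same domain after a failure starts from a clean target instead of finding leftover volumes.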
Additional info: logs
ab5d4053-b881-4503-9ba0-7427b2514801::ERROR::2013-05-23 17:36:09,641::task::850::TaskManager.Task::(_setError) Task=`ab5d4053-b881-4503-9ba0-7427b2514801`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1773, in cloneImageStructure
    image.Image(repoPath).cloneStructure(sdUUID, imgUUID, dstSdUUID)
  File "/usr/share/vdsm/storage/image.py", line 649, in cloneStructure
    self._createTargetImage(sdCache.produce(dstSdUUID), sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 517, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/share/vdsm/storage/blockSD.py", line 610, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 418, in create
    raise se.VolumeAlreadyExists(volUUID)
VolumeAlreadyExists: Volume already exists: ('30e4d88e-e807-4fb9-9b41-39c988c338ad',)
ab5d4053-b881-4503-9ba0-7427b2514801::DEBUG::2013-05-23 17:36:09,642::task::869::TaskManager.Task::(_run) Task=`ab5d4053-b881-4503-9ba0-7427b2514801`::Task._run: ab5d4053-b881-4503-9ba0-7427b2514801 () {} failed - stopping task
engine:
2013-05-23 17:28:32,019 WARN [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (pool-4-thread-38) Unable to get value of property: vds for class org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand
2013-05-23 17:28:32,033 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-4-thread-38) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateDisk threw an exception: javax.ejb.EJBException: java.lang.ArrayIndexOutOfBoundsException: -1
at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx(CMTTxInterceptor.java:191) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:237) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:374) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:218) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:165) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:182) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:72) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view8.endAction(Unknown Source) [engine-bll.jar:]
at org.ovirt.engine.core.bll.EntityAsyncTask.EndCommandAction(EntityAsyncTask.java:147) [engine-bll.jar:]
at org.ovirt.engine.core.bll.EntityAsyncTask.access$000(EntityAsyncTask.java:26) [engine-bll.jar:]
at org.ovirt.engine.core.bll.EntityAsyncTask$1.run(EntityAsyncTask.java:107) [engine-bll.jar:]
at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:71) [engine-utils.jar:]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_19]
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_19]
at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_19]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_19]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_19]
at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_19]
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1