Bug 966618 - engine: after we fail to LSM a disk for a vm in pause state engine fails to clean cloneImageStructure task with ArrayIndexOutOfBoundsException: -1 (can't migrate disks because of orphan images on target domain)
Summary: engine: after we fail to LSM a disk for a vm in pause state engine fails to c...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assignee: Daniel Erez
QA Contact: yeylon@redhat.com
URL:
Whiteboard: storage
Duplicates: 968894
Depends On: 987783
Blocks: 972696
 
Reported: 2013-05-23 14:47 UTC by Dafna Ron
Modified: 2016-04-18 06:45 UTC (History)

Fixed In Version: is2
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 972696 (view as bug list)
Environment:
Last Closed: 2014-01-21 22:18:56 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments
logs (4.07 MB, application/x-gzip), 2013-05-23 14:47 UTC, Dafna Ron


Links
System        ID     Private  Priority  Status  Summary  Last Updated
oVirt gerrit  15133  0        None      None    None     Never

Description Dafna Ron 2013-05-23 14:47:21 UTC
Created attachment 752256: logs

Description of problem:

After a live storage migration (LSM) of a disk fails for a VM in paused state, the cloneImageStructure task cannot be cleaned without manual intervention by GSS (stopTask/clearTask in vds, followed by a restart of the engine).
If the user then starts the VM and tries to LSM the disk again, the operation fails with a "volume already exists" error.

Version-Release number of selected component (if applicable):

sf17.1

How reproducible:

100%

Steps to Reproduce:
1. On iSCSI storage with two hosts, create a VM and start it on the HSM host with Run Once in paused state
2. Try to live migrate the VM
3. Manually clean the task from the SPM and restart the engine
4. Start the VM
5. Try to migrate the disk

Actual results:

The engine fails to clean the task, which leaves LVs behind in the target domain, so the images cannot be moved to the new domain without GSS involvement.
 
Expected results:

Even if the LSM fails, we should still be able to clear the task and roll back so that the user can migrate again.
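
To illustrate the expected behaviour, here is a minimal sketch of a clone step that undoes its own partial work when it fails, so that a retry does not hit VolumeAlreadyExists. It is not the engine or VDSM code: TargetDomain, clone_image_structure and the fail_after parameter are hypothetical names, and the storage domain is modelled as an in-memory set.

```python
class VolumeAlreadyExists(Exception):
    """Stand-in for the storage error seen in the VDSM log (hypothetical class)."""


class TargetDomain:
    """Toy in-memory model of a target storage domain (assumption, not VDSM)."""

    def __init__(self):
        self.volumes = set()

    def create_volume(self, vol_id):
        if vol_id in self.volumes:
            raise VolumeAlreadyExists(vol_id)
        self.volumes.add(vol_id)

    def delete_volume(self, vol_id):
        # discard() does nothing if the volume is absent, so cleanup is idempotent.
        self.volumes.discard(vol_id)


def clone_image_structure(domain, vol_ids, fail_after=None):
    """Create each volume on the target; on any failure remove what was created.

    fail_after simulates a failure once that many volumes have been created.
    """
    created = []
    try:
        for index, vol_id in enumerate(vol_ids):
            if fail_after is not None and index == fail_after:
                raise RuntimeError("simulated migration failure")
            domain.create_volume(vol_id)
            created.append(vol_id)
    except Exception:
        # Roll back only the volumes this attempt created, newest first.
        for vol_id in reversed(created):
            domain.delete_volume(vol_id)
        raise
    return created


if __name__ == "__main__":
    domain = TargetDomain()
    try:
        clone_image_structure(domain, ["vol-a", "vol-b"], fail_after=1)
    except RuntimeError:
        pass
    # No orphan volumes remain, so the second attempt can succeed.
    clone_image_structure(domain, ["vol-a", "vol-b"])
```

In this sketch the rollback lives in the same call that created the volumes, so the target is left clean regardless of whether a later cleanup step runs.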

Additional info: logs


ab5d4053-b881-4503-9ba0-7427b2514801::ERROR::2013-05-23 17:36:09,641::task::850::TaskManager.Task::(_setError) Task=`ab5d4053-b881-4503-9ba0-7427b2514801`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 318, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 68, in wrapper
    return f(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1773, in cloneImageStructure
    image.Image(repoPath).cloneStructure(sdUUID, imgUUID, dstSdUUID)
  File "/usr/share/vdsm/storage/image.py", line 649, in cloneStructure
    self._createTargetImage(sdCache.produce(dstSdUUID), sdUUID, imgUUID)
  File "/usr/share/vdsm/storage/image.py", line 517, in _createTargetImage
    srcVolUUID=volParams['parent'])
  File "/usr/share/vdsm/storage/blockSD.py", line 610, in createVolume
    volUUID, desc, srcImgUUID, srcVolUUID)
  File "/usr/share/vdsm/storage/volume.py", line 418, in create
    raise se.VolumeAlreadyExists(volUUID)
VolumeAlreadyExists: Volume already exists: ('30e4d88e-e807-4fb9-9b41-39c988c338ad',)
ab5d4053-b881-4503-9ba0-7427b2514801::DEBUG::2013-05-23 17:36:09,642::task::869::TaskManager.Task::(_run) Task=`ab5d4053-b881-4503-9ba0-7427b2514801`::Task._run: ab5d4053-b881-4503-9ba0-7427b2514801 () {} failed - stopping task


engine: 

2013-05-23 17:28:32,019 WARN  [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (pool-4-thread-38) Unable to get value of property: vds for class org.ovirt.engine.core.bll.lsm.LiveMigrateDiskCommand
2013-05-23 17:28:32,033 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-4-thread-38) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateDisk threw an exception: javax.ejb.EJBException: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx(CMTTxInterceptor.java:191) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:237) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:374) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:218) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:165) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:182) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:72) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view8.endAction(Unknown Source) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask.EndCommandAction(EntityAsyncTask.java:147) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask.access$000(EntityAsyncTask.java:26) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask$1.run(EntityAsyncTask.java:107) [engine-bll.jar:]
        at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:71) [engine-utils.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_19]
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_19]
        at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_19]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_19]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_19]
        at java.lang.Thread.run(Thread.java:722) [rt.jar:1.7.0_19]
Caused by: java.lang.ArrayIndexOutOfBoundsException: -1

Comment 2 Daniel Erez 2013-07-09 07:40:26 UTC
*** Bug 968894 has been marked as a duplicate of this bug. ***

Comment 3 vvyazmin@redhat.com 2013-07-23 14:40:33 UTC
Can't be verified; this bug depends on 987460. Tested on RHEVM 3.3 - IS6 environment.

Comment 4 vvyazmin@redhat.com 2013-08-19 09:30:26 UTC
Tested on FCP Data Center

Verified, tested on RHEVM 3.3 - IS10 environment:

RHEVM:  rhevm-3.3.0-0.15.master.el6ev.noarch
PythonSDK:  rhevm-sdk-python-3.3.0.10-1.el6ev.noarch
VDSM:  vdsm-4.12.0-61.git8178ec2.el6ev.x86_64
LIBVIRT:  libvirt-0.10.2-18.el6_4.9.x86_64
QEMU & KVM:  qemu-kvm-rhev-0.12.1.2-2.355.el6_4.5.x86_64
SANLOCK:  sanlock-2.8-1.el6.x86_64

Comment 7 Itamar Heim 2014-01-21 22:18:56 UTC
Closing - RHEV 3.3 Released

