Bug 972695

Summary: engine: vm is stuck in CreateSnapshot step during SLM because engine fails to clean the createVolume task (ArrayIndexOutOfBoundsException: -1 error)
Product: Red Hat Enterprise Virtualization Manager Reporter: Idith Tal-Kohen <italkohe>
Component: ovirt-engineAssignee: Daniel Erez <derez>
Status: CLOSED ERRATA QA Contact: Aharon Canan <acanan>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.2.0CC: abaron, acanan, acathrow, amureini, cpelland, derez, dron, hateya, iheim, jbiddle, jkt, lpeer, Rhev-m-bugs, scohen, yeylon
Target Milestone: ---Keywords: ZStream
Target Release: 3.2.1   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Previously, a virtual machine would become stuck on the CreateSnapshot step during live storage migration because the engine failed to clean the createVolume task. Now, the engine cleans the createVolume task appropriately, and virtual machines no longer become stuck on the CreateSnapshot step.
Story Points: ---
Clone Of: 966153 Environment:
Last Closed: 2013-07-16 13:40:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 966153    
Bug Blocks:    

Description Idith Tal-Kohen 2013-06-10 12:23:28 UTC
+++ This bug was initially created as a clone of Bug #966153 +++

Description of problem:

I restarted vdsm during End Command of CreateSnapshotCommand. 
engine gets a ArrayIndexOutOfBoundsException: -1 error. 
it does not report success or failure for the live migration but event log reports that the snapshot was created successfully and we free the vm 

Snapshot 'Auto-generated for Live Storage Migration' creation for VM 'vcdvf' has been completed.

I tried to live storage migrate the disk again to see if it's just an exception in the log but what happens is that we send the createVolume task and although its finished we do not clean it so the vm is stuck in that step forever. 


this bug might be related to 965676 but is in a different step and has different results. 

Version-Release number of selected component (if applicable):

sf17.1

How reproducible:

100%

Steps to Reproduce:
1. create a vm from snapshot and run it on spm. 
2. LSM the vm disk and when End Action of CreateSnapshotCommand is run restart the vdsm on spm
3. when vm lock is released LSM the disk again

Actual results:

the firs time we get ArrayIndexOutOfBoundsException: -1 but vm is freed. 
the second time engine fails to clean the createVolume task which is finished successfully and vm is locked in that step forever. 

Expected results:

we should not hey ArrayIndexOutOfBoundsException: -1  error. 

Additional info: logs

both tasks are shown in vdsm: 

ab69c85f-c506-4756-b269-5fb7e824ff7c :
        verb = createVolume
        id = ab69c85f-c506-4756-b269-5fb7e824ff7c
31689665-a832-47a7-a7e8-e62c516867c7 :
        verb = createVolume
        id = 31689665-a832-47a7-a7e8-e62c516867c7

{'status': {'message': 'OK', 'code': 0}, 'allTasksStatus': {'ab69c85f-c506-4756-b269-5fb7e824ff7c': {'message': '1 jobs completed successfully', 'code': 0, 'taskID': 'ab69c85f-c506-4756-b269-5fb7e824ff7c', 'taskResult': 'success', 'task
State': 'finished'}, '31689665-a832-47a7-a7e8-e62c516867c7': {'message': '1 jobs completed successfully', 'code': 0, 'taskID': '31689665-a832-47a7-a7e8-e62c516867c7', 'taskResult': 'success', 'taskState': 'finished'}}}




2013-05-22 18:12:35,051 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateVmDisksCommand] (pool-4-thread-49) Ending command with failure: org.ovirt.engine.core.bll.lsm.LiveMigrateVmDisksCommand
2013-05-22 18:12:35,055 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-4-thread-49) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateVmDisks threw an exception: javax.ejb.EJBException: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx(CMTTxInterceptor.java:191) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:237) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:374) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:218) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:165) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:182) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:72) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view8.endAction(Unknown Source) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask.EndCommandAction(EntityAsyncTask.java:147) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask.access$000(EntityAsyncTask.java:26) [engine-bll.jar:]

Comment 3 Aharon Canan 2013-07-01 14:28:50 UTC
verified using sf18.2

no "ArrayIndexOutOfBoundsException: -1" in logs
all tasks cleaned

be aware that LSM after this secnario is still not working as "Volume already exists" in destination, 
the only way to workaround it is to manually remove relevant lv.

Comment 5 errata-xmlrpc 2013-07-16 13:40:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1048.html