Bug 972695 - engine: vm is stuck in CreateSnapshot step during SLM because engine fails to clean the createVolume task (ArrayIndexOutOfBoundsException: -1 error)
Summary: engine: vm is stuck in CreateSnapshot step during SLM because engine fails to...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
: 3.2.1
Assignee: Daniel Erez
QA Contact: Aharon Canan
URL:
Whiteboard: storage
Depends On: 966153
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-06-10 12:23 UTC by Idith Tal-Kohen
Modified: 2016-02-10 20:33 UTC (History)
15 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, a virtual machine would become stuck on the CreateSnapshot step during live storage migration because the engine failed to clean the createVolume task. Now, the engine cleans the createVolume task appropriately, and virtual machines no longer become stuck on the CreateSnapshot step.
Clone Of: 966153
Environment:
Last Closed: 2013-07-16 13:40:59 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2013:1048 0 normal SHIPPED_LIVE rhevm 3.2.1 bug fix update 2013-07-19 05:45:47 UTC
oVirt gerrit 15133 0 None None None Never

Description Idith Tal-Kohen 2013-06-10 12:23:28 UTC
+++ This bug was initially created as a clone of Bug #966153 +++

Description of problem:

I restarted vdsm during End Command of CreateSnapshotCommand. 
engine gets a ArrayIndexOutOfBoundsException: -1 error. 
it does not report success or failure for the live migration but event log reports that the snapshot was created successfully and we free the vm 

Snapshot 'Auto-generated for Live Storage Migration' creation for VM 'vcdvf' has been completed.

I tried to live storage migrate the disk again to see if it's just an exception in the log but what happens is that we send the createVolume task and although its finished we do not clean it so the vm is stuck in that step forever. 


this bug might be related to 965676 but is in a different step and has different results. 

Version-Release number of selected component (if applicable):

sf17.1

How reproducible:

100%

Steps to Reproduce:
1. create a vm from snapshot and run it on spm. 
2. LSM the vm disk and when End Action of CreateSnapshotCommand is run restart the vdsm on spm
3. when vm lock is released LSM the disk again

Actual results:

the firs time we get ArrayIndexOutOfBoundsException: -1 but vm is freed. 
the second time engine fails to clean the createVolume task which is finished successfully and vm is locked in that step forever. 

Expected results:

we should not hey ArrayIndexOutOfBoundsException: -1  error. 

Additional info: logs

both tasks are shown in vdsm: 

ab69c85f-c506-4756-b269-5fb7e824ff7c :
        verb = createVolume
        id = ab69c85f-c506-4756-b269-5fb7e824ff7c
31689665-a832-47a7-a7e8-e62c516867c7 :
        verb = createVolume
        id = 31689665-a832-47a7-a7e8-e62c516867c7

{'status': {'message': 'OK', 'code': 0}, 'allTasksStatus': {'ab69c85f-c506-4756-b269-5fb7e824ff7c': {'message': '1 jobs completed successfully', 'code': 0, 'taskID': 'ab69c85f-c506-4756-b269-5fb7e824ff7c', 'taskResult': 'success', 'task
State': 'finished'}, '31689665-a832-47a7-a7e8-e62c516867c7': {'message': '1 jobs completed successfully', 'code': 0, 'taskID': '31689665-a832-47a7-a7e8-e62c516867c7', 'taskResult': 'success', 'taskState': 'finished'}}}




2013-05-22 18:12:35,051 ERROR [org.ovirt.engine.core.bll.lsm.LiveMigrateVmDisksCommand] (pool-4-thread-49) Ending command with failure: org.ovirt.engine.core.bll.lsm.LiveMigrateVmDisksCommand
2013-05-22 18:12:35,055 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-4-thread-49) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateVmDisks threw an exception: javax.ejb.EJBException: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx(CMTTxInterceptor.java:191) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:237) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:374) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:218) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:165) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:182) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:72) [jboss-as-ee.jar:7.2.0.Final-redhat-8]
        at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view8.endAction(Unknown Source) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask.EndCommandAction(EntityAsyncTask.java:147) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask.access$000(EntityAsyncTask.java:26) [engine-bll.jar:]

Comment 3 Aharon Canan 2013-07-01 14:28:50 UTC
verified using sf18.2

no "ArrayIndexOutOfBoundsException: -1" in logs
all tasks cleaned

be aware that LSM after this secnario is still not working as "Volume already exists" in destination, 
the only way to workaround it is to manually remove relevant lv.

Comment 5 errata-xmlrpc 2013-07-16 13:40:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1048.html


Note You need to log in before you can comment on or make changes to this bug.