Bug 965676 - engine: vm image remains locked with 'ArrayIndexOutOfBoundsException: -1' when vdsm restarts on spm after complete of task "syncData"
engine: vm image remains locked with 'ArrayIndexOutOfBoundsException: -1' whe...
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine (Show other bugs)
3.2.0
x86_64 Unspecified
unspecified Severity high
: ---
: 3.3.0
Assigned To: Daniel Erez
yeylon@redhat.com
storage
: ZStream
: 965681 967600 (view as bug list)
Depends On: 987783
Blocks: 971883
  Show dependency treegraph
 
Reported: 2013-05-21 10:14 EDT by Dafna Ron
Modified: 2016-04-18 02:53 EDT (History)
13 users (show)

See Also:
Fixed In Version: is2
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 971883 (view as bug list)
Environment:
Last Closed: 2014-01-21 17:15:30 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
scohen: Triaged+


Attachments (Terms of Use)
logs (3.78 MB, application/x-gzip)
2013-05-21 10:14 EDT, Dafna Ron
no flags Details
CloneImageGroupStructureVDSCommand logs (1.72 MB, application/x-gzip)
2013-05-23 05:25 EDT, Dafna Ron
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 15133 None None None Never

  None (edit)
Description Dafna Ron 2013-05-21 10:14:06 EDT
Created attachment 751191 [details]
logs

Description of problem:

vm is running on hsm, I restarted the vdsm right when syncData task (part of live storage migration) has completed. 
vm's disk remain locked, engine has endless ArrayIndexOutOfBoundsException: -1 exception + event log in loop: 

User admin@internal finished moving disk NFS-RHEL6_iSCSI_Disk1 to domain Dafna-32-03.

and after I shut down the vm we are getting failed event log errors in a loop: 


Version-Release number of selected component (if applicable):

User admin@internal have failed to move disk NFS-RHEL6_iSCSI_Disk1 to domain Dafna-32-03.

How reproducible:

100%

Steps to Reproduce:
1. in iscsi storage with two hosts, create a vm from template
2. run the vm on hsm
3. start a live storage migration on the vm disk
4. restart vdsm on the spm when syncData task has change state to completed successfully in SPM

Actual results:

vm is stuck in image locked and we are getting: java.lang.ArrayIndexOutOfBoundsException: -1

event log runs events in loops 

Expected results:

we should be able to recover from the vdsm restart

Additional info: logs


after vdsm restart: 

2013-05-21 16:43:19,170 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-4-thread-49) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type LiveMigrateDisk threw an exception: javax.ejb.EJBException: java.lang.ArrayIndexOutOfBoundsException: -1
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx(CMTTxInterceptor.java:191) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:237) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:374) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:218) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3.jar:7.2.0.Final-redhat-3]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee.jar:7.2.0.Final-redhat-3]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee.jar:7.2.0.Final-redhat-3]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:165) [jboss-as-ee.jar:7.2.0.Final-redhat-3]
        at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:182) [jboss-as-ee.jar:7.2.0.Final-redhat-3]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final-redhat-2]
        at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:72) [jboss-as-ee.jar:7.2.0.Final-redhat-3]
        at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view8.endAction(Unknown Source) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask.EndCommandAction(EntityAsyncTask.java:147) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask.access$000(EntityAsyncTask.java:26) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.EntityAsyncTask$1.run(EntityAsyncTask.java:107) [engine-bll.jar:]
        at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:71) [engine-utils.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_19]
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) [rt.jar:1.7.0_19]
        at java.util.concurrent.FutureTask.run(FutureTask.java:166) [rt.jar:1.7.0_19]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_19]
Comment 1 Dafna Ron 2013-05-21 12:14:27 EDT
*** Bug 965681 has been marked as a duplicate of this bug. ***
Comment 4 Dafna Ron 2013-05-23 05:24:37 EDT
same issue happens when I restart vdsm right when FINISH of CloneImageGroupStructureVDSCommand is logged. 

in this case we do not remove the disks from either the target or the des but it means that we cannot move the disks to that domain (volume already exists error) and need GSS intervention. 
we also cannot be 100% sure that the correct disks will be cleaned by GSS (so if by accident the original disks will be remove we will be left with empty lv's since syncData has not run yet). 

attaching the logs from the CloneImageGroupStructureVDSCommand test as well. 
and here is the lvs output in from the hosts: 


vm created: 

[root@cougar01 ~]# lvs
  LV                                   VG                                   Attr      LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  f82a0d58-0791-4137-b1e6-22a8794acd2a 38755249-4bb3-4841-bf5b-05f4a521514d -wi-a----   2.00g                                             
  ids                                  38755249-4bb3-4841-bf5b-05f4a521514d -wi-ao--- 128.00m                                             
  inbox                                38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 128.00m                                             
  leases                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a----   2.00g                                             
  master                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a----   1.00g                                             
  metadata                             38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 512.00m                                             
  outbox                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 128.00m                                             
  4ddf8f4f-5fdc-46e2-9885-2016b67e0192 81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi------   1.00g                                             
  f82a0d58-0791-4137-b1e6-22a8794acd2a 81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi------   2.00g                                             
  ids                                  81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-ao--- 128.00m                                             
  inbox                                81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 128.00m                                             
  leases                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a----   2.00g                                             
  master                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-ao---   1.00g                                             
  metadata                             81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 512.00m                                             
  outbox                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 128.00m                                             
  lv_root                              vg0                                  -wi-ao--- 449.82g                                             
  lv_swap                              vg0                                  -wi-ao---  15.74g     

after CloneImageGroupStructureVDSCommand: 


[root@cougar01 ~]# lvs
  LV                                   VG                                   Attr      LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  0d5899ae-490a-4015-926d-d523138050c6 38755249-4bb3-4841-bf5b-05f4a521514d -wi------   1.00g                                             
  4ddf8f4f-5fdc-46e2-9885-2016b67e0192 38755249-4bb3-4841-bf5b-05f4a521514d -wi------   2.00g                                             
  f82a0d58-0791-4137-b1e6-22a8794acd2a 38755249-4bb3-4841-bf5b-05f4a521514d -wi------   2.00g                                             
  ids                                  38755249-4bb3-4841-bf5b-05f4a521514d -wi-ao--- 128.00m                                             
  inbox                                38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 128.00m                                             
  leases                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a----   2.00g                                             
  master                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a----   1.00g                                             
  metadata                             38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 512.00m                                             
  outbox                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 128.00m                                             
  0d5899ae-490a-4015-926d-d523138050c6 81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi------   1.00g                                             
  4ddf8f4f-5fdc-46e2-9885-2016b67e0192 81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi------   2.00g                                             
  f82a0d58-0791-4137-b1e6-22a8794acd2a 81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi------   2.00g                                             
  ids                                  81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-ao--- 128.00m                                             
  inbox                                81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 128.00m                                             
  leases                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a----   2.00g                                             
  master                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-ao---   1.00g                                             
  metadata                             81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 512.00m                                             
  outbox                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 128.00m                                             
  lv_root                              vg0                                  -wi-ao--- 449.82g                                             
  lv_swap                              vg0                                  -wi-ao---  15.74g       


after failure of move: 

[root@cougar01 ~]# lvs
  LV                                   VG                                   Attr      LSize   Pool Origin Data%  Move Log Cpy%Sync Convert
  0d5899ae-490a-4015-926d-d523138050c6 38755249-4bb3-4841-bf5b-05f4a521514d -wi------   1.00g                                             
  4ddf8f4f-5fdc-46e2-9885-2016b67e0192 38755249-4bb3-4841-bf5b-05f4a521514d -wi------   2.00g                                             
  f82a0d58-0791-4137-b1e6-22a8794acd2a 38755249-4bb3-4841-bf5b-05f4a521514d -wi------   2.00g                                             
  ids                                  38755249-4bb3-4841-bf5b-05f4a521514d -wi-ao--- 128.00m                                             
  inbox                                38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 128.00m                                             
  leases                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a----   2.00g                                             
  master                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a----   1.00g                                             
  metadata                             38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 512.00m                                             
  outbox                               38755249-4bb3-4841-bf5b-05f4a521514d -wi-a---- 128.00m                                             
  0d5899ae-490a-4015-926d-d523138050c6 81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi------   1.00g                                             
  4ddf8f4f-5fdc-46e2-9885-2016b67e0192 81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi------   2.00g                                             
  f82a0d58-0791-4137-b1e6-22a8794acd2a 81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi------   2.00g                                             
  ids                                  81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-ao--- 128.00m                                             
  inbox                                81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 128.00m                                             
  leases                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a----   2.00g                                             
  master                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-ao---   1.00g                                             
  metadata                             81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 512.00m                                             
  outbox                               81ef11d0-4c0c-47b4-8953-d61a6af442d8 -wi-a---- 128.00m                                             
  lv_root                              vg0                                  -wi-ao--- 449.82g                                             
  lv_swap                              vg0                                  -wi-ao---  15.74g          

 

[root@cougar01 ~]# vdsClient -s 0 getStorageDomainInfo 38755249-4bb3-4841-bf5b-05f4a521514d
	uuid = 38755249-4bb3-4841-bf5b-05f4a521514d
	vguuid = 4Ceo5k-vMud-sVKL-blqX-JwKo-ETN6-8ao4L9
	lver = -1
	state = OK
	version = 3
	role = Master
	pool = ['7fd33b43-a9f4-4eb7-a885-e9583a929ceb']
	spm_id = -1
	type = ISCSI
	class = Data
	master_ver = 0
	name = Dafna-32-03

[root@cougar01 ~]# vdsClient -s 0 getStorageDomainInfo 81ef11d0-4c0c-47b4-8953-d61a6af442d8
	uuid = 81ef11d0-4c0c-47b4-8953-d61a6af442d8
	vguuid = Eq3hKP-LB4o-CSHb-U91Z-JKKN-4eQV-Pecf1l
	lver = 43
	state = OK
	version = 3
	role = Master
	pool = ['7fd33b43-a9f4-4eb7-a885-e9583a929ceb']
	spm_id = 1
	type = ISCSI
	class = Data
	master_ver = 3697
	name = Dafna-32-02
Comment 5 Dafna Ron 2013-05-23 05:25:34 EDT
Created attachment 752056 [details]
CloneImageGroupStructureVDSCommand logs
Comment 10 Daniel Erez 2013-07-09 07:29:30 EDT
*** Bug 967600 has been marked as a duplicate of this bug. ***
Comment 11 vvyazmin@redhat.com 2013-07-23 10:50:49 EDT
Can’t be verified, this bug depend on BZ987460, tested on RHEVM 3.3 - IS6 environment.
Comment 12 vvyazmin@redhat.com 2013-08-18 13:04:38 EDT
Tested on FCP Data Center

Verified, tested on RHEVM 3.3 - IS10 environment:

RHEVM:  rhevm-3.3.0-0.15.master.el6ev.noarch
PythonSDK:  rhevm-sdk-python-3.3.0.10-1.el6ev.noarch
VDSM:  vdsm-4.12.0-61.git8178ec2.el6ev.x86_64
LIBVIRT:  libvirt-0.10.2-18.el6_4.9.x86_64
QEMU & KVM:  qemu-kvm-rhev-0.12.1.2-2.355.el6_4.5.x86_64
SANLOCK:  sanlock-2.8-1.el6.x86_64
Comment 15 Itamar Heim 2014-01-21 17:15:30 EST
Closing - RHEV 3.3 Released
Comment 16 Itamar Heim 2014-01-21 17:22:39 EST
Closing - RHEV 3.3 Released

Note You need to log in before you can comment on or make changes to this bug.