Bug 968894

Summary: engine: when dst storage becomes inaccessible during cloneImageStructure task of LSM on the host the vm is running on only, engine fails to stop/clear task with 'ArrayIndexOutOfBoundsException: -1'
Product: Red Hat Enterprise Virtualization Manager
Reporter: Dafna Ron <dron>
Component: ovirt-engine
Assignee: Ayal Baron <abaron>
Status: CLOSED DUPLICATE
QA Contact: Haim <hateya>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.2.0
CC: acathrow, derez, iheim, jkt, lpeer, Rhev-m-bugs, scohen, yeylon
Target Milestone: ---
Keywords: Triaged
Target Release: 3.3.0
Hardware: x86_64
OS: Linux
Whiteboard: storage
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-09 07:40:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs none

Description Dafna Ron 2013-05-30 08:41:23 UTC
Created attachment 754721
logs

Description of problem:

if the dst domain becomes inaccessible on the host that the vm is running on (the hsm host) during the cloneImageStructure step of LSM, the engine rolls back but fails to stop/clear the task on the spm, with 'ArrayIndexOutOfBoundsException: -1'

the volumes on dst are not cleared either, so if we try to migrate the vm again we get an 'image already exists' error.
the whole scenario requires gss involvement.

Version-Release number of selected component (if applicable):

sf17.2

How reproducible:

100%

Steps to Reproduce:
1. in iscsi storage with 2 hosts, create two domains located on different servers
2. create a vm from template and run it on hsm host
3. start LSM for the vm disk and, once cloneImageStructure has started, block connectivity to the dst storage domain on the hsm host only.
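
One possible way to do the blocking in step 3 is a firewall rule on the hsm host. The sketch below is an assumption, not the method used for this report: it presumes iptables is available on the host and that the dst domain is reached over the default iSCSI port (3260). The function name and the address 192.0.2.10 are placeholders for illustration. The script only builds and prints the commands; they would need to be run as root on the hsm host only, so that the spm keeps its connection to the dst domain.

```python
import ipaddress

ISCSI_PORT = 3260  # default iSCSI target port (assumed to be in use here)


def iptables_rule(action, storage_ip, port=ISCSI_PORT):
    """Build an iptables argument list that drops (or stops dropping)
    outgoing TCP traffic from this host to the dst storage server.

    Hypothetical helper: nothing in the report says iptables was used.
    """
    if action not in ("block", "unblock"):
        raise ValueError("action must be 'block' or 'unblock'")
    # Raises ValueError if storage_ip is not a valid IP address.
    ipaddress.ip_address(storage_ip)
    flag = "-A" if action == "block" else "-D"  # append or delete the rule
    return [
        "iptables", flag, "OUTPUT",
        "-d", storage_ip,
        "-p", "tcp", "--dport", str(port),
        "-j", "DROP",
    ]


if __name__ == "__main__":
    dst_server = "192.0.2.10"  # placeholder address of the dst storage server
    # Print the commands instead of running them.
    print(" ".join(iptables_rule("block", dst_server)))
    print(" ".join(iptables_rule("unblock", dst_server)))
```

Printing rather than executing keeps the sketch harmless. The "unblock" variant deletes the same rule, which restores connectivity after the test.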

Actual results:

1. engine rolls back but fails to stop/clear the task
2. debugging will be difficult since we keep getting 'ArrayIndexOutOfBoundsException: -1' in the log
3. we report success of LSM to the user although the move did not end successfully and the disks were not synced.

Expected results:

1. engine should be able to clear the task
2. we should not repeat the failure over and over
3. we should not report success to the user. 

Additional info:

to clear the errors, the task and the leftover lvs, the user needs to contact gss.
debugging might be difficult because of the log exceptions.

Comment 1 Daniel Erez 2013-07-09 07:40:26 UTC

*** This bug has been marked as a duplicate of bug 966618 ***