Bug 967490

Summary: engine: AutoRecovery starts a host and host gets stuck in unassigned because engine fails to clean cloneImageStructure with 'java.lang.ArrayIndexOutOfBoundsException: -1'
Product: Red Hat Enterprise Virtualization Manager Reporter: Dafna Ron <dron>
Component: ovirt-engine Assignee: Ayal Baron <abaron>
Status: CLOSED DUPLICATE QA Contact: Haim <hateya>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.2.0 CC: acathrow, iheim, jkt, lpeer, Rhev-m-bugs, scohen, tnisan, yeylon
Target Milestone: --- Keywords: Triaged
Target Release: 3.3.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-07-09 11:52:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
logs none

Description Dafna Ron 2013-05-27 09:42:38 UTC
Created attachment 753532
logs

Description of problem:

I ran a VM on the HSM host and blocked connectivity from that host to the storage domain just as the createVolume task finished.

When the engine tries to clean up the cloneImageStructure task, it fails with 'java.lang.ArrayIndexOutOfBoundsException: -1'.

After the host became non-operational, I restored its connectivity to the storage, and a few minutes later AutoRecovery tried to activate the host.

The host stayed stuck in unassigned until I restarted the ovirt-engine service; even a host reboot did not release it.

Version-Release number of selected component (if applicable):

sf17.1
vdsm-4.10.2-21.0.el6ev.x86_64

How reproducible:

100%

Steps to Reproduce:
1. On iSCSI storage with multiple domains and an export domain, create a VM from a template and run it on the HSM host.
2. When "FINISH, CreateSnapshotVDSCommand" is logged in the engine log, block connectivity from the HSM host to the storage domain using iptables.
3. When the host becomes non-operational, restore the connectivity to the storage.
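
The block and restore in steps 2 and 3 could be scripted roughly as below. This is only a sketch: the report does not give the exact iptables rule, so the rule itself, the helper names (storage_rule, run_or_print, block_storage, restore_storage), the RUN switch, and the placeholder address 192.0.2.10 are all assumptions. Port 3260 is the default iSCSI target port. By default the helpers only print the command; set RUN=1 (as root on the HSM host) to execute it.

```shell
#!/bin/sh
# Assumption: placeholder storage server address; replace with the real one.
STORAGE_IP="${STORAGE_IP:-192.0.2.10}"
# Default iSCSI target port.
ISCSI_PORT=3260

# Build the iptables command line.
# $1 is the action: -A appends the DROP rule, -D deletes it again.
storage_rule() {
    echo "iptables $1 OUTPUT -d $STORAGE_IP -p tcp --dport $ISCSI_PORT -j DROP"
}

# Print the command, or execute it when RUN=1 (needs root).
run_or_print() {
    if [ "${RUN:-0}" = "1" ]; then
        sh -c "$1"
    else
        echo "$1"
    fi
}

# Step 2: drop outgoing traffic from this host to the storage server.
block_storage() { run_or_print "$(storage_rule -A)"; }

# Step 3: remove the rule so the host can reach the storage again.
restore_storage() { run_or_print "$(storage_rule -D)"; }
```

Run block_storage on the HSM host when the engine log shows the FINISH line, then restore_storage once the host is reported as non-operational.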

Actual results:

The engine fails to clear the cloneImageStructure task with 'java.lang.ArrayIndexOutOfBoundsException: -1'.
When connectivity from the HSM host to the storage is restored and AutoRecovery tries to activate the host, the host stays stuck in unassigned until the engine is restarted and the task is cleared.


Expected results:

The host should not get stuck in unassigned because of an ArrayIndexOutOfBoundsException.

Additional info: logs

Comment 1 Tal Nisan 2013-07-09 11:52:14 UTC

*** This bug has been marked as a duplicate of bug 966153 ***