Bug 834289

Summary: Can't activate a host while another host is contending
Product: Red Hat Enterprise Virtualization Manager Reporter: Simon Grinberg <sgrinber>
Component: ovirt-engine Assignee: mkublin <mkublin>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Haim <hateya>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.0.3 CC: acathrow, amureini, bazulay, dyasny, iheim, lpeer, Rhev-m-bugs, yeylon, ykaul, yzaslavs
Target Milestone: ---   
Target Release: 3.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-07-15 11:43:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Simon Grinberg 2012-06-21 12:39:53 UTC
Description of problem:
When a host is contending for the SPM role, any attempt to activate another host fails, and that host goes to non operational.

Version-Release number of selected component (if applicable):


How reproducible:
Always 

Steps to Reproduce:
1. Place all the hosts into maintenance.
2. Activate a host (let's call it host A).
3. Wait until its status changes to contending.
4. Activate another host (let's call it host B).
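
The steps above can be written down as an ordered plan, which may help when trying to script a reproducer. This is only a sketch: the report does not say how the hosts were activated, so the REST paths, the host identifiers, and the function name reproduction_plan are assumptions for illustration. How the "contending" status is observed is left abstract as a WAIT step.

```python
from typing import List, Sequence, Tuple

# Assumed REST action paths (not taken from the report); adjust to the
# actual engine API or replace with the equivalent UI actions.
DEACTIVATE_PATH = "/api/hosts/{host_id}/deactivate"
ACTIVATE_PATH = "/api/hosts/{host_id}/activate"

Step = Tuple[str, str]  # (kind of step, path or human-readable condition)


def reproduction_plan(all_hosts: Sequence[str], host_a: str, host_b: str) -> List[Step]:
    """Return the reproduction steps from the report as an ordered list.

    Nothing is sent to a server; the plan is only data that a test
    harness (or a person) can walk through.
    """
    if host_a == host_b:
        raise ValueError("host A and host B must be different hosts")
    if host_a not in all_hosts or host_b not in all_hosts:
        raise ValueError("host A and host B must be part of all_hosts")

    # Step 1: place all the hosts into maintenance.
    plan: List[Step] = [("POST", DEACTIVATE_PATH.format(host_id=h)) for h in all_hosts]
    # Step 2: activate host A.
    plan.append(("POST", ACTIVATE_PATH.format(host_id=host_a)))
    # Step 3: wait until host A is contending for SPM.
    plan.append(("WAIT", f"host {host_a} status is contending"))
    # Step 4: activate host B while host A is still contending.
    plan.append(("POST", ACTIVATE_PATH.format(host_id=host_b)))
    return plan


if __name__ == "__main__":
    # Hypothetical host identifiers, for illustration only.
    for kind, detail in reproduction_plan(["host-a", "host-b", "host-c"], "host-a", "host-b"):
        print(kind, detail)
```

The timing of step 4 is the important part: host B has to be activated while host A is still contending, so a scripted reproducer would need to issue the last request without waiting for host A to become SPM.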
  
Actual results:
Host B becomes non operational with a message saying that the connection to storage failed

Expected results:
Host activation is successful 

Additional info:
This is critical in a power failure recovery scenario. Assume that power fails and then returns, so the whole system starts at once.
If all the hosts wake up before RHEV Manager, all is fine. If not, the first host that RHEV Manager detects starts contending, and all the other hosts that wake up during this time are moved to non operational, leaving the system with only part of the hosts active after the power failure.

This is what happened to me on my setup.

Comment 1 Simon Grinberg 2012-06-21 12:40:52 UTC
Version latest from RHN (3.0.4/3.0.3 for engine)

Comment 5 mkublin 2012-07-15 11:43:57 UTC
I did not succeed in reproducing this. I don't know what the cause could be, and I cannot solve it. Closing as insufficient data.

Comment 6 Haim 2012-07-15 11:54:51 UTC
(In reply to comment #5)
> I did not succeed in reproducing this. I don't know what the cause could be,
> and I cannot solve it. Closing as insufficient data.

kublin, did you contact Simon? Simon, if you have a reproducer, please show it to kublin and move the bug to assigned.