Bug 876235

Summary: PRD32 - Do not force fencing proxy to be in UP status
Product: Red Hat Enterprise Virtualization Manager Reporter: Eli Mesika <emesika>
Component: ovirt-engine Assignee: Eli Mesika <emesika>
Status: CLOSED ERRATA QA Contact: Tareq Alayan <talayan>
Severity: medium Docs Contact:
Priority: medium    
Version: 3.1.0 CC: acathrow, bazulay, dyasny, emesika, iheim, lpeer, pablo.iranzo, pstehlik, Rhev-m-bugs, yeylon, ykaul, yzaslavs
Target Milestone: --- Keywords: Improvement
Target Release: 3.2.0 Flags: dyasny: Triaged+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Enhancement
Doc Text:
Previously, a host could only be selected as a proxy for fencing operations if it was in an "Up" state. Now, hosts in an "Up" state are still preferred by the proxy selection algorithm, but hosts in any state other than "Restarting" or "Non-Operational" are also considered as proxy candidates.
Story Points: ---
Clone Of:
: 889096 (view as bug list) Environment:
Last Closed: 2013-06-10 21:20:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 869309, 889096, 915537    

Description Eli Mesika 2012-11-13 15:45:27 UTC
Description of problem:
When a Host that has PM (power management) configured becomes non-responsive, we try to restart it using its PM agent.
To do that, we need another Host in the system that can act as a proxy for the fencing commands sent to the problematic Host.
The current proxy search implementation looks for the first Host in UP status in the problematic Host's DC.
Instead, we should look for the first Host in UP status in the problematic Host's DC and, if none is found, try any other Host, since a Host in another status such as Maintenance can still serve as a proxy.
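
The proposed lookup can be sketched roughly as follows. This is an illustration in Python, not the engine's code: the `Host` class, the status strings, `EXCLUDED_STATUSES`, and `find_proxy` are all hypothetical names, and the excluded statuses are taken from the Doc Text field of this bug (Restarting and Non-Operational).

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical status names for illustration; not the engine's actual enum.
EXCLUDED_STATUSES = {"Restarting", "NonOperational"}


@dataclass
class Host:
    name: str
    data_center: str
    status: str


def find_proxy(fenced: Host, hosts: List[Host]) -> Optional[Host]:
    """Return a proxy for fencing `fenced`, preferring hosts in Up status."""
    candidates = [
        h for h in hosts
        if h.name != fenced.name and h.data_center == fenced.data_center
    ]
    # First pass: the previous behaviour, only hosts that are Up.
    for h in candidates:
        if h.status == "Up":
            return h
    # Second pass: fall back to any other host whose status is not excluded.
    for h in candidates:
        if h.status not in EXCLUDED_STATUSES:
            return h
    return None


# The reproduction scenario from this bug: host2 is in Maintenance.
host1 = Host("host1", "dc1", "Up")
host2 = Host("host2", "dc1", "Maintenance")
proxy = find_proxy(host1, [host1, host2])
print(proxy.name if proxy else "no proxy found")
```

With only the first pass, the scenario above finds no proxy; with the fallback pass, host2 is returned.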


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Have 2 Hosts in a DC: host1 and host2
2. Configure and test PM for host1
3. Activate host1
4. Put host2 in Maintenance
5. Try to restart host1 from the UI
  
Actual results:
You get a failure on canDoAction since there is no proxy Host in UP status

Expected results:
host2 should be used to fence host1 even though it is in Maintenance


Additional info:

Comment 1 Eli Mesika 2012-11-14 20:47:46 UTC
http://gerrit.ovirt.org/#/c/9251/

Comment 2 Eli Mesika 2012-11-16 08:54:33 UTC
fixed in commit: ca10cc1

Comment 3 Tareq Alayan 2013-02-11 14:14:48 UTC
verified.

Comment 4 Cheryn Tan 2013-04-03 06:51:19 UTC
This bug is currently attached to errata RHEA-2013:14491. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.

* Consequence: What happens when the bug presents.

* Fix: What was done to fix the bug.

* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes

Thanks in advance.

Comment 5 errata-xmlrpc 2013-06-10 21:20:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0888.html

Comment 6 Pablo Iranzo Gómez 2014-02-05 11:08:41 UTC
Eli,

I've seen the algorithm use hosts in 'Maintenance'; this could also include hosts that are not reachable (because of other operations).

I'm not sure whether those hosts are selected only as a last resort, but I think they should be lowered in the list of hosts available for fencing other hosts.

In my setup, I have 6 hypervisors in two clusters. When I enabled one of the hypervisors with power management, it never started, because it first tried hosts in the same cluster (which I moved to hosts in DC using the ordered list), and all hosts in the cluster were in maintenance.

Should I raise an RFE for this?

Thanks,
Pablo

Comment 7 Eli Mesika 2014-02-05 11:29:35 UTC
Currently we first try to get a proxy that is in UP status in the same cluster; if this fails, we try to get any other host except those that have network errors; if this also fails, the same is done for the host's DC.

You may open an RFE for that.
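
For readers following the thread, the search order described in this comment can be sketched as below. This is a rough Python illustration under stated assumptions, not the engine's implementation: the `Host` class, the `network_error` flag, the status strings, and `select_proxy` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    cluster: str
    data_center: str
    status: str
    network_error: bool = False  # hypothetical flag, not an engine field


def select_proxy(fenced: Host, hosts: List[Host]) -> Optional[Host]:
    """Search the fenced host's cluster first, then its data center."""
    others = [h for h in hosts if h.name != fenced.name]
    scopes = [
        [h for h in others if h.cluster == fenced.cluster],
        [h for h in others if h.data_center == fenced.data_center],
    ]
    for scope in scopes:
        # Within a scope, prefer a host that is Up.
        for h in scope:
            if h.status == "Up":
                return h
        # Otherwise accept any host in the scope without network errors.
        for h in scope:
            if not h.network_error:
                return h
    return None


fenced = Host("hv1", "cluster-a", "dc1", "NonResponsive")
hosts = [
    fenced,
    Host("hv2", "cluster-a", "dc1", "Maintenance"),
    Host("hv3", "cluster-b", "dc1", "Up"),
]
print(select_proxy(fenced, hosts).name)
```

Under this reading of the order, a Maintenance host in the same cluster is picked before an Up host elsewhere in the DC, which appears to match the situation reported in comment 6.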

Comment 8 Pablo Iranzo Gómez 2014-02-05 13:50:51 UTC
Eli, Created as 1061722  
Thanks!
Pablo