Bug 876235

Summary:	PRD32 - Do not force fencing proxy to be in UP status
Product:	Red Hat Enterprise Virtualization Manager	Reporter:	Eli Mesika <emesika>
Component:	ovirt-engine	Assignee:	Eli Mesika <emesika>
Status:	CLOSED ERRATA	QA Contact:	Tareq Alayan <talayan>
Severity:	medium	Docs Contact:
Priority:	medium
Version:	3.1.0	CC:	acathrow, bazulay, dyasny, emesika, iheim, lpeer, pablo.iranzo, pstehlik, Rhev-m-bugs, yeylon, ykaul, yzaslavs
Target Milestone:	---	Keywords:	Improvement
Target Release:	3.2.0	Flags:	dyasny: Triaged+
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:	infra
Fixed In Version:		Doc Type:	Enhancement
Doc Text:	Previously, a host could only be selected as a proxy for fencing operations if it was in an "Up" state. Now, hosts in an "Up" state are still preferred by the proxy selection algorithm, but hosts in any state other than "Restarting" or "Non-Operational" are also considered as proxy candidates.	Story Points:	---
Clone Of:
Clones:	889096		Environment:
Last Closed:	2013-06-10 21:20:13 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	Infra	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	869309, 889096, 915537

Description Eli Mesika 2012-11-13 15:45:27 UTC

Description of problem:
When a Host that has PM configured becomes non-responsive, we try to restart it using its PM agent.
To do that, we need another Host in the system that can act as a proxy for the fencing commands sent to the problematic Host.
The current proxy search implementation looks for the first Host in UP status in the problematic Host's DC.
Instead, we should look for the first Host in UP status in the problematic Host's DC and, if none is found, try any other Host, since a Host in another status, such as Maintenance, can still serve as a proxy.
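
The requested behavior can be sketched as a two-pass search. This is only an illustration in Python, not the ovirt-engine code: the Host class, the find_fence_proxy function, and the status strings are hypothetical names, and the set of statuses excluded from the fallback pass is taken from the Doc Text field of this bug.

```python
from dataclasses import dataclass
from typing import List, Optional

# Status names are illustrative strings, not the engine's actual enum values.
UP = "Up"
# Per the Doc Text, hosts in these statuses are never considered as proxies.
EXCLUDED_FALLBACK_STATUSES = {"Restarting", "Non-Operational"}


@dataclass(frozen=True)
class Host:
    name: str
    data_center: str
    status: str


def find_fence_proxy(fenced: Host, hosts: List[Host]) -> Optional[Host]:
    """Return a proxy for fencing `fenced`, or None if no candidate exists."""
    # Candidates: every other host in the same data center.
    candidates = [
        h for h in hosts
        if h.name != fenced.name and h.data_center == fenced.data_center
    ]
    # First pass: prefer the first host that is Up.
    for h in candidates:
        if h.status == UP:
            return h
    # Second pass: fall back to the first host in any other acceptable status.
    for h in candidates:
        if h.status not in EXCLUDED_FALLBACK_STATUSES:
            return h
    return None


# The scenario from "Steps to Reproduce": host2 is in Maintenance.
host1 = Host("host1", "dc1", "Up")
host2 = Host("host2", "dc1", "Maintenance")
proxy = find_fence_proxy(host1, [host1, host2])
print(proxy.name if proxy else None)  # host2
```

With the old behavior (first pass only), the same call would find no proxy, which matches the canDoAction failure reported under "Actual results".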


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Have 2 Hosts in a DC: host1 and host2
2. Configure and test PM for host1
3. Activate host1
4. Put host2 in Maintenance
5. Try to restart host1 from the UI
  
Actual results:
You get a failure on canDoAction since there is no proxy Host in UP status

Expected results:
host2 should be used to fence host1 even though it is in Maintenance


Additional info:

Comment 1 Eli Mesika 2012-11-14 20:47:46 UTC

http://gerrit.ovirt.org/#/c/9251/

Comment 2 Eli Mesika 2012-11-16 08:54:33 UTC

fixed in commit: ca10cc1

Comment 3 Tareq Alayan 2013-02-11 14:14:48 UTC

verified.

Comment 4 Cheryn Tan 2013-04-03 06:51:19 UTC

This bug is currently attached to errata RHEA-2013:14491. If this change is not to be documented in the text for this errata please either remove it from the errata, set the requires_doc_text flag to minus (-), or leave a "Doc Text" value of "--no tech note required" if you do not have permission to alter the flag.

Otherwise to aid in the development of relevant and accurate release documentation, please fill out the "Doc Text" field above with these four (4) pieces of information:

* Cause: What actions or circumstances cause this bug to present.

* Consequence: What happens when the bug presents.

* Fix: What was done to fix the bug.

* Result: What now happens when the actions or circumstances above occur. (NB: this is not the same as 'the bug doesn't present anymore')

Once filled out, please set the "Doc Type" field to the appropriate value for the type of change made and submit your edits to the bug.

For further details on the Cause, Consequence, Fix, Result format please refer to:

https://bugzilla.redhat.com/page.cgi?id=fields.html#cf_release_notes

Thanks in advance.

Comment 5 errata-xmlrpc 2013-06-10 21:20:13 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0888.html

Comment 6 Pablo Iranzo Gómez 2014-02-05 11:08:41 UTC

Eli,

I've seen that the algorithm uses hosts in 'Maintenance'; this could also include hosts that are not reachable (because of other operations).

I'm not sure whether those hosts are chosen only as a last resort, but I think they should be ranked lower in the list of hosts available for fencing other hosts.

In my setup, I have 6 hypervisors in two clusters. When I enabled one of the hypervisors with power management, it never started, because the engine first tried hosts in the same cluster (which I moved to hosts in DC using the ordered list), and all hosts in the cluster were in maintenance.

Should I raise an RFE for this?

Thanks,
Pablo

Comment 7 Eli Mesika 2014-02-05 11:29:35 UTC

Currently we first try to get a proxy that is in UP status in the same cluster; if this fails, we try to get any other host except those that have network errors; if this also fails, the same is done for the host's DC.
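
As a rough illustration of the order described above (not the actual engine code), the search could be sketched in Python as follows. The Host class, the has_network_error flag, and the function names are hypothetical, and the sketch assumes the "any other host" fallback applies within the same scope (cluster, then DC) as the UP search.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Host:
    name: str
    cluster: str
    data_center: str
    status: str                      # illustrative string, e.g. "Up", "Maintenance"
    has_network_error: bool = False  # hypothetical flag for "network errors"


def _pick(candidates: List[Host]) -> Optional[Host]:
    """Prefer an Up host; otherwise take any host without network errors."""
    for h in candidates:
        if h.status == "Up":
            return h
    for h in candidates:
        if not h.has_network_error:
            return h
    return None


def find_fence_proxy(fenced: Host, hosts: List[Host]) -> Optional[Host]:
    """Search the fenced host's cluster first, then its whole data center."""
    others = [h for h in hosts if h.name != fenced.name]
    same_cluster = [h for h in others if h.cluster == fenced.cluster]
    same_dc = [h for h in others if h.data_center == fenced.data_center]
    proxy = _pick(same_cluster)
    if proxy is None:
        proxy = _pick(same_dc)
    return proxy
```

Under these assumptions, a host in Maintenance in the same cluster is selected before an Up host in another cluster of the same DC, which is the ordering questioned in Comment 6.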

You may open an RFE for that

Comment 8 Pablo Iranzo Gómez 2014-02-05 13:50:51 UTC

Eli, Created as 1061722  
Thanks!
Pablo