Bug 1061722 - [RFE] Deprioritize hosts in non 'up' status for fencing
Summary: [RFE] Deprioritize hosts in non 'up' status for fencing
Keywords:
Status: CLOSED DUPLICATE of bug 961753
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.5.0
Assignee: Eli Mesika
QA Contact:
URL:
Whiteboard: infra
Depends On:
Blocks:
 
Reported: 2014-02-05 13:50 UTC by Pablo Iranzo Gómez
Modified: 2016-02-10 19:36 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-04-13 11:58:18 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:



Description Pablo Iranzo Gómez 2014-02-05 13:50:05 UTC
Description of problem:

As per BZ 876235, hosts are not required to be in UP status to be used as a fencing proxy.

Sometimes a host may be under HW maintenance, so being in Maintenance status can mean that the host cannot be used (as it is not reachable).

In my setup I have 6 hypervisors in two clusters. When activating one of the hypervisors that has power management, it never started, because fencing first tried hosts in the same cluster (which I then changed to hosts in the DC using the ordered source list), and all hosts in that cluster were in maintenance.

Version-Release number of selected component (if applicable):


How reproducible:

Steps to Reproduce:
1. Have two clusters
2. Put all hosts in one cluster in maintenance
3. Leave 1 host up and the remaining ones in maintenance
4. Power off all hosts except the one in 'up'
5. Try to activate the hosts, forcing power management to step in and power them on

Actual results:
Fencing tries to use hosts in maintenance as proxies to power up the other hosts, but since the hosts in maintenance are powered off, it fails several times.

Expected results:
Use hosts in UP status first, according to the configured preferences (dc/cluster), and only fall back to hosts in Maintenance if none are available.

Additional info:
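To illustrate the expected ordering, here is a rough Python sketch; the function, attribute, and status names are illustrative only and do not come from the actual ovirt-engine code:

    # Hypothetical sketch of the requested proxy selection order.
    # "hosts" is a list of objects with .status and .cluster attributes;
    # none of these names exist in the real ovirt-engine code base.

    def pick_fence_proxy(hosts, target_cluster):
        def candidates(status, same_cluster):
            return [h for h in hosts
                    if h.status == status
                    and (h.cluster == target_cluster) == same_cluster]

        # 1. UP hosts in the same cluster, 2. UP hosts elsewhere in the DC,
        # 3. only then fall back to hosts in Maintenance.
        for group in (candidates("up", True),
                      candidates("up", False),
                      candidates("maintenance", True),
                      candidates("maintenance", False)):
            if group:
                return group[0]
        return None

With such an ordering, the one UP host in the scenario above would always be picked first, and the powered-off maintenance hosts would only be tried as a last resort.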

Comment 2 Eli Mesika 2014-02-05 14:40:43 UTC
(In reply to Pablo Iranzo Gómez from comment #0)
The scenario is not clear, see below.
> How reproducible:
> 
> Steps to Reproduce:
Let's name them C1 and C2.
> 1. Have two clusters
> 2. Put all hosts in one cluster in maintenance
OK, that's for C1.
> 3. Leave 1 host up and the remaining ones in maintenance
Please specify in which cluster.
> 4. Power off all hosts except the one in 'up'
Please specify in which cluster.

> 5. Try to activate the hosts, forcing power management to step in and power
> them on
Please specify in which cluster.

> 
> Actual results:
> Fencing tries to use hosts in maintenance as proxies to power up the other
> hosts, but since the hosts in maintenance are powered off, it fails several times.
> 
> Expected results:
> Use hosts in UP status first, according to the configured preferences
> (dc/cluster), and only fall back to hosts in Maintenance if none are available.

The current mechanism searches each cluster/DC for hosts in UP status first, and only if this fails does it try hosts in other statuses, so this is how it behaves now.
From our email session I understood that you want hosts in Maintenance to have low priority.


> 
> Additional info:

Comment 3 Pablo Iranzo Gómez 2014-02-05 15:06:51 UTC
For step 3: C2
For step 4: C1 and C2 (except the host in status 'up'). By "power off" I mean shutting down the machine from the OS (init 0), not from RHEV (which changes the status to "Down").

Actual status at step 4 should be: all hosts powered off except the one host in C2; the status of all powered-off hosts is Maintenance.

For step 5: any host in C1


> The current mechanism searches each cluster/DC for hosts in UP status first, and
> only if this fails does it try hosts in other statuses, so this is how it behaves now.
> From our email session I understood that you want hosts in Maintenance to have
> low priority.

Yes. In this case, with all hosts but host '1' in maintenance, trying to activate any other host in C1 or C2 will automatically select host '1' to perform the power management check operation.

Did I make it clearer?

Thanks!
Pablo

Comment 4 Eli Mesika 2014-02-05 15:43:03 UTC
(In reply to Pablo Iranzo Gómez from comment #3)
> For step 3: C2
> For step 4: C1 and C2 (except the host in status 'up'). By "power off" I mean
> shutting down the machine from the OS (init 0), not from RHEV (which changes
> the status to "Down").
> 
> Actual status at step 4 should be: all hosts powered off except the one host
> in C2; the status of all powered-off hosts is Maintenance.
> 
> For step 5: any host in C1
> 
> 
> > The current mechanism searches each cluster/DC for hosts in UP status first, and
> > only if this fails does it try hosts in other statuses, so this is how it behaves now.
> > From our email session I understood that you want hosts in Maintenance to have low priority.
> 
> Yes. In this case, with all hosts but host '1' in maintenance, trying to
> activate any other host in C1 or C2 will automatically select host '1' to
> perform the power management check operation.
> 
> Did I make it clearer?

If this is the case, then it is working just as designed.
When you fence a host in C1 it will try to get a proxy from C1; since there is none, it will go to the DC level and find your UP host in C2.

If you want this to look first for an UP host in the DC, you can edit the Source field in the host's Power Management tab to be dc and then cluster. This will look up all hosts in UP status in the DC first, then all hosts in other statuses in the DC, and then do the same at the cluster level.

We are going with the default of cluster, then dc, since there is a better chance you have connectivity within the cluster.

So, as far as I understand the case, this is not a bug, unless you want the mechanism to skip hosts in maintenance or to make this behavior configurable via a flag.
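To make the described lookup order concrete, a small Python sketch of how the Source field value maps to the search order (all names here are illustrative, not the engine's actual code):

    # The source order ("cluster,dc" by default) decides which scope is
    # searched first; within each scope, UP hosts are tried before hosts
    # in other statuses.

    def proxy_search_order(source_order=("cluster", "dc")):
        order = []
        for scope in source_order:
            for status in ("up", "other"):
                order.append((scope, status))
        return order

    # Default ("cluster", "dc") yields:
    #   [('cluster', 'up'), ('cluster', 'other'), ('dc', 'up'), ('dc', 'other')]
    # With ("dc", "cluster") the DC is searched first, as described above.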




> 
> Thanks!
> Pablo

Comment 5 Pablo Iranzo Gómez 2014-02-05 16:42:12 UTC
Yes, that's the idea: having a way to skip hosts in maintenance, and if none are available in the cluster, go to the DC (and if still none, fall back to the hosts in maintenance). That is why I asked about lowering the priority of such hosts as fence proxy candidates.

Thanks,
Pablo

Comment 6 Eli Mesika 2014-04-13 11:58:18 UTC
This will be resolved as part of bug 961753.
A host in maintenance is a legal candidate as a proxy.
When we fail to use a proxy, we will try another proxy.

*** This bug has been marked as a duplicate of bug 961753 ***

