Bug 1073896

Summary: [RHEVM] error displayed when fencing test fails is not meaningful
Product: Red Hat Enterprise Virtualization Manager
Reporter: Evgheni Dereveanchin <ederevea>
Component: ovirt-engine
Assignee: Eli Mesika <emesika>
Status: CLOSED CURRENTRELEASE
QA Contact: Tareq Alayan <talayan>
Severity: unspecified
Docs Contact:
Priority: medium
Version: 3.3.0
CC: acathrow, ecohen, emesika, gklein, iheim, lpeer, oourfali, pstehlik, Rhev-m-bugs, scohen, slitmano, yeylon
Target Milestone: ---
Target Release: 3.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version: org.ovirt.engine-root-3.4.0-14
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-06-12 14:06:38 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Evgheni Dereveanchin 2014-03-07 12:35:08 UTC
Description of problem:
when no host is available to perform a fencing test, an error is displayed which is not meaningful

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. attach two hosts to manager, configure fencing on both, ensure it works
2. power off one host (only one remains)
3. select "edit" to see host settings and test power management

Actual results:
the following error is displayed:

Test Failed, VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.SocketTimeoutException: connect timed out (Failed with error VDS_NETWORK_ERROR and code 5022)

Expected results:

Display a meaningful error explaining that there's only one host active, and it cannot be used to fence itself.

Comment 1 Oved Ourfali 2014-03-24 13:39:23 UTC
Two thoughts/options here:

1. We can check in advance if there is only one host, and if so fail with a better error.

2. We can do the test from the host itself. It is true that it isn't the runtime use-case, but it will show us that the power management defined is available. What it won't test is whether there are other available hosts that can fence using these power management definitions. However, the same problem can occur if the host that did the fencing test is no longer in the system, and other hosts can't access the defined power management.
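
Option 1 could be sketched roughly as follows. This is a minimal Python illustration, not the engine's code: the `Host` record, the status labels, `NoProxyHostError`, `check_fence_test_possible`, and the wording of the error are all hypothetical, and it assumes that only hosts reporting `Up` count as able to run the test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    # Hypothetical record; not the engine's host entity.
    name: str
    status: str  # assumed labels such as "Up", "Connecting", "Down"


class NoProxyHostError(Exception):
    """Raised when no other host can run the fencing test."""


def check_fence_test_possible(target, hosts):
    """Return the hosts that could run the test for `target`, or fail early."""
    # A host cannot fence itself, so the target is excluded up front.
    others = [h for h in hosts if h.name != target.name and h.status == "Up"]
    if not others:
        raise NoProxyHostError(
            f"No other active host is available to test power management "
            f"for {target.name}; a host cannot be used to fence itself."
        )
    return others
```

With this kind of check the failure is reported before any network call is attempted, so the user would not see a socket timeout.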

Comment 2 Eli Mesika 2014-03-25 15:22:31 UTC
(In reply to Oved Ourfali from comment #1)
> Two thoughts/options here:
> 
> 1. We can check in advance if there is only one host, and if so fail with a
> better error.

The problem here is that the host was powered off manually, therefore it changed state to "connecting..." and in this state the host is still considered a candidate proxy host by the proxy selection algorithm.
If the host had been powered off using the PM card, it would have gone to DOWN, and then we wouldn't find any host for proxy operations and would get the correct message.
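
The difference between the two power-off paths can be illustrated with a small sketch. It is a simplified model of what is described here, not the actual proxy selection algorithm: the status labels, the `PROXY_ELIGIBLE` set, and `proxy_candidates` are hypothetical.

```python
# Assumed model: statuses in which a host may still be picked as a fence proxy.
# A host in "Connecting" is still treated as a candidate.
PROXY_ELIGIBLE = {"Up", "Connecting"}


def proxy_candidates(target, statuses):
    """Return names of hosts, other than `target`, that would be tried as proxies.

    `statuses` maps host name -> status label (hypothetical representation).
    """
    return sorted(
        name for name, status in statuses.items()
        if name != target and status in PROXY_ELIGIBLE
    )


# Manual power-off: the dead host sits in "Connecting", is still selected,
# and the test can only fail later with a network timeout.
manual = proxy_candidates("host-a", {"host-a": "Up", "host-b": "Connecting"})

# Power-off through the PM card: the host is "Down", no candidate is found,
# and a "no proxy host" message can be reported instead.
via_pm = proxy_candidates("host-a", {"host-a": "Up", "host-b": "Down"})
```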

I would suggest here to CLOSE NOTABUG

> 
> 2. We can do the test from the host itself. It is true that it isn't the
> runtime use-case, but it will show us that the power management defined is
> available. What it won't test is whether there are other available hosts
> that can fence using these power management definitions. However, the same
> problem can occur if the host that did the fencing test is no longer in the
> system, and other hosts can't access the defined power management.

This has the drawback that the TEST may succeed while the operation from the non-responding treatment fails, so I am not sure this is worth the effort, and if it is, I think it deserves its own RFE

Comment 6 Eli Mesika 2014-03-27 11:50:20 UTC
*** Bug 1080905 has been marked as a duplicate of this bug. ***

Comment 7 sefi litmanovich 2014-04-30 14:14:00 UTC
Verified on rhevm-3.4.0-0.16.rc.el6ev.noarch.

the functionality works, but error message still not changed - will open a new bz for that.

in terms of functionality - 
1. added two hosts to same dc-cluster - 1 with pm, 1 without.
2. shutdown the host without pm manually - becomes non-responsive
3. test pm functionality fails with error message:

Test Failed, VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.SocketTimeoutException: connect timed out (Failed with error VDS_NETWORK_ERROR and code 5022)

Comment 8 sefi litmanovich 2014-04-30 14:43:15 UTC
Sorry, verification wasn't good, I will reproduce and verify once again

Comment 9 sefi litmanovich 2014-04-30 14:54:46 UTC
Verified on rhevm-3.4.0-0.16.rc.el6ev.noarch.

added two hosts with pm to the same dc-cluster.
put one of them down, status - down.
tested pm functionality on the other host and received the message:

Test Failed, There is no other host in the data center that can be used to test the power management settings.

Comment 10 Itamar Heim 2014-06-12 14:06:38 UTC
Closing as part of 3.4.0