Bug 1139643 - Wrong code status is returned in case of 'on' / 'off' power management failure
Summary: Wrong code status is returned in case of 'on' / 'off' power management failure
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.5.0
Assignee: Eli Mesika
QA Contact: sefi litmanovich
URL:
Whiteboard: infra
Depends On:
Blocks: 1129381 1141514
Reported: 2014-09-09 11:27 UTC by Eli Mesika
Modified: 2016-02-10 19:38 UTC (History)

Fixed In Version: vdsm-4.16.6-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-16 13:39:49 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Links
System        ID     Private  Priority   Status  Summary                              Last Updated
oVirt gerrit  32695  0        master     MERGED  vdsm: making PM 'on' and 'off' sync  Never
oVirt gerrit  33469  0        ovirt-3.5  MERGED  vdsm: making PM 'on' and 'off' sync  Never

Description Eli Mesika 2014-09-09 11:27:15 UTC
Description of problem:
VDSM calls the fencing script asynchronously and returns 0 as soon as the script has been invoked, even though the script may exit with an error code, as described in the scenario below.
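
The difference between the two behaviours can be illustrated with a minimal sketch. This is not the vdsm code: `fence_async`, `fence_sync` and `FAILING_AGENT` are hypothetical names, and the fencing script is replaced by a stand-in process that simply exits with status 1.

```python
import subprocess
import sys

# Hypothetical stand-in for a fencing script that fails with exit status 1.
FAILING_AGENT = [sys.executable, "-c", "import sys; sys.exit(1)"]


def fence_async(cmd):
    """Spawn the script and report 0 right away.

    The 0 only means "the script was started"; the real exit status is
    not known yet, which mirrors the behaviour reported in this bug.
    """
    proc = subprocess.Popen(cmd)
    return 0, proc


def fence_sync(cmd):
    """Run the script, wait for it to finish and return its exit status."""
    return subprocess.call(cmd)


if __name__ == "__main__":
    status, proc = fence_async(FAILING_AGENT)
    print("async reported:", status)        # reported before the script ends
    print("script really exited with:", proc.wait())
    print("sync reported:", fence_sync(FAILING_AGENT))
```

With the synchronous variant the caller sees the non-zero status and can react to it instead of waiting for a state change that will not happen.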

Version-Release number of selected component (if applicable):


How reproducible: 100%

DC1 with H1 (with PM) and H2 on cluster C1
another host H3 on DC1 cluster C2

When communication from H2 to the H1 PM card is blocked with iptables
and the default proxy preferences (cluster, dc) are used, a Restart
operation will always fail.


Steps to Reproduce:
1. Add H1 with PM and H2 to cluster C1 on data center DC1
2. Add H3 on cluster C2 on data center DC1
3. Block communication from H2 to H1 PM card IP
4. Restart H1 from UI (Power-Management->Restart)

Actual results:
H2 is selected as the proxy host for the stop operation, and VDSM reports the operation as successful although the fencing script's exit status is 1.
Therefore, we wait for an 'off' status that will never occur, and the host is not rebooted.

Expected results:
VDSM should perform start/stop sync and return the correct script returned code in order that engine will know that H2 fails to perform the operation and will try to use H2 as a proxy for the failed operation

Additional info:

Comment 1 Eli Mesika 2014-09-09 11:30:54 UTC
> Expected results:
> VDSM should perform start/stop sync and return the correct script returned
> code in order that engine will know that H2 fails to perform the operation
> and will try to use H2 as a proxy for the failed operation

Sorry, the last part should read:

and will try to use H3 as a proxy for the failed operation
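
The corrected expectation (fall back to H3 once H2 reports a failure) can be sketched as follows. This is only an illustration under stated assumptions, not the engine's implementation: `fence_with_fallback`, `run_fence` and `fake_run_fence` are hypothetical names, and the fake runner hard-codes the scenario from this report, where H2 cannot reach H1's PM card and H3 can.

```python
def fence_with_fallback(proxies, action, run_fence):
    """Try each proxy in order and stop at the first one that succeeds.

    run_fence(proxy, action) is assumed to return the fencing script's
    exit status (0 on success). Returns (proxy, status); proxy is None
    when every candidate failed.
    """
    last_status = None
    for proxy in proxies:
        last_status = run_fence(proxy, action)
        if last_status == 0:
            return proxy, last_status
    return None, last_status


def fake_run_fence(proxy, action):
    # Hypothetical scenario: H2 is blocked from the PM card, H3 is not.
    return 1 if proxy == "H2" else 0


if __name__ == "__main__":
    print(fence_with_fallback(["H2", "H3"], "off", fake_run_fence))
```

The fallback only works if the status coming back from the proxy is the script's real exit status. If H2 always reported 0, the loop would stop at H2 and H3 would never be tried.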

Comment 2 Eyal Edri 2014-10-07 07:16:47 UTC
This bug's status was moved to MODIFIED before vdsm vt5 was built,
hence moving it to ON_QA. If this was a mistake and the fix isn't in,
please contact rhev-integ.

Comment 3 sefi litmanovich 2014-10-14 10:57:15 UTC
Verified with rhevm-3.5.0-0.14.beta.el6ev.noarch, vdsm-4.16.6-1.el6ev.x86_64 according to description.

Functionality works as expected. One question, though:

Once a fence action fails with proxy A in fence flow1, and proxy B is selected and succeeds, why isn't proxy B used for the rest of flow1?
Because each action is first attempted with proxy A, fails, and only then looks for proxy B to perform the action, the restart action took 10 minutes, although a restart takes no longer than 3-5 minutes.

Comment 4 Eli Mesika 2014-10-18 22:10:29 UTC
(In reply to sefi litmanovich from comment #3)
Please see https://bugzilla.redhat.com/show_bug.cgi?id=1141514

