Bug 1044092 - FenceVdsVDSCommand successfully completes even if the underlying fence action to stop, start or reboot a host fails.
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: 3.4.0
Assignee: Eli Mesika
QA Contact:
URL:
Whiteboard: infra
Depends On:
Blocks: 1044088
Reported: 2013-12-17 19:22 UTC by Lee Yarwood
Modified: 2018-12-06 15:37 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-12-19 09:52:50 UTC
oVirt Team: Infra
Target Upstream Version:



Description Lee Yarwood 2013-12-17 19:22:18 UTC
Description of problem:
FenceVdsVDSCommand successfully completes even if the underlying fence action to stop, start or reboot a host fails. 

Version-Release number of selected component (if applicable):
rhevm-3.2.3-0.43.el6ev.noarch

How reproducible:
Always.

Steps to Reproduce:
1. Remove all power to an active host, including any fence agents that are configured.
2. Engine will attempt to fence the host in question.
3. The first fence action to stop power will succeed even if the fence command executed by the proxy host fails.

Actual results:
FenceVdsVDSCommand completes successfully.

Expected results:
FenceVdsVDSCommand should fail and reflect the failure of the fence command on the proxy.

Additional info:

Comment 4 Eli Mesika 2013-12-18 09:42:00 UTC
Please specify exactly what you do in this step (taken from the BZ description):

1. Remove all power to an active host, including any fence agents that are configured.

I need exact steps in UI/API in order to reproduce.

NOTE:

You wrote :
***************
Reading the code we don't actually wait around for the return value when calling the off action thus incorrectly report everything as fine when you can see a return code of 1 above.
***************

Yes, the vdsm fenceNode call is fire-and-forget, so success is returned as long as the corresponding /usr/sbin/fence_<agent> script could be run. We track whether the operation actually succeeded by polling the host status via the PM agent.
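
To illustrate the split described above, here is a minimal sketch, not the engine or vdsm code. All names (launch_fence_action, wait_for_power_status, unreachable_device) are hypothetical and the retry count is an assumption. It only shows why the launch step can report success while the follow-up status polling still detects that the fence device never confirmed the power state.

```python
import time
from typing import Callable


def launch_fence_action(start_agent: Callable[[], None]) -> bool:
    """Fire-and-forget step: True means only that the agent script was started."""
    try:
        start_agent()
    except OSError:
        # The script itself could not be run (for example, it is missing).
        return False
    return True


def wait_for_power_status(
    get_status: Callable[[], str],
    expected: str,
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> bool:
    """Polling step: True only if the fence device reports the expected status."""
    for attempt in range(attempts):
        try:
            if get_status() == expected:
                return True
        except ConnectionError:
            # Fence device unreachable, e.g. it lost power together with the host.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


def unreachable_device() -> str:
    """Hypothetical status query against a fence device that has no power."""
    raise ConnectionError("fence device did not answer")


# The launch "succeeds", but polling never confirms that the host is off.
launched = launch_fence_action(lambda: None)
confirmed = wait_for_power_status(unreachable_device, expected="off")
print(launched, confirmed)  # prints "True False"
```

In this sketch the overall fence operation should be judged by the polling result, not by the return value of the launch step.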

Comment 6 Julio Entrena Perez 2013-12-18 09:45:10 UTC
(In reply to Eli Mesika from comment #4)
> please specify what do exactly do in (taken from BZ description)
> 
> 1. Remove all power to an active host, including any fence agents that are
> configured.
> 
> I need exact steps in UI/API in order to reproduce.
> 
Just pull all power cords from the server.
This removes power not only from the server but also from the embedded IPMI fencing device (e.g. HP iLO), which prevents other hosts from reaching the fence device.

