Bug 1477917 - Power management operations fail on the bladecenter agent
Summary: Power management operations fail on the bladecenter agent
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: 4.2.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Eli Mesika
QA Contact: Pavel Stehlik
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-08-03 08:17 UTC by Artyom
Modified: 2017-08-07 10:10 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-08-07 10:10:17 UTC
oVirt Team: Infra
Embargoed:


Attachments
engine and vdsm log (292.51 KB, application/zip)
2017-08-03 08:17 UTC, Artyom

Description Artyom 2017-08-03 08:17:20 UTC
Created attachment 1308622
engine and vdsm log

Description of problem:
Power management operations fail on the bladecenter agent.
For example, when running the test command against the bladecenter agent, the engine shows:
Test failed: Internal JSON-RPC error


Version-Release number of selected component (if applicable):
ovirt-engine-4.2.0-0.0.master.20170731224404.git1758643.el7.centos.noarch
fence-agents-bladecenter-4.0.11-66.el7.x86_64
vdsm-4.20.1-280.gite07b232.el7.centos.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Configure power management on bladecenter host
2. Run test command

Actual results:
Test failed: Internal JSON-RPC error

Expected results:
Test succeeds without any problems

Additional info:
On the power management proxy host, I can see the following error:
2017-08-03 11:00:15,715+0300 DEBUG (jsonrpc/7) [root] FAILED: <err> = 'Failed: Unable to obtain correct plug status or plug is not available\n\n\n'; <rc> = 1 (commands:94)
2017-08-03 11:00:15,716+0300 INFO  (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.fenceNode failed (error 1) in 0.80 seconds (__init__:592)
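
To pull the fence agent's own error text out of a vdsm log such as the one attached, a small parser along these lines might help. It is only a sketch: the pattern is inferred from the single DEBUG line quoted above, not from vdsm's source, and the names FAILED_RE and parse_failed_line are hypothetical.

```python
import re

# Pattern for a DEBUG line that records a failed command's stderr and return code.
# Assumption: the layout matches the one line quoted in this report.
FAILED_RE = re.compile(r"FAILED: <err> = '(?P<err>.*?)'; <rc> = (?P<rc>\d+)")


def parse_failed_line(line):
    """Return (stderr_text, return_code), or None if the line does not match."""
    match = FAILED_RE.search(line)
    if match is None:
        return None
    # The log prints newlines as literal backslash-n pairs; convert and trim them.
    err = match.group("err").replace("\\n", "\n").strip()
    return err, int(match.group("rc"))


# The log line from this report, split across string literals for readability.
sample = (
    "2017-08-03 11:00:15,715+0300 DEBUG (jsonrpc/7) [root] FAILED: <err> = "
    "'Failed: Unable to obtain correct plug status or plug is not available\\n\\n\\n'; "
    "<rc> = 1 (commands:94)"
)
print(parse_failed_line(sample))
```

The returned text is the message the fence agent itself produced, which is more specific than the "Internal JSON-RPC error" shown by the engine.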

Comment 1 Martin Perina 2017-08-03 12:13:23 UTC
The important part is:

 'Failed: Unable to obtain correct plug status or plug is not available\n\n\n'

which means that your fence agent settings are incorrect. Are you sure that a test using the same settings was successful in a previous oVirt version (since you have marked this as a regression)? If you are not sure, or if you are using this fence agent for the first time, please check your bladecenter settings and consult the fence_bladecenter man page to find all the options needed to use that fence device.

Comment 2 Artyom 2017-08-07 07:14:08 UTC
I rechecked the problem, and it turned out to be an issue with the host's power management device. After eng-ops helped, everything works as expected and the "test" command returns the correct message:
"Test successful: power on"

But I believe we can still improve the message shown in the UI, for example
"Host power management not reachable" instead of "Test failed: Internal JSON-RPC error".

What do you think Eli?

Comment 3 Eli Mesika 2017-08-07 09:04:20 UTC
(In reply to Artyom from comment #2)

> What do you think Eli?

The UI does not know that the host power management is not reachable; it just displays the message coming from the engine.

Since this is a corner case in which you have a physical problem with the fencing device, I think the message is OK.

In any case, whenever power management fails via the engine, the first thing we ask the reporter to do is to run the fencing agent directly, using the fence-agents package scripts in the host's /usr/sbin directory. That shows the exact cause.

Keep in mind that the call path is UI => engine => VDSM => fence script, so for any problem we should start investigating at the fence script end.
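
As a rough illustration of checking the fence script end point first, the sketch below builds a status call to fence_bladecenter and runs it on the host, outside the engine and VDSM. The helper names, the example address, and the credentials are hypothetical. The short options used (-a address, -l login, -p password, -n plug, -o action) are the common fence-agents ones, but confirm them against the fence_bladecenter man page for your installed version.

```python
import subprocess


def build_fence_command(ip, username, password, plug, action="status",
                        agent="/usr/sbin/fence_bladecenter"):
    """Build the argument list for a direct fence agent call (hypothetical helper)."""
    return [agent, "-a", ip, "-l", username, "-p", password,
            "-n", str(plug), "-o", action]


def run_fence_agent(command, timeout=60):
    """Run the agent and return (return_code, stdout, stderr)."""
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
        timeout=timeout,
    )
    return result.returncode, result.stdout.strip(), result.stderr.strip()


# Example use on the proxy host (all values are placeholders):
#   cmd = build_fence_command("192.0.2.10", "admin", "secret", 3)
#   rc, out, err = run_fence_agent(cmd)
#   print(rc, out or err)
```

A non-zero return code together with the agent's stderr text points at the fence device or its settings rather than at the engine or VDSM. Note that a password passed on the command line is visible in the process list, so treat this as a quick diagnostic only.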

