Bug 1219620

Summary: upgrade to oVirt 3.5.2 breaks power management (at least idrac7)
Product: [Retired] oVirt Reporter: jas
Component: ovirt-engine-core Assignee: Eli Mesika <emesika>
Status: CLOSED CURRENTRELEASE QA Contact: movciari
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 3.5 CC: bugs, ecohen, emesika, gklein, lsurette, mgrac, movciari, oourfali, rbalakri, yeylon
Target Milestone: ---   
Target Release: 3.5.4   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-09-03 13:54:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1221312    

Description jas 2015-05-07 19:53:31 UTC
Description of problem:

After upgrading from an engine running on CentOS 7 with oVirt Node 3.5.1 to an engine running on CentOS 7.1 with CentOS 7.1 hosts as "nodes", fencing with idrac7 no longer works.

Steps to Reproduce:
1. Add 2 new hosts to engine configured with idrac7 power management (no longer need to specify options for idrac7 with 3.5.2 - thank you.).
2. Click "Test" button on Power Management page.
3. The result shown is "Test Succeeded, unknown".

It didn't really succeed at all.

engine:

2015-05-07 15:19:37,585 INFO  [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-1) Using Host virt1 from cluster EECS-Primary as proxy to execute Status command on Host 
2015-05-07 15:19:37,587 INFO  [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-1) Executing <Status> Power Management command, Proxy Host:virt1, Agent:ipmilan, Target Host:, Management IP:virt2-idrac, User:root, Options:privlvl=OPERATOR,lanplus,delay=10, Fencing policy:null
2015-05-07 15:19:37,602 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (ajp--127.0.0.1-8702-1) START, FenceVdsVDSCommand(HostName = virt1, HostId = 6a638c0a-0a48-4e77-81b5-f89ae572cf3e, targetVdsId = 8df66929-0cb7-4347-b004-c5531178d024, action = Status, ip = virt2-idrac, port = , type = ipmilan, user = root, password = ******, options = 'privlvl=OPERATOR,lanplus,delay=10', policy = 'null'), log id: 682d66b2
2015-05-07 15:19:53,876 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-1) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Power Management test failed for Host virt2.Done
2015-05-07 15:19:53,876 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (ajp--127.0.0.1-8702-1) FINISH, FenceVdsVDSCommand, return: Test Succeeded, unknown, log id: 682d66b2

on virt1 vdsm log:

Thread-11121::DEBUG::2015-05-07 15:19:53,864::API::1164::vds::(fence) rc 1 inp agent=fence_ipmilan
ipaddr=virt2-idrac
login=root
action=status
passwd=XXXX
privlvl=OPERATOR
delay=10
lanplus out [] err ['Failed: Unable to obtain correct plug status or plug is not available', '', '']
Thread-11121::DEBUG::2015-05-07 15:19:53,864::API::1235::vds::(fenceNode) rc 1 in agent=fence_ipmilan
ipaddr=virt2-idrac
login=root
action=status
passwd=XXXX
privlvl=OPERATOR
delay=10
lanplus out [] err ['Failed: Unable to obtain correct plug status or plug is not available', '', '']

We can see that engine was able to contact virt1.
virt1 was able to contact virt2-idrac.
Password is correct.

I know that fence_ipmilan works successfully because I can go to virt2 and run the following:

/usr/sbin/fence_ipmilan -a virt1-idrac -l root -o status -p 'my-password' -L OPERATOR -P

The result is "Status: ON" as it was with 3.5.1.

Likewise, I can make virt1 verify virt2-idrac manually:

/usr/sbin/fence_ipmilan -a virt2-idrac -l root -o status -p 'my-password' -L OPERATOR -P 

Again, "Status: ON".

I've seen this "Test Succeeded, unknown" previously while getting the options right for idrac7 with 3.5.1.  The error needs to be corrected as well.  In this case, it was a failure.
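
The mismatch described above (vdsm returns `rc 1`, an empty `out`, and an error in `err`, yet the engine reports "Test Succeeded, unknown") could be avoided by checking the return code before reporting success. The following is a minimal sketch in Python, not the engine's actual code: the function name `summarize_fence_test` and the message wording are hypothetical, and it assumes that any non-zero `rc` means the status check failed.

```python
def summarize_fence_test(rc, out, err):
    """Turn a fence agent result into a user-facing test message.

    Hypothetical helper: rc, out and err mirror the fields in the vdsm log.
    """
    # Drop empty strings such as the trailing '' entries seen in err.
    err_text = "; ".join(line for line in err if line)
    out_text = " ".join(line for line in out if line).strip()

    if rc != 0:
        # Assumption: any non-zero return code is a failed test.
        return "Test Failed, " + (err_text or "unknown error")
    # Report success only on a zero return code; fall back to
    # "unknown" when the agent printed no status.
    return "Test Succeeded, " + (out_text or "unknown")


# Values taken from the vdsm log above.
print(summarize_fence_test(
    1,
    [],
    ["Failed: Unable to obtain correct plug status or plug is not available", "", ""],
))
```

With the values from the log, this reports a failure together with the agent's error text instead of a success.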

Comment 2 Eli Mesika 2015-05-10 08:40:52 UTC
Marek, is there any change in the CentOS 7.1 fence-agents RPM that requires the lanplus parameter to be defined as lanplus=1 or lanplus=0, whereas the CentOS 7 fence-agents RPM works OK with only lanplus?

Please reply to Comment 1

Comment 3 Marek Grac 2015-05-11 07:42:56 UTC
@Eli:

yes, it is very likely. Situation with 'lanplus' instead of 'lanplus=X' was not intended. Each option should have a value.
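
To illustrate the difference: the engine log shows the options string `privlvl=OPERATOR,lanplus,delay=10`, in which `lanplus` has no value. Below is a minimal Python sketch of one possible workaround, which gives every bare flag an explicit value before the options reach the agent. It is not the fix that shipped in oVirt 3.5.4; `normalize_fence_options` is a hypothetical name, and the default value of `1` is an assumption based on the `lanplus=1` form mentioned in comment 2.

```python
def normalize_fence_options(options, default="1"):
    """Give every value-less option an explicit value.

    Hypothetical helper; 'options' is a comma-separated string as shown
    in the engine log, for example 'privlvl=OPERATOR,lanplus,delay=10'.
    """
    normalized = []
    for item in options.split(","):
        item = item.strip()
        if not item:
            continue  # skip empty entries such as a trailing comma
        if "=" not in item:
            # Bare flag: assume it is a boolean and enable it explicitly.
            item = f"{item}={default}"
        normalized.append(item)
    return ",".join(normalized)


print(normalize_fence_options("privlvl=OPERATOR,lanplus,delay=10"))
```

For the options in this report the sketch yields `privlvl=OPERATOR,lanplus=1,delay=10`. Options that already carry a value are left unchanged.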

Comment 4 Oved Ourfali 2015-05-11 10:07:44 UTC
Marek - you must understand this causes a lot of regressions for us, and it wasn't communicated to us.
Please consider supporting both, to support old usages.
I'll open a bug on 7.1.

Comment 5 Marek Grac 2015-05-11 10:20:40 UTC
@Oved:

I understand but this was never meant to be used, it worked only as a side effect and only in this case&fence agent afaik. Are you using 'ssl' or 'secure' in other agents with =1 or without?

Comment 6 Oved Ourfali 2015-05-11 10:22:00 UTC
(In reply to Marek Grac from comment #5)
> @Oved:
> 
> I understand but this was never meant to be used, it worked only as a side
> effect and only in this case&fence agent afaik. Are you using 'ssl' or
> 'secure' in other agents with =1 or without?


I understand it wasn't the original intent, but it was already supported and done.

As for your question, Eli?

Comment 7 Eli Mesika 2015-05-11 10:56:21 UTC
(In reply to Oved Ourfali from comment #6)

> As for your question, Eli?

ssl and secure are passed without a value, i.e. as ssl, secure and not as ssl=x, secure=y.

Is that also a problem?

Comment 9 Eli Mesika 2015-05-31 09:01:01 UTC
Setting comment 1 to private, since it contains a plain-text password by mistake.

Comment 10 Sandro Bonazzola 2015-07-02 06:51:28 UTC
Since oVirt 3.5.4 RC1 has been released, please ensure that the fix is included in the build and move the bug to ON_QA accordingly.

Comment 11 Sandro Bonazzola 2015-09-03 13:54:54 UTC
This is an automated message.
oVirt 3.5.4 has been released on September 3rd 2015 and should include the fix for this BZ. Moving to closed current release.