Bug 1318954 - HP ILO2, fence not working
Summary: HP ILO2, fence not working
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: ---
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Martin Perina
QA Contact: Petr Matyáš
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-03-18 09:07 UTC by Ovirt-User
Modified: 2016-04-13 14:17 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-04-13 14:17:23 UTC
oVirt Team: Infra
rule-engine: planning_ack?
rule-engine: devel_ack?
rule-engine: testing_ack?



Description Ovirt-User 2016-03-18 09:07:58 UTC
My oVirt environment:
One oVirt management portal and two oVirt nodes (CentOS 6.6 + VDSM).
When one node failed, the surviving node could not fence the failed node successfully.
The error log is below.

======================
Thread-677::DEBUG::2016-03-18 16:44:21,043::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2016-03-18 16:44:21,089::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-18 16:44:21,091::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-678::DEBUG::2016-03-18 16:44:21,091::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Task.clear' in bridge with {'taskID': 'f6c04b19-32ac-48e2-9a94-52913529b792'}
Thread-678::DEBUG::2016-03-18 16:44:21,094::task::595::Storage.TaskManager.Task::(_updateState) Task=`cbe567e6-72ae-4e77-9c1d-58414b5c16ed`::moving from state init -> state preparing
Thread-678::INFO::2016-03-18 16:44:21,094::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID='f6c04b19-32ac-48e2-9a94-52913529b792', spUUID=None, options=None)
Thread-678::DEBUG::2016-03-18 16:44:21,095::taskManager::171::Storage.TaskManager::(clearTask) Entry. taskID: f6c04b19-32ac-48e2-9a94-52913529b792
Thread-676::DEBUG::2016-03-18 16:44:21,101::utils::759::root::(execCmd) FAILED: <err> = 'Unable to connect/login to fencing device\n'; <rc> = 1
Thread-676::DEBUG::2016-03-18 16:44:21,101::API::1164::vds::(fence) rc 1 inp agent=fence_ilo
ipaddr=172.21.151.119
login=isadmin
action=off
passwd=XXXX
ipport=22
ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-676::DEBUG::2016-03-18 16:44:21,102::API::1235::vds::(fenceNode) rc 1 in agent=fence_ilo
ipaddr=172.21.151.119
login=isadmin
action=off
passwd=XXXX
ipport=22
ssl=no out [] err ['Unable to connect/login to fencing device']
===================================

My hardware is an HP DL580 G5 (iLO2).

1. I have already installed the newest fence package: "fence-agents-3.1.5-48.el6_6.3.x86_64"
2. iLO2 firmware version: 2.0.9

How reproducible:


Steps to Reproduce:
1. Shut down one host.
2. The second host cannot fence the failed host successfully.


Actual results:

Unable to connect/login to fencing device

Expected results:


Additional info:

I manually tested the fence function on the surviving host.

[root@hqalabkvm02 vdsm]# fence_ilo2 --ip 172.21.151.119 --username=isadmin --password itaq12ws -o status
Status: OFF

As you can see, I can get the iLO status of the failed host, but I cannot actually fence it successfully.

Which logs should I provide?

Comment 1 Red Hat Bugzilla Rules Engine 2016-03-18 09:08:02 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 2 Yaniv Kaul 2016-03-20 07:10:55 UTC
The command above (from VDSM log) shows that you've tried to use ilo, not ilo2. On the manual run, you've tried to use ilo2. Can you re-verify oVirt with ilo2?

Comment 3 Ovirt-User 2016-03-21 01:47:49 UTC
Hi Kaul, thanks for your response.
I made sure that ilo2 is selected on the oVirt "Power Management" page, and I also tested it, but I got the "test succeeded, unknown" result.
Then I checked the vdsm log, shown below.
I don't know why it still used fence_ilo and not fence_ilo2.
If I should provide any logs or error screenshots, please let me know.

=============================================
Dummy-232::DEBUG::2016-03-21 09:30:35,930::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) dd if=/rhev/data-center/00000002-0002-0002-0002-00000000006f/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
Dummy-232::DEBUG::2016-03-21 09:30:35,999::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0129347 s, 79.2 MB/s\n'; <rc> = 0
Thread-117227::DEBUG::2016-03-21 09:30:35,999::utils::759::root::(execCmd) FAILED: <err> = 'Unable to connect/login to fencing device\n'; <rc> = 1
Thread-117227::DEBUG::2016-03-21 09:30:36,003::API::1164::vds::(fence) rc 1 inp agent=fence_ilo
ipaddr=172.21.151.119
login=isadmin
action=status
passwd=XXXX
ipport=22
ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-117227::DEBUG::2016-03-21 09:30:36,005::API::1235::vds::(fenceNode) rc 1 in agent=fence_ilo
ipaddr=172.21.151.119
login=isadmin
action=status
passwd=XXXX
ipport=22
ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-117227::DEBUG::2016-03-21 09:30:36,008::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2016-03-21 09:30:36,493::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-21 09:30:36,496::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-117228::DEBUG::2016-03-21 09:30:36,500::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Dummy-232::DEBUG::2016-03-21 09:30:38,027::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) dd if=/rhev/data-center/00000002-0002-0002-0002-00000000006f/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
Dummy-232::DEBUG::2016-03-21 09:30:38,095::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0128294 s, 79.8 MB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2016-03-21 09:30:39,520::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-21 09:30:39,522::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
=======================================================
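
To confirm which agent VDSM actually ran, the fence result lines in a vdsm.log excerpt like the one above can be filtered with a few lines of Python. This is only a minimal sketch: the function name `fence_calls` and the regular expression are illustrative and assume the "rc N inp agent=..." and "rc N in agent=..." line shapes seen in this report, not a documented VDSM log format.

```python
import re

# Assumption: pattern derived from the log lines quoted in this report,
# e.g. "(fence) rc 1 inp agent=fence_ilo" and "(fenceNode) rc 1 in agent=fence_ilo".
AGENT_RE = re.compile(r"\(fence(?:Node)?\) rc (\d+) inp? agent=(\S+)")


def fence_calls(log_text):
    """Return a list of (return_code, agent) pairs, one per fence result line."""
    return [(int(rc), agent) for rc, agent in AGENT_RE.findall(log_text)]


# Two lines copied from the excerpt above, plus one unrelated line.
sample = (
    "Thread-117227::DEBUG::2016-03-21 09:30:36,003::API::1164::vds::(fence) rc 1 inp agent=fence_ilo\n"
    "ipaddr=172.21.151.119\n"
    "Thread-117227::DEBUG::2016-03-21 09:30:36,005::API::1235::vds::(fenceNode) rc 1 in agent=fence_ilo\n"
    "JsonRpcServer::DEBUG::2016-03-21 09:30:36,496::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request\n"
)

calls = fence_calls(sample)
print(calls)
```

For the sample lines, every entry names fence_ilo with return code 1, which matches what the log shows: the agent being run is fence_ilo even though ilo2 was selected in the portal.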

Comment 4 Ovirt-User 2016-03-21 02:04:35 UTC
My oVirt environment:

oVirt Engine version: 3.5.4.2-1.el6
oVirt host: CentOS 6.6 + oVirt 3.5 (with fence-agents-3.1.5-48.el6_6.3.x86_64)

Hardware: HP DL580 G5
Fence device: HP iLO2 (firmware: 2.0.9)

I also tested it on CentOS 6.7 + oVirt 3.5, but it still gave the same error message.

Comment 5 Fabian Deutsch 2016-03-21 10:35:33 UTC
Hey,

Are you sure that the host has IP connectivity to the iLO?

I am not sure whether the message indicates that the client could not reach the iLO or that authentication failed.
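
One way to separate "cannot reach the device" from "authentication failed" is a plain TCP connection test from the fencing host. The following is a minimal sketch, not part of VDSM or fence-agents; the helper name `can_connect` is hypothetical, and the ports in the usage comment are assumptions (22 is the ipport value seen in the log, 443 is the usual HTTPS port).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example usage on the fencing host (address taken from this report):
#   can_connect("172.21.151.119", 22)    # the ipport configured in the engine
#   can_connect("172.21.151.119", 443)   # assumed HTTPS port of the iLO
```

If the connection succeeds but the agent still reports "Unable to connect/login to fencing device", the credentials or the agent options become the more likely suspects.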

Comment 6 Ovirt-User 2016-03-22 05:52:14 UTC
Dear Fabian,

I made sure that I use the correct iLO ID/password in the ovirt-engine management portal.

For example: I have two KVM hosts (called kvm1 and kvm2).

On KVM1:
When I click the "Test" button to test the fence function on the "Power Management" page of the ovirt-engine admin portal, I get the result "Test Succeeded, unknown".

On KVM2:
Then I checked /var/log/vdsm/vdsm.log on the kvm2 host and found "err ['Unable to connect/login to fencing device']".

But I can run the command "fence_ilo2 --ip=172.21.151.119 --username=xxxxx --password=xxxxx --action=status" successfully.
So I am sure that I use the correct iLO ID/password to test fencing.

I think there is one point we should look at.
I made sure that I selected ilo2 on the "Power Management" page, but when I checked vdsm.log I found that it still uses fence_ilo, not fence_ilo2.
I am not sure whether this is the root cause.

Detailed log below:
========================
JsonRpc (StompReactor)::DEBUG::2016-03-22 13:39:48,084::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-22 13:39:48,088::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-177708::DEBUG::2016-03-22 13:39:48,091::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Thread-177707::DEBUG::2016-03-22 13:39:48,151::utils::759::root::(execCmd) FAILED: <err> = 'Unable to connect/login to fencing device\n'; <rc> = 1
Thread-177707::DEBUG::2016-03-22 13:39:48,152::API::1164::vds::(fence) rc 1 inp agent=fence_ilo
ipaddr=172.21.151.119
login=isadmin
action=status
passwd=XXXX
ipport=22
ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-177707::DEBUG::2016-03-22 13:39:48,152::API::1235::vds::(fenceNode) rc 1 in agent=fence_ilo
ipaddr=172.21.151.119
login=isadmin
action=status
passwd=XXXX
ipport=22
ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-177707::DEBUG::2016-03-22 13:39:48,152::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2016-03-22 13:39:48,154::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-22 13:39:48,156::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-177709::DEBUG::2016-03-22 13:39:48,159::API::1209::vds::(fenceNode) fenceNode(addr=172.21.151.119,port=,agent=ilo,user=isadmin,passwd=XXXX,action=status,secure=False,options=ipport=22
ssl=no,policy=None)
Thread-177709::DEBUG::2016-03-22 13:39:48,159::utils::739::root::(execCmd) /usr/sbin/fence_ilo (cwd None)
Thread-177709::DEBUG::2016-03-22 13:39:48,552::utils::759::root::(execCmd) FAILED: <err> = 'Unable to connect/login to fencing device\n'; <rc> = 1
Thread-177709::DEBUG::2016-03-22 13:39:48,552::API::1164::vds::(fence) rc 1 inp agent=fence_ilo
ipaddr=172.21.151.119
login=isadmin
action=status
passwd=XXXX
ipport=22
ssl=no out [] err ['Unable to connect/login to fencing device']
=====================

Comment 7 Fabian Deutsch 2016-04-05 09:32:41 UTC
Thanks for the reply.

It looks as if vdsm is using the wrong credentials then; the question is why it gets the wrong ones. Moving this bug to vdsm for further debugging.

Comment 8 Dan Kenigsberg 2016-04-05 14:23:41 UTC
fenceNode(addr=172.21.151.119,port=,agent=ilo,user=isadmin,passwd=XXXX,action=status,secure=False,options=ipport=22
ssl=no,policy=None)

Vdsm receives an explicit agent=ilo, nothing about ilo2.
Are you sure that you have updated the agent type on Engine?
Can you share a screenshot of "kvm2" power management argument?

Comment 9 Martin Perina 2016-04-06 14:28:38 UTC
We don't use fence_ilo2/ilo/ilo4 agents directly, we map all of those inside engine into fence_ilo agent.

I looked at your tests with fence_ilo2, could you achieve the same result with fence_ilo?

 fence_ilo --ip 172.21.151.119 --username=isadmin --password itaq12ws -o status

If so, please take a look at your fence agent configuration in engine. I noticed from vdsm.log that you specified the ipport=22 and ssl=no options. Could you please remove those options from your fence agent configuration and test the agent functionality again?
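
To make the effect of those options concrete, here is a minimal Python sketch of how the agent type and the extra options end up in the key=value input that vdsm.log shows after "inp". It is an illustration only: `ENGINE_AGENT_MAP` and `build_fence_input` are hypothetical names, the mapping just restates the ilo/ilo2/ilo4 remark above, and the comma- or newline-separated option format is an assumption, not the engine's or VDSM's actual code.

```python
# Assumption: illustrative mapping restating the comment above, not real engine code.
ENGINE_AGENT_MAP = {"ilo": "ilo", "ilo2": "ilo", "ilo4": "ilo"}


def build_fence_input(agent_type, ipaddr, login, passwd, action, options=""):
    """Build key=value text shaped like the "inp" block in the vdsm.log excerpts."""
    agent = ENGINE_AGENT_MAP.get(agent_type, agent_type)
    lines = [
        "agent=fence_" + agent,
        "ipaddr=" + ipaddr,
        "login=" + login,
        "action=" + action,
        "passwd=" + passwd,
    ]
    # Assumption: extra options are comma- or newline-separated key=value pairs.
    for opt in options.replace(",", "\n").splitlines():
        if opt.strip():
            lines.append(opt.strip())
    return "\n".join(lines) + "\n"


with_options = build_fence_input(
    "ilo2", "172.21.151.119", "isadmin", "XXXX", "status", "ipport=22,ssl=no"
)
without_options = build_fence_input(
    "ilo2", "172.21.151.119", "isadmin", "XXXX", "status"
)
print(with_options)
print(without_options)
```

With the options set, the generated input carries ipport=22 and ssl=no exactly as in the failing log entries; without them, the agent is left to use its own defaults for port and SSL.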

Comment 10 Martin Perina 2016-04-13 13:11:35 UTC
Hi,
Any progress with the tests described in Comment 9? Were you able to execute them successfully?

Comment 11 Ovirt-User 2016-04-13 13:20:58 UTC
Sorry for my late response.
It finally works.

Comment 12 Martin Perina 2016-04-13 14:17:23 UTC
Closing as NOTABUG, feel free to reopen if needed.

