My oVirt environment: one oVirt management portal and two oVirt nodes (CentOS 6.6 + VDSM). When one node fails, the surviving node cannot fence the failed node successfully. The error messages in the log are as follows:

======================
Thread-677::DEBUG::2016-03-18 16:44:21,043::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2016-03-18 16:44:21,089::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-18 16:44:21,091::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-678::DEBUG::2016-03-18 16:44:21,091::__init__::481::jsonrpc.JsonRpcServer::(_serveRequest) Calling 'Task.clear' in bridge with {'taskID': 'f6c04b19-32ac-48e2-9a94-52913529b792'}
Thread-678::DEBUG::2016-03-18 16:44:21,094::task::595::Storage.TaskManager.Task::(_updateState) Task=`cbe567e6-72ae-4e77-9c1d-58414b5c16ed`::moving from state init -> state preparing
Thread-678::INFO::2016-03-18 16:44:21,094::logUtils::44::dispatcher::(wrapper) Run and protect: clearTask(taskID='f6c04b19-32ac-48e2-9a94-52913529b792', spUUID=None, options=None)
Thread-678::DEBUG::2016-03-18 16:44:21,095::taskManager::171::Storage.TaskManager::(clearTask) Entry. taskID: f6c04b19-32ac-48e2-9a94-52913529b792
Thread-676::DEBUG::2016-03-18 16:44:21,101::utils::759::root::(execCmd) FAILED: <err> = 'Unable to connect/login to fencing device\n'; <rc> = 1
Thread-676::DEBUG::2016-03-18 16:44:21,101::API::1164::vds::(fence) rc 1 inp agent=fence_ilo ipaddr=172.21.151.119 login=isadmin action=off passwd=XXXX ipport=22 ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-676::DEBUG::2016-03-18 16:44:21,102::API::1235::vds::(fenceNode) rc 1 in agent=fence_ilo ipaddr=172.21.151.119 login=isadmin action=off passwd=XXXX ipport=22 ssl=no out [] err ['Unable to connect/login to fencing device']
===================================

My hardware is an HP DL580 G5 (iLO2).
1. I have already installed the newest fence package: fence-agents-3.1.5-48.el6_6.3.x86_64
2. iLO2 firmware version: 2.0.9

How reproducible:

Steps to Reproduce:
1. Shut down one host.
2. The second host cannot fence the failed host successfully.

Actual results:
Unable to connect/login to fencing device

Expected results:

Additional info:
I manually tested the fence function on the surviving host:

[root@hqalabkvm02 vdsm]# fence_ilo2 --ip 172.21.151.119 --username=isadmin --password itaq12ws -o status
Status: OFF

So I can get the iLO status of the failed host manually, but oVirt still cannot fence it. What logs should I provide?
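For reference, a minimal sketch of bundling the logs that are usually useful for fence debugging on this kind of setup. The paths are the CentOS 6 / oVirt 3.5 defaults; the demo/ tree is a stand-in for / so the snippet is self-contained:

```shell
# Sketch: collect the logs usually needed for fence debugging.
# The demo/ prefix stands in for / here; on a real host you would
# tar the files under / directly (vdsm.log lives on each host,
# engine.log on the engine machine).
mkdir -p demo/var/log/vdsm demo/var/log/ovirt-engine
echo 'sample vdsm log'   > demo/var/log/vdsm/vdsm.log
echo 'sample engine log' > demo/var/log/ovirt-engine/engine.log

# Bundle both logs into a single archive to attach to the bug.
tar -C demo -czf fence-debug.tar.gz \
    var/log/vdsm/vdsm.log \
    var/log/ovirt-engine/engine.log

# List the archive contents to confirm both files were captured.
tar -tzf fence-debug.tar.gz
```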
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
The command above (from the VDSM log) shows that you've tried to use ilo, not ilo2. On the manual run, you've tried to use ilo2. Can you re-verify oVirt with ilo2?
Hi Kaul, thanks for your response. I have confirmed that I set the agent to iLO2 on the oVirt "Power Management" page, and I also tested it, but I got the result "test succeeded, unknown". Then I checked the vdsm log, shown below. I don't know why it still used fence_ilo rather than fence_ilo2. If you need any more logs or error screenshots, please let me know.

=============================================
Dummy-232::DEBUG::2016-03-21 09:30:35,930::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) dd if=/rhev/data-center/00000002-0002-0002-0002-00000000006f/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
Dummy-232::DEBUG::2016-03-21 09:30:35,999::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0129347 s, 79.2 MB/s\n'; <rc> = 0
Thread-117227::DEBUG::2016-03-21 09:30:35,999::utils::759::root::(execCmd) FAILED: <err> = 'Unable to connect/login to fencing device\n'; <rc> = 1
Thread-117227::DEBUG::2016-03-21 09:30:36,003::API::1164::vds::(fence) rc 1 inp agent=fence_ilo ipaddr=172.21.151.119 login=isadmin action=status passwd=XXXX ipport=22 ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-117227::DEBUG::2016-03-21 09:30:36,005::API::1235::vds::(fenceNode) rc 1 in agent=fence_ilo ipaddr=172.21.151.119 login=isadmin action=status passwd=XXXX ipport=22 ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-117227::DEBUG::2016-03-21 09:30:36,008::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2016-03-21 09:30:36,493::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-21 09:30:36,496::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-117228::DEBUG::2016-03-21 09:30:36,500::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Dummy-232::DEBUG::2016-03-21 09:30:38,027::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) dd if=/rhev/data-center/00000002-0002-0002-0002-00000000006f/mastersd/dom_md/inbox iflag=direct,fullblock count=1 bs=1024000 (cwd None)
Dummy-232::DEBUG::2016-03-21 09:30:38,095::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0128294 s, 79.8 MB/s\n'; <rc> = 0
JsonRpc (StompReactor)::DEBUG::2016-03-21 09:30:39,520::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-21 09:30:39,522::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
=======================================================
My oVirt environment:
oVirt Engine version: 3.5.4.2-1.el6
oVirt host: CentOS 6.6 + oVirt 3.5 (with fence-agents-3.1.5-48.el6_6.3.x86_64)
Hardware: HP DL580 G5
Fence device: HP iLO2 (firmware: 2.0.9)

I also tested it on CentOS 6.7 + oVirt 3.5, but it produced the same error message.
Hey, are you sure that the host has IP connectivity to the ILO? I am not sure if the message indicates that the client could not log into the ILO or if the authentication failed.
Dear Fabian, I have confirmed that I used the correct iLO ID/password in the ovirt-engine management portal. For example, I have two KVM hosts (called kvm1 and kvm2):

On kvm1: when I click the "test" button on the "Power Management" page of the ovirt-engine admin portal, I get the result "Test Succeeded, unknown".

On kvm2: when I check /var/log/vdsm/vdsm.log on the kvm2 host, I see err ['Unable to connect/login to fencing device']. But the command "fence_ilo2 --ip=172.21.151.119 --username=xxxxx --password=xxxxx --action=status" succeeds, so I am sure I am using the correct iLO ID/password for the fence test.

There is one point worth noting: I am certain I selected ilo2 on the "Power Management" page, but when I check vdsm.log I find it still uses fence_ilo, not fence_ilo2. I am not sure whether that is the root cause. Detailed log below:

========================
JsonRpc (StompReactor)::DEBUG::2016-03-22 13:39:48,084::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-22 13:39:48,088::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-177708::DEBUG::2016-03-22 13:39:48,091::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
Thread-177707::DEBUG::2016-03-22 13:39:48,151::utils::759::root::(execCmd) FAILED: <err> = 'Unable to connect/login to fencing device\n'; <rc> = 1
Thread-177707::DEBUG::2016-03-22 13:39:48,152::API::1164::vds::(fence) rc 1 inp agent=fence_ilo ipaddr=172.21.151.119 login=isadmin action=status passwd=XXXX ipport=22 ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-177707::DEBUG::2016-03-22 13:39:48,152::API::1235::vds::(fenceNode) rc 1 in agent=fence_ilo ipaddr=172.21.151.119 login=isadmin action=status passwd=XXXX ipport=22 ssl=no out [] err ['Unable to connect/login to fencing device']
Thread-177707::DEBUG::2016-03-22 13:39:48,152::stompReactor::163::yajsonrpc.StompServer::(send) Sending response
JsonRpc (StompReactor)::DEBUG::2016-03-22 13:39:48,154::stompReactor::98::Broker.StompAdapter::(handle_frame) Handling message <StompFrame command='SEND'>
JsonRpcServer::DEBUG::2016-03-22 13:39:48,156::__init__::530::jsonrpc.JsonRpcServer::(serve_requests) Waiting for request
Thread-177709::DEBUG::2016-03-22 13:39:48,159::API::1209::vds::(fenceNode) fenceNode(addr=172.21.151.119,port=,agent=ilo,user=isadmin,passwd=XXXX,action=status,secure=False,options=ipport=22 ssl=no,policy=None)
Thread-177709::DEBUG::2016-03-22 13:39:48,159::utils::739::root::(execCmd) /usr/sbin/fence_ilo (cwd None)
Thread-177709::DEBUG::2016-03-22 13:39:48,552::utils::759::root::(execCmd) FAILED: <err> = 'Unable to connect/login to fencing device\n'; <rc> = 1
Thread-177709::DEBUG::2016-03-22 13:39:48,552::API::1164::vds::(fence) rc 1 inp agent=fence_ilo ipaddr=172.21.151.119 login=isadmin action=status passwd=XXXX ipport=22 ssl=no out [] err ['Unable to connect/login to fencing device']
=====================
Thanks for the reply. It looks as if vdsm is using the wrong credentials then; the question is why it gets the wrong ones. Moving this bug to vdsm for further debugging.
fenceNode(addr=172.21.151.119,port=,agent=ilo,user=isadmin,passwd=XXXX,action=status,secure=False,options=ipport=22 ssl=no,policy=None)

VDSM receives an explicit agent=ilo, nothing about ilo2. Are you sure that you have updated the agent type in Engine? Can you share a screenshot of the "kvm2" power management arguments?
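One quick way to confirm which agent vdsm actually invokes is to pull the fence agent binary out of the execCmd lines in vdsm.log. A sketch, run here against sample lines copied from the log above (vdsm-sample.log is a stand-in for /var/log/vdsm/vdsm.log on a real host):

```shell
# Sample fence lines copied from the vdsm.log excerpt above; on a real
# host, point grep at /var/log/vdsm/vdsm.log instead of this stand-in file.
cat > vdsm-sample.log <<'EOF'
Thread-177709::DEBUG::2016-03-22 13:39:48,159::API::1209::vds::(fenceNode) fenceNode(addr=172.21.151.119,port=,agent=ilo,user=isadmin,passwd=XXXX,action=status,secure=False,options=ipport=22 ssl=no,policy=None)
Thread-177709::DEBUG::2016-03-22 13:39:48,159::utils::739::root::(execCmd) /usr/sbin/fence_ilo (cwd None)
EOF

# Count how often each fence agent binary appears; if the engine were
# really passing ilo2, a fence_ilo2 entry would show up here.
grep -oE '/usr/sbin/fence_[a-z0-9_]+' vdsm-sample.log | sort | uniq -c
```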
We don't use the fence_ilo2/ilo/ilo4 agents directly; we map all of those inside the engine to the fence_ilo agent. I looked at your tests with fence_ilo2. Could you achieve the same result with fence_ilo?

fence_ilo --ip 172.21.151.119 --username=isadmin --password itaq12ws -o status

If so, please take a look at your fence agent configuration in the engine. I noticed from vdsm.log that you specified the ipport=22 and ssl=no options; could you please remove those options from your fence agent configuration and test the agent functionality again?
Hi, any progress with the tests described in comment 9? Were you able to execute them successfully?
Sorry for my late response. It finally works.
Closing as NOTABUG, feel free to reopen if needed.