Created attachment 943747 [details]
engine.log, vdsm.log

Description of problem:
The ipmilan test fails in the Power Management part of the Edit Host dialog. My IPMI credentials are OK, and ipmitool works from the proxy host as well.

[root@bandelier ~]# ipmitool -I lanplus -H ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com -U root power status
Password:
Chassis Power is on

2014-10-03 16:22:03,806 INFO [org.ovirt.engine.core.bll.FenceExecutor] (ajp-/127.0.0.1:8702-7) Executing <Status> Power Management command, Proxy Host:bandelier.lab.bos.redhat.com, Agent:ipmilan, Target Host:, Management IP:ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com, User:root, Options:
2014-10-03 16:22:11,115 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp-/127.0.0.1:8702-7) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Power Management test failed for Host ibm-p8-rhevm-hv-01.lab.bos.redhat.com.('', 'Authentication type NONE not supported\nError: Unable to establish LAN session\n')

- I created a root user because the Admin Portal doesn't accept an empty username (BZ1149210)

$ ipmitool -I lanplus -H ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com -U root shell
Password:
ipmitool> user list
ID  Name  Callin  Link Auth  IPMI Msg  Channel Priv Limit
1         true    false      true      ADMINISTRATOR
2   root  true    false      true      ADMINISTRATOR

Thread-42::DEBUG::2014-10-03 10:22:04,673::API::1133::vds::(fenceNode) fenceNode(addr=ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com,port=,agent=ipmilan,user=root,passwd=XXXX,action=status,secure=,options=)
Thread-42::DEBUG::2014-10-03 10:22:08,740::API::1159::vds::(fenceNode) rc 1 in agent=fence_ipmilan ipaddr=ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com login=root action=status passwd=XXXX out err ('', 'Authentication type NONE not supported\nError: Unable to establish LAN session\n')
Failed: Unable to obtain correct plug status or plug is not available

# tcpdump -i rhevm -nn -ttt host 10.16.44.49
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on rhevm, link-type EN10MB (Ethernet), capture size 65535 bytes
00:00:00.000000 IP 10.16.42.56.41340 > 10.16.44.49.623: UDP, length 12
00:00:02.002026 IP 10.16.42.56.41340 > 10.16.44.49.623: UDP, length 23
00:00:02.007235 IP 10.16.42.56.41340 > 10.16.44.49.623: UDP, length 23
00:00:00.000684 IP 10.16.44.49.623 > 10.16.42.56.41340: UDP, length 31
00:00:00.366062 IP 10.16.42.56.50623 > 10.16.44.49.623: UDP, length 12
00:00:02.002029 IP 10.16.42.56.50623 > 10.16.44.49.623: UDP, length 23
00:00:02.007227 IP 10.16.42.56.50623 > 10.16.44.49.623: UDP, length 23
00:00:00.000585 IP 10.16.44.49.623 > 10.16.42.56.50623: UDP, length 31
00:00:00.557610 ARP, Request who-has 10.16.42.56 tell 10.16.44.49, length 46
00:00:00.000005 ARP, Reply 10.16.42.56 is-at 6c:ae:8b:6a:de:e0, length 28

I tried my IPMI setup again on a Dell R210 and it worked OK; that exchange was 36 packets. It seems to me like an IPMI problem on the vdsm side.

Version-Release number of selected component (if applicable):
vdsm-4.14.17-1.mrkev.ppc64

How reproducible:
100%

Steps to Reproduce:
1. Two ppc64 hosts, one with IPMI
2. Go to the power management part of the host configuration, fill in the credentials and click the [Test] button
3.

Actual results:
Test Failed, ('', 'Authentication type NONE not supported\nError: Unable to establish LAN session\n') Failed: Unable to obtain correct plug status or plug is not available

Expected results:
Should work (as ipmitool from the proxy host works OK). Maybe some timeout?

Additional info:
FYI, the /usr/share/vdsm/API.py file in vdsm-4.14.17-1.mrkev.ppc64 is the same as in vdsm-4.14.17-1.el6ev.x86_64.
Well, I doubt it makes sense to allow closing the dialog if [Test] didn't work. What is the benefit of that?

Events:
2014-Oct-03, 18:21 Failed to verify Host bandelier.lab.bos.redhat.com power management.
2014-Oct-03, 18:20 Host ibm-p8-rhevm-hv-01.lab.bos.redhat.com from cluster ppc64 was chosen as a proxy to execute Status command on Host bandelier.lab.bos.redhat.com.
2014-Oct-03, 18:20 State was set to Up for host bandelier.lab.bos.redhat.com.
2014-Oct-03, 18:20 Failed to restart Host bandelier.lab.bos.redhat.com, (User: admin).
2014-Oct-03, 18:20 Failed to stop Host bandelier.lab.bos.redhat.com, (User: admin).
2014-Oct-03, 18:20 Failed to power fence host bandelier.lab.bos.redhat.com. Please check the host status and it's power management settings, and then manually reboot it and click "Confirm Host Has Been Rebooted"
2014-Oct-03, 18:20 Failed to verify Host bandelier.lab.bos.redhat.com Restart status, Please Restart Host bandelier.lab.bos.redhat.com manually.
2014-Oct-03, 18:20 Host ibm-p8-rhevm-hv-01.lab.bos.redhat.com from cluster ppc64 was chosen as a proxy to execute Status command on Host bandelier.lab.bos.redhat.com.
2014-Oct-03, 18:20 Host ibm-p8-rhevm-hv-01.lab.bos.redhat.com from cluster ppc64 was chosen as a proxy to execute Stop command on Host bandelier.lab.bos.redhat.com.
2014-Oct-03, 18:20 Host ibm-p8-rhevm-hv-01.lab.bos.redhat.com from cluster ppc64 was chosen as a proxy to execute Status command on Host bandelier.lab.bos.redhat.com.
Looking at your vdsm log, I see that the options field is empty. Your ipmitool command used the -I option, which means it used the lanplus interface.

Please retry after adding 'lanplus' to your options field in the PM tab of the Edit Host dialog.

If this works and all PPC hosts with IPMI PM need lanplus, we can consider handling this implicitly for 3.6.
So this IBM POWER8 IPMI interface doesn't respond to a query over 'lan', as my Dell R210 does. It does respond, as stated above, to a query with 'lanplus' set as the interface.

I don't know what the valid syntax for options in the PM dialog is. Neither plain 'lanplus' nor 'interface=lanplus' was successful.

Anyway, we know that the problem is 'lan' vs 'lanplus' for the interface.
Created attachment 944161 [details] vdsm.log
Marek,

This seems to be a bug in fence-agents-ipmilan for PPC. I succeeded in running the same command using fence-agents-3.1.5-48.el6.x86_64 on a non-PPC RHEL 6.6 machine.

Please advise.
Jiri,

Please try again with 'lanplus,cipher=1' in the options field.
Thread-659::DEBUG::2014-10-06 10:04:58,081::API::1159::vds::(fenceNode) rc 1 in agent=fence_ipmilan ipaddr=ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com login=root action=status passwd=XXXX cipher=1 lanplus out err ('', 'Authentication type NONE not supported\nError: Unable to establish LAN session\n')
Failed: Unable to obtain correct plug status or plug is not available

# ipmitool -C 1 -I lanplus -H ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com -U root user list
Password:
ID  Name  Callin  Link Auth  IPMI Msg  Channel Priv Limit
1         true    false      true      ADMINISTRATOR
2   root  true    false      true      ADMINISTRATOR
Eli, how is it a Vdsm issue? Engine does not provide any port, nor the options cipher=1, privlvl=administrator:

Thread-201::DEBUG::2014-10-05 10:00:31,976::API::1133::vds::(fenceNode) fenceNode(addr=ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com,port=,agent=ipmilan,user=root,passwd=XXXX,action=status,secure=,options=lanplus)

so the fence agent complains that they are missing.

Dummy-199::DEBUG::2014-10-05 10:00:35,362::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.0149388 s, 68.5 MB/s\n'; <rc> = 0
Thread-201::DEBUG::2014-10-05 10:00:36,049::API::1159::vds::(fenceNode) rc 1 in agent=fence_ipmilan ipaddr=ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com login=root action=status passwd=XXXX lanplus out err ('', 'Authentication type NONE not supported\nError: Unable to establish LAN session\n')
Failed: Unable to obtain correct plug status or plug is not available

Jiri, please retry with the special options lanplus=1 cipher=1 privlvl=administrator, just as suggested by Marek.
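For readers comparing the working ipmitool command line with the failing fence agent call, the following is a minimal sketch of how the ipmitool flags seen in this report line up with the fence_ipmilan options suggested above. The function name is hypothetical and the mapping covers only the three flags discussed here (-I lanplus, -C, -L); it is an illustration, not code from Engine or vdsm.

```python
def ipmitool_args_to_fence_options(args):
    """Map a small subset of ipmitool flags to fence_ipmilan key/value options.

    Hypothetical helper for illustration only; flags other than
    -I lanplus, -C and -L are ignored.
    """
    options = {}
    it = iter(args)
    for arg in it:
        if arg == "-I":
            # "-I lanplus" selects the IPMI v2.0 interface.
            if next(it, None) == "lanplus":
                options["lanplus"] = "1"
        elif arg == "-C":
            # "-C <n>" selects the cipher suite.
            options["cipher"] = next(it)
        elif arg == "-L":
            # "-L <level>" forces the session privilege level.
            options["privlvl"] = next(it).lower()
    return options


cmdline = ["-C", "1", "-I", "lanplus", "-H", "bmc.example.com", "-U", "root"]
print(ipmitool_args_to_fence_options(cmdline))
```

Note that the result carries lanplus as a key with a value (lanplus=1), which is the form suggested in the comment above, rather than the bare word 'lanplus' seen in the failing log lines.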
This code works on ppc and returns the right value:

import subprocess

script = '/usr/sbin/fence_ipmilan'
p = subprocess.Popen([script], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, close_fds=True)
# Options go to the agent on stdin, one key=value pair per line.
parm = 'agent=fence_ipmilan\nipaddr=ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com\nlogin=root\naction=status\npasswd=redhat\nlanplus=1\ncipher=1\nprivlvl=administrator'
print parm
p.stdin.write(parm)
p.stdin.close()
print p.stdout.read()
print p.stderr.read()

So it's not the fence_ipmilan script. Now I just need to figure out what's different in vdsm, because this script does the same thing vdsm does with Popen.
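The same experiment can be sketched in Python 3 with payload construction separated from the agent call, so the exact stdin text can be inspected without touching a BMC. The function names and the example host are assumptions made for illustration; this is not vdsm's fenceNode implementation.

```python
import subprocess


def build_fence_input(agent, addr, login, passwd, action, options=()):
    """Build the newline-separated key=value text a fence agent reads on stdin.

    `options` is an iterable of ready-made "key=value" strings, for example
    ["lanplus=1", "cipher=1"]. Hypothetical helper, not vdsm code.
    """
    lines = [
        "agent=fence_%s" % agent,
        "ipaddr=%s" % addr,
        "login=%s" % login,
        "action=%s" % action,
        "passwd=%s" % passwd,
    ]
    lines.extend(options)
    return "\n".join(lines) + "\n"


def run_fence_agent(payload, script="/usr/sbin/fence_ipmilan"):
    """Feed the payload to the agent and return (rc, stdout, stderr)."""
    proc = subprocess.run([script], input=payload,
                          capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


payload = build_fence_input("ipmilan", "bmc.example.com", "root", "secret",
                            "status", ["lanplus=1", "cipher=1"])
print(payload)
```

Calling run_fence_agent(payload) on a host that has the agent installed would then return the agent's exit code and output for comparison with the "rc 1" lines in vdsm.log.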
Dan, running

vdsClient -s 0 fenceNode ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com "" ipmilan root redhat status false "lanplus=1\ncipher=1\nprivlvl=administrator"

fails. We do add the options, the same as on x86. Maybe the log you looked at has something different that is wrong, but passing the options as we do here ^ returns that failure.
Yaniv, the shell would not translate your \n; they are passed verbatim to the fence script. The following works just fine:

# vdsClient -s 0 fenceNode ibm-p8-rhevm-hv-01-fsp.lab.bos.redhat.com "" ipmilan root redhat status false "lanplus=1 cipher=1 privlvl=administrator"

But that's beside the point. cipher and privlvl were never passed from Engine (I suspect they were never defined in the host fencing parameters). I'm asking Jiri to set them properly and try again.
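The difference between a literal backslash-n and a real newline can be shown with a short sketch. The normalize_options helper below is hypothetical: it accepts the separators that appear in this report (spaces, commas, newlines, and the two-character sequence backslash plus n) and emits one option per line. It is not what vdsClient or vdsm do; it only illustrates why the quoted "\n" form reached the agent as a single odd token.

```python
import re


def normalize_options(raw):
    """Return the options as real newline-separated key=value lines.

    Splits on commas, whitespace, or a literal backslash followed by 'n'.
    Hypothetical helper for illustration only.
    """
    tokens = re.split(r"(?:\\n|[,\s])+", raw.strip())
    return "\n".join(t for t in tokens if t)


# What a POSIX shell hands over for "lanplus=1\ncipher=1": the backslash and
# the 'n' stay two separate characters, so no line break is present.
from_shell = r"lanplus=1\ncipher=1\nprivlvl=administrator"
print(len(r"\n"), len("\n"))
print("\n" in from_shell)
print(normalize_options(from_shell))
```

Whether a tool should silently repair such input is a design choice; rejecting it with a clear error may be the safer behaviour for a fencing path.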
Setting the ipmilan options in the PM part of the Edit Host dialog to

lanplus=1,cipher=1

makes the test _pass successfully_.

Please adjust the severity as you see fit; I don't know whether you would document this option somewhere or make it the default.
It works even with the "old" powerkvm fence agents (restored from backup - 4.0.6-1.pkvm2_1.2). So the issue is either a documentation problem or a matter of tuning the defaults.
I can confirm that with 'lanplus=1,cipher=1' option it works even from ibm-p8-rhevm-hv-01.lab.bos.redhat.com which had 4.0.6 fence clients.
I think we should have this as a release note, as different options might fit different environments, so it shouldn't be the default. Scott?
(In reply to Oved Ourfali from comment #36)
> I think we should have this as a release note, as different options might
> fit different environments, so it shouldn't be the default.
>
> Scott?

It seems the default doesn't work at all. I also assume it's not a light lift based on current timelines to have a special default for Power Hosts.

If we can get a default set for Power Hosts that works OOTB, let's try to do so, otherwise, we can flag for a release note and target a z-stream long-term fix.
(In reply to Scott Herold from comment #37)
> (In reply to Oved Ourfali from comment #36)
> > I think we should have this as a release note, as different options might
> > fit different environments, so it shouldn't be the default.
> >
> > Scott?
>
> It seems the default doesn't work at all. I also assume it's not a light
> lift based on current timelines to have a special default for Power Hosts.
>
> If we can get a default set for Power Hosts that works OOTB, let's try to do
> so, otherwise, we can flag for a release note and target a z-stream
> long-term fix.

I don't see it happening in this time frame. In addition, I'm not sure about the implications of making that the default. Perhaps in other configurations, even for PPC, it wouldn't work.
I think we should flag this for a release note and target a z-stream long-term fix once we understand the implications and get to the right fix.
ok, vt9
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0158.html