Bug 508722

Summary: fence_ilo will fail against ilo2 device on BL860c with latest ilo2 firmware
Product: Red Hat Enterprise Linux 5
Reporter: Shane Bradley <sbradley>
Component: cman
Assignee: Marek Grac <mgrac>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Docs Contact:
Priority: low
Version: 5.4
CC: clasohm, cluster-maint, cward, djansa, edamato, jkortus, lscalabr, mgrac, tao
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: cman-2.0.115-26.el5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 512941 (view as bug list) Environment:
Last Closed: 2010-03-30 08:41:19 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 512941    
Attachments:
Description                  Flags
Fence agent for HP iLO MP    none

Description Shane Bradley 2009-06-29 16:15:51 UTC
Description of problem:

fence_ilo fails to fence a node whose ilo2 runs the latest firmware.
Customer's blade information:
     Type: Server Blade (Itanium)
     Manufacturer: hp
     Product Name: server BL860c

Customer's ilo2 information:
     Type: Integrity iLO 2
     Firmware Version: 3.10
     BMC Version: 5.35 

With the older firmware (1.26), fencing the ilo2 machine works just fine.

Here is a link to the firmware they are using that fails:
http://h20000.www2.hp.com/bizsupport/TechSupport/SoftwareIndex.jsp?lang=cs&cc=cz&prodNameId=3351092&prodTypeId=3709945&prodSeriesId=3331044&swLang=13&taskId=135&swEnvOID=1060#29217

-------------------------------------------------------------------------------

Here is an example backtrace of the error (with hostname, etc. removed):
$ fence_ilo -a <hostname> -l <login> -p <password> -o reboot

Traceback (most recent call last):
  File "/sbin/fence_ilo", line 102, in ?
    main()
  File "/sbin/fence_ilo", line 73, in main
    conn.log_expect(options, [ "</RIBCL>", "<END_RIBCL/>" ], LOGIN_TIMEOUT)
  File "/usr/lib/fence/fencing.py", line 155, in log_expect
    result = self.expect(pattern, timeout)
  File "/usr/lib/python2.4/site-packages/pexpect.py", line 1311, in expect
    return self.expect_list(compiled_pattern_list, timeout, searchwindowsize)
  File "/usr/lib/python2.4/site-packages/pexpect.py", line 1325, in expect_list
    return self.expect_loop(searcher_re(pattern_list), timeout, searchwindowsize)
  File "/usr/lib/python2.4/site-packages/pexpect.py", line 1396, in expect_loop
    raise EOF (str(e) + '\n' + str(self))
pexpect.EOF: End Of File (EOF) in read_nonblocking(). Exception style platform.
<fencing.fspawn object at 0xb7cb838c>
version: 2.3 ($Revision: 399 $)
command: /usr/lib/fence/telnet_ssl
args: ['/usr/lib/fence/telnet_ssl', '<hostname>', '443']
searcher: searcher_re:
    0: re.compile("</RIBCL>")
    1: re.compile("<END_RIBCL/>")
buffer (last 100 chars): 
before (last 100 chars): ine 62, in main
    read_buff = conn.recv(4096)
OpenSSL.SSL.SysCallError: (-1, 'Unexpected EOF')

after: pexpect.EOF
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 20162
child_fd: 3
closed: False
timeout: 30
delimiter: pexpect.EOF
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1


Version-Release number of selected component (if applicable):
RHEL5U3

How reproducible:
Reproduces every time with the command below on the latest firmware:

Steps to Reproduce:
1. Make sure ilo2 has the latest firmware
2. $ fence_ilo -a <hostname> -l <login> -p <password> -o reboot
  
Actual results:
The error shown in the backtrace above occurs.

Expected results:
The machine should be rebooted

Additional info:
None

Comment 2 Marek Grac 2009-06-29 17:05:44 UTC
According to the HP Integrity iLO 2 MP Operations Guide, Eighth Edition, RIBCL is not supported on this device: the guide does not mention it at all, and RIBCL is the interface we use for connecting to ilo/ilo2. Fortunately, there is a new option (stop -f) in SMASH that offers a real power off (not a graceful one). We will have to create a new agent based on the SMASH interface.
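
As a rough illustration of what a SMASH-based agent would send, here is a minimal sketch that maps a fence action to SMASH command lines. Only "stop -f" comes from the comment above. The "/system1" target, the use of the "start" verb for power on, and the helper name smash_commands are assumptions made for illustration, not the attached agent's actual code.

```python
# Hypothetical helper, not taken from the attached agent.
# Assumption: the managed system is addressed as "/system1".
SMASH_TARGET = "/system1"


def smash_commands(action, target=SMASH_TARGET):
    """Return the list of SMASH command lines for a fence action."""
    power_off = "stop -f %s" % target   # forced (non-graceful) power off
    power_on = "start %s" % target      # assumed power-on verb
    commands = {
        "off": [power_off],
        "on": [power_on],
        "reboot": [power_off, power_on],
    }
    if action not in commands:
        raise ValueError("unsupported action: %r" % (action,))
    return commands[action]


if __name__ == "__main__":
    for line in smash_commands("reboot"):
        print(line)
```

A real agent would also need to confirm the power state between the two reboot commands rather than sending them back to back.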

Comment 3 Marek Grac 2009-06-30 06:44:26 UTC
Created attachment 349910 [details]
Fence agent for HP iLO MP

Fence agent for ilo mp. Currently it can connect only using ssh. Telnet will be supported in the final version too (in fence_login() in "fencing.py" we have to replace \r\n -> \n; this has to be configurable, as it could break other fence agents).

I will be out next week, so telnet should work at the end of next week.
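
A minimal sketch of what a configurable \r\n -> \n replacement could look like. The function name normalize_eol and the convert_crlf switch are hypothetical and do not come from fencing.py. The comment above does not say where in fence_login() the replacement applies, so this only shows the opt-in behaviour that leaves other agents untouched by default.

```python
# Hypothetical sketch; names are illustrative, not fencing.py's API.
def normalize_eol(data, convert_crlf=False):
    """Replace CRLF with LF only when the agent explicitly asks for it."""
    if convert_crlf:
        return data.replace("\r\n", "\n")
    # Default: leave the data unchanged so existing agents keep working.
    return data


if __name__ == "__main__":
    sample = "login:\r\npassword:\r\n"
    print(repr(normalize_eol(sample)))
    print(repr(normalize_eol(sample, convert_crlf=True)))
```

Keeping the switch off by default means only agents that need the conversion have to enable it.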

Comment 4 Marek Grac 2009-07-13 12:09:43 UTC
Only connection through ssh will be supported for this fence agent.

Comment 6 Marek Grac 2009-07-30 10:00:50 UTC
Telnet is also supported, as we have the same problem with the sanbox2 agent.

Comment 7 Marek Grac 2009-07-30 10:06:58 UTC
Forgot to add the agent in the previous commit :(

http://git.fedorahosted.org/git/cluster.git?p=cluster.git;a=commit;h=0c68763dfaed3b2ca3d7f5e107eb813b3b38aee3

Comment 13 Chris Ward 2010-02-11 10:24:52 UTC
~~ Attention Customers and Partners - RHEL 5.5 Beta is now available on RHN ~~

RHEL 5.5 Beta has been released! There should be a fix present in this 
release that addresses your request. Please test and report back results 
here, by March 3rd 2010 (2010-03-03) or sooner.

Upon successful verification of this request, post your results and update 
the Verified field in Bugzilla with the appropriate value.

If you encounter any issues while testing, please describe them and set 
this bug into NEED_INFO. If you encounter new defects or have additional 
patch(es) to request for inclusion, please clone this bug per each request
and escalate through your support representative.

Comment 17 errata-xmlrpc 2010-03-30 08:41:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0266.html