Bug 1563000

Summary: The pxe_ilo driver ignores power requests under certain conditions with HP BL460
Product: Red Hat OpenStack Reporter: coldford <coldford>
Component: python-proliantutils Assignee: RHOS Maint <rhos-maint>
Status: CLOSED ERRATA QA Contact: Shai Revivo <srevivo>
Severity: high Docs Contact:
Priority: high    
Version: 10.0 (Newton) CC: agarwalnisha1980, bfournie, coldford, dvd, hbrock, jslagle, jtrowbri, jzaher, mburns, pmannidi, racedoro, rcernin, rhos-maint, srevivo
Target Milestone: z9 Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-proliantutils-2.2.0-4.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1573149 1573150 1573151 Environment:
Last Closed: 2018-09-17 16:59:16 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1573149, 1573150, 1573151    
Attachments:
Description Flags
ris.py none

Description coldford@redhat.com 2018-04-02 21:26:36 UTC
Description of problem:
Ironic fails to complete the deployment even though disk creation succeeds and the node is shut down.

Version-Release number of selected component (if applicable):
RH OSP 10

How reproducible:
Always


Steps to Reproduce:
1. Import the baremetal nodes with the pxe_ilo driver and the appropriate profile, configure the deploy images, and introspect the nodes.
2. Run openstack overcloud deploy.
3. The controllers finish and reach the provisioning state active, but the compute node goes into deploy_failed.

Actual results:
Compute node deployment fails with the error "iLO failed to change state to power on within 12 sec".

Expected results:
Compute node deployment succeeds.

Additional info:
We increased the power_wait timeout to 75 seconds and still hit the same issue. As observed in the iLO web UI, the power state stayed off for a long time, until the error message appeared. Powering the node on manually through the web UI is the current pseudo-workaround.
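
For illustration, a minimal sketch of the kind of polling loop that produces a "within N sec" timeout like the one above. The function name, its parameters, and the defaults of 6 retries with a 2-second wait (which would give the 12 seconds in the error) are assumptions for this example, not code taken from Ironic or proliantutils.

```python
import time


def wait_for_power_state(get_state, target, retries=6, wait=2.0, sleep=time.sleep):
    """Poll get_state() until it returns target or the retries run out.

    Hypothetical helper: the defaults are assumed values chosen so that
    retries * wait equals the 12 seconds reported in the error message.
    """
    for _ in range(retries):
        if get_state() == target:
            return True
        # Wait before polling again; total wait is at most retries * wait.
        sleep(wait)
    return False
```

A loop like this only observes the power state. If the power-on request itself is never acted on, a longer wait just delays the same failure, which is consistent with the 75-second attempt described above.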

Comment 1 coldford@redhat.com 2018-04-02 21:32:21 UTC
The customer is using the version (2.2.0-3) that was released in the following erratum:

https://access.redhat.com/errata/RHBA-2018:0365

The issue returned after they upgraded the firmware.

A case has also been opened with HP.

Comment 15 PURANDHAR SAIRAM MANNIDI 2018-04-04 05:23:37 UTC
Created attachment 1417025
ris.py

Comment 17 PURANDHAR SAIRAM MANNIDI 2018-04-04 05:43:17 UTC
Created attachment 1417026
Ironic conductor logs

Comment 18 David Vallee Delisle 2018-04-04 16:57:32 UTC
After adding some debug logs, we see that ris.py never enters the retry loop: the code matches the returned model against "Proliant BL" (lowercase "l"), but the model now returned is "ProLiant BL" (capital "L"), so the comparison fails.

I've opened a Launchpad bug with a fix: https://bugs.launchpad.net/proliantutils/+bug/1761243
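
A minimal sketch of the mismatch described above, assuming the check is a plain substring test on the product name. Both function names and the sample product strings are hypothetical; this is not the actual ris.py code or the upstream patch.

```python
def is_blade_case_sensitive(product_name):
    """Case-sensitive check (hypothetical): misses the "ProLiant BL" spelling."""
    return "Proliant BL" in product_name


def is_blade_case_insensitive(product_name):
    """Case-insensitive check (hypothetical): accepts either spelling."""
    return "proliant bl" in product_name.lower()


# Example product names, made up for this illustration.
old_name = "Proliant BL460c Gen9"
new_name = "ProLiant BL460c Gen9"

print(is_blade_case_sensitive(old_name), is_blade_case_sensitive(new_name))
print(is_blade_case_insensitive(old_name), is_blade_case_insensitive(new_name))
```

With the case-sensitive check the second name is not recognized as a blade, so any blade-specific retry logic guarded by that check would be skipped. Normalizing case before comparing is one way to tolerate either spelling.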

Comment 19 Bob Fournier 2018-04-04 17:37:42 UTC
David - nice find!  I assume that the Product Name is a fixed string and can't be changed through the iLO UI?

Comment 20 Nisha 2018-04-04 18:45:38 UTC
Thanks David for the RCA.

Comment 21 Nisha 2018-05-02 17:16:52 UTC
The fix for this issue has been released in proliantutils v2.5.2. The fix URL is https://review.openstack.org/559906.

Comment 25 Bob Fournier 2018-06-13 13:19:23 UTC
Sai - this will be in the OSP-10z9 release and will be tested when the puddle is available.

Comment 32 Alex McLeod 2018-09-03 07:57:23 UTC
Hi there,

If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field.

The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Thanks,
Alex

Comment 35 errata-xmlrpc 2018-09-17 16:59:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2671