Bug 1385114 - Deployment fails with Ironic API errors and nodes stuck in wait-call-back when one of the node's MAC addresses is of type InfiniBand
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ga
Target Release: 10.0 (Newton)
Assigned To: Lucas Alvares Gomes
QA Contact: Raviv Bar-Tal
Keywords: Triaged
Depends On:
Blocks:
Reported: 2016-10-14 15:05 EDT by Sai Sindhur Malleni
Modified: 2016-12-14 11:19 EST
CC List: 7 users

See Also:
Fixed In Version: openstack-ironic-6.2.1-3.el7ost
Doc Type: Bug Fix
Doc Text:
To determine which node is being deployed, the deploy ramdisk (IPA) provides the Bare Metal provisioning service with a list of MAC addresses as unique identifiers for that node. In previous releases, the Bare Metal provisioning service expected only the normal MAC address format, namely 6 octets. The GIDs of InfiniBand NICs, however, have 20 octets. As a result, whenever an InfiniBand NIC was present on the node, the deployment failed because the Bare Metal provisioning API could not validate the address list. With this release, the Bare Metal provisioning service ignores addresses that do not conform to the normal MAC address format of 6 octets.
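The filtering behaviour described in the Doc Text can be illustrated with a minimal Python sketch. This is not the actual Ironic patch: the helper names `is_standard_mac` and `filter_lookup_addresses` are hypothetical, and the sketch assumes colon-separated hexadecimal addresses like the ones in this report.

```python
import re

# Hypothetical helpers for illustration only; not taken from the Ironic source.
# A "normal" MAC address here means exactly six colon-separated hex octets.
_MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def is_standard_mac(address):
    """Return True if the address is six colon-separated hex octets."""
    return bool(_MAC_RE.match(address.strip().lower()))


def filter_lookup_addresses(raw):
    """Split a comma-separated address list and drop non-6-octet entries."""
    return [a.strip().lower() for a in raw.split(",") if is_standard_mac(a)]


# First entry is the 20-octet InfiniBand address from this report.
addresses = (
    "80:00:02:48:fe:80:00:00:00:00:00:00:f4:52:14:03:00:54:06:c2,"
    "f4:52:14:54:06:c1,a0:d3:c1:04:44:17"
)
print(filter_lookup_addresses(addresses))
```

With this input the 20-octet entry is dropped and the two 6-octet addresses are kept, so a lookup based on the remaining addresses can still proceed.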
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-12-14 11:19:31 EST
Type: Bug

External Trackers
Tracker                 ID              Priority  Status        Summary                                           Last Updated
OpenStack gerrit        392114          None      None          None                                              2016-11-04 08:30 EDT
Red Hat Product Errata  RHEA-2016:2948  normal    SHIPPED_LIVE  Red Hat OpenStack Platform 10 enhancement update  2016-12-14 14:55:27 EST

Description Sai Sindhur Malleni 2016-10-14 15:05:28 EDT
Description of problem:

When one of the interfaces on the node has an InfiniBand-type MAC address (longer than 6 octets), the Ironic API returns 400 errors such as:

2016-10-14 14:42:03.171 18045 DEBUG wsme.api [req-a257a01c-0896-435a-8416-70a03bf50a56 - - - - -] Client-side error: Invalid input for field/attribute addresses. Value: '80:00:02:48:fe:80:00:00:00:00:00:00:f4:52:14:03:00:54:06:c2,f4:52:14:54:06:c1,a0:d3:c1:04:44:17,a0:d3:c1:04:44:14,a0:d3:c1:04:44:16,a0:d3:c1:04:44:15'. unable to convert to list format_exception /usr/lib/python2.7/site-packages/wsme/api.py:221
2016-10-14 14:42:03.174 18045 INFO eventlet.wsgi.server [req-a257a01c-0896-435a-8416-70a03bf50a56 - - - - -] 192.0.2.9 "GET /v1/lookup?addresses=80%3A00%3A02%3A48%3Afe%3A80%3A00%3A00%3A00%3A00%3A00%3A00%3Af4%3A52%3A14%3A03%3A00%3A54%3A06%3Ac2%2Cf4%3A52%3A14%3A54%3A06%3Ac1%2Ca0%3Ad3%3Ac1%3A04%3A44%3A17%2Ca0%3Ad3%3Ac1%3A04%3A44%3A14%2Ca0%3Ad3%3Ac1%3A04%3A44%3A16%2Ca0%3Ad3%3Ac1%3A04%3A44%3A15 HTTP/1.1" status: 400 len: 657 time: 0.0098739
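To see which entry trips the validation, the request target from the log line above can be decoded and the octets of each address counted. The following is a small diagnostic sketch using only the Python 3 standard library; the variable names are illustrative and it is not part of Ironic or IPA.

```python
from urllib.parse import parse_qs, urlsplit

# Request target copied from the eventlet.wsgi.server log line above.
request_target = (
    "/v1/lookup?addresses="
    "80%3A00%3A02%3A48%3Afe%3A80%3A00%3A00%3A00%3A00%3A00%3A00"
    "%3Af4%3A52%3A14%3A03%3A00%3A54%3A06%3Ac2"
    "%2Cf4%3A52%3A14%3A54%3A06%3Ac1"
    "%2Ca0%3Ad3%3Ac1%3A04%3A44%3A17"
    "%2Ca0%3Ad3%3Ac1%3A04%3A44%3A14"
    "%2Ca0%3Ad3%3Ac1%3A04%3A44%3A16"
    "%2Ca0%3Ad3%3Ac1%3A04%3A44%3A15"
)

# parse_qs percent-decodes the value; the addresses are comma-separated.
query = parse_qs(urlsplit(request_target).query)
addresses = query["addresses"][0].split(",")

# Pair each address with its octet count to spot non-standard entries.
octet_counts = [(addr, len(addr.split(":"))) for addr in addresses]
for addr, count in octet_counts:
    print(count, addr)
```

The first address has 20 octets and the remaining five have 6, which matches the value reported in the wsme.api error.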

Note that introspection succeeds, but when the overcloud is deployed the nodes get stuck in wait-call-back, and Ironic API 400 errors appear on the node consoles and in the undercloud Ironic logs. Eventually the deployment fails with no valid host found errors.

Version-Release number of selected component (if applicable):


How reproducible:
100% when one of the interfaces is InfiniBand

Steps to Reproduce:
1. Install the undercloud
2. Introspect the nodes
3. Deploy the overcloud

Actual results:
Nodes are stuck in wait-call-back and eventually deployment fails with no valid host found errors.

Expected results:
Deployment should succeed

Additional info:
According to Lucas (lucasagomes) on IRC, InfiniBand is currently not supported; he verified this as follows:
http://paste.openstack.org/show/585738/
Also, FWIW, earlier versions of OSP (8 and 9) worked in the same environment.
Comment 1 Dmitry Tantsur 2016-10-17 13:52:56 EDT
Looks like our new ramdisk API has broken it. I wonder if we should validate MACs at all in lookup.
Comment 4 Raviv Bar-Tal 2016-11-17 04:40:25 EST
Hi,
Unfortunately I don't have access to InfiniBand NICs, so I cannot test this.
From the above comments I see that the ramdisk API was broken and has been fixed.
Can you verify and advise whether the new OSPD is working for you and whether the bug can be closed?
Comment 6 Sai Sindhur Malleni 2016-12-06 01:25:37 EST
I can confirm that I'm not seeing this error in RC.
Comment 8 errata-xmlrpc 2016-12-14 11:19:31 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html
