| Summary: | No valid interfaces found during introspection | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Dmitry Tantsur <dtantsur> | |
| Component: | openstack-ironic-python-agent | Assignee: | Dmitry Tantsur <dtantsur> | |
| Status: | CLOSED ERRATA | QA Contact: | Raviv Bar-Tal <rbartal> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 8.0 (Liberty) | CC: | chris.brown, cpaquin, dprince, ibravo, johfulto, mburns, michele, rscarazz, sasha, slinaber, tdunnon | |
| Target Milestone: | async | |||
| Target Release: | 8.0 (Liberty) | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | openstack-ironic-python-agent-1.1.0-9.el7ost | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | 1322892 | |||
| : | 1346022 | Environment: | |
| Last Closed: | 2016-06-15 12:39:14 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | 1322892 | |||
| Bug Blocks: | 1346022 | |||
|
Description
Dmitry Tantsur
2016-04-14 15:14:08 UTC
Note: backported to Mitaka/OSPd9, still an issue for OSPd8.

I'm seeing this in OSP8 GA. The upstream fix [1] makes inspection wait 60 seconds for *all* NICs to get their IP addresses. Looks like a custom image with this change should be a workaround. A crude workaround is to delete and re-introspect in a loop until all of the nodes return no errors [2]. As it's a race condition, this crude approach worked for me on the third iteration, with six nodes introspecting with ironic-python-agent-8.0-20160415.1.el7ost.tar.

[1] https://git.openstack.org/cgit/openstack/ironic-python-agent/commit/?id=3fba1ee8db0aa0b1519ef2135e602268488570f4

[2]

    while true; do
        for uuid in $(ironic node-list | awk {'print $2'} | egrep -v "UUID|^$"); do
            ironic node-set-power-state $uuid off
            ironic node-delete $uuid
        done
        openstack baremetal import --json ~/instackenv.json
        openstack baremetal configure boot
        openstack baremetal introspection bulk start
        intro_done=$(openstack baremetal introspection bulk status | awk {'print $6'} | egrep -v "^$|\||None")
        if [[ -z "$intro_done" ]]; then
            break
        fi
    done

Since I've got the exact same problem, I generated the ramdisk logs while the action was failing; you can download them from here [1]. The output of the introspection was the one mentioned above:
...
...
Introspection completed with errors:
94f12e03-c04e-474f-95ca-6c323e9aa6b0: Preprocessing hook validate_interfaces: No suitable interfaces found in {u'eth3': {'ip': None, 'mac': u'b8:ca:3a:66:f4:42'}, u'eth2': {'ip': None, 'mac': u'b8:ca:3a:66:f4:45'}, u'eth1': {'ip': None, 'mac': u'b8:ca:3a:66:f4:40'}, u'eth0': {'ip': None, 'mac': u'b8:ca:3a:66:f4:44'}}
065dfc49-4831-4037-9aee-4b4b61765b34: Preprocessing hook validate_interfaces: No suitable interfaces found in {u'eth3': {'ip': None, 'mac': u'b8:ca:3a:66:d6:da'}, u'eth2': {'ip': None, 'mac': u'b8:ca:3a:66:d6:dd'}, u'eth1': {'ip': None, 'mac': u'b8:ca:3a:66:d6:d8'}, u'eth0': {'ip': None, 'mac': u'b8:ca:3a:66:d6:dc'}}
41135c0b-2584-479c-9b10-226f94d4c41b: Preprocessing hook validate_interfaces: No suitable interfaces found in {u'eth3': {'ip': None, 'mac': u'b8:ca:3a:66:e3:82'}, u'eth2': {'ip': None, 'mac': u'b8:ca:3a:66:e3:85'}, u'eth1': {'ip': None, 'mac': u'b8:ca:3a:66:e3:80'}, u'eth0': {'ip': None, 'mac': u'b8:ca:3a:66:e3:84'}}
fce0d063-ed57-4f48-beb1-a10b0af3631d: Preprocessing hook validate_interfaces: No suitable interfaces found in {u'eth3': {'ip': None, 'mac': u'b8:ca:3a:66:ea:5a'}, u'eth2': {'ip': None, 'mac': u'b8:ca:3a:66:ea:58'}, u'eth1': {'ip': None, 'mac': u'b8:ca:3a:66:ea:5d'}, u'eth0': {'ip': None, 'mac': u'b8:ca:3a:66:ea:5c'}}
37063fce-a18b-42a9-9a85-5f67d2fe4ed5: Preprocessing hook validate_interfaces: No suitable interfaces found in {u'eth3': {'ip': None, 'mac': u'b8:ca:3a:66:d7:02'}, u'eth2': {'ip': None, 'mac': u'b8:ca:3a:66:d7:05'}, u'eth1': {'ip': None, 'mac': u'b8:ca:3a:66:d7:00'}, u'eth0': {'ip': None, 'mac': u'b8:ca:3a:66:d7:04'}}
[1] http://file.rdu.redhat.com/~rscarazz/BZ1327255/
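
The error lines above embed each node's interface map as a Python dict literal. As a minimal sketch, assuming the message format quoted in this report, the map can be pulled out of a line to confirm that no interface reported an IP address. The helper name `interfaces_without_ip` and the `MARKER` constant are hypothetical and are not part of ironic-inspector.

```python
import ast

# Text that precedes the dict literal in the error lines quoted in this report.
MARKER = "No suitable interfaces found in "


def interfaces_without_ip(message):
    """Return the sorted names of interfaces whose 'ip' is None.

    Assumes the line ends with a Python dict literal right after MARKER.
    """
    _, sep, literal = message.partition(MARKER)
    if not sep:
        raise ValueError("not a validate_interfaces error line")
    # literal_eval only evaluates literals, so it is safe to use on log text;
    # Python 3 accepts the u'' string prefixes.
    interfaces = ast.literal_eval(literal.strip())
    return sorted(
        name for name, info in interfaces.items() if info.get("ip") is None
    )


line = (
    "41135c0b-2584-479c-9b10-226f94d4c41b: Preprocessing hook "
    "validate_interfaces: No suitable interfaces found in "
    "{u'eth3': {'ip': None, 'mac': u'b8:ca:3a:66:e3:82'}, "
    "u'eth2': {'ip': None, 'mac': u'b8:ca:3a:66:e3:85'}, "
    "u'eth1': {'ip': None, 'mac': u'b8:ca:3a:66:e3:80'}, "
    "u'eth0': {'ip': None, 'mac': u'b8:ca:3a:66:e3:84'}}"
)
print(interfaces_without_ip(line))
```

For the sample line, all four interfaces come back, which is consistent with the ramdisk reporting its data before any NIC had obtained an address.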
Raoul, weird, it looks like your images are old. Could you please paste the output of the following:

    curl -s -H "X-Auth-Token: $(openstack token issue -f value -c id)" http://127.0.0.1:5050/v1/introspection/<UUID>/data | jq '.inventory.interfaces'

where <UUID> is any node with failed introspection?

Oh, one fix: the command should be run for a node with *successful* introspection, not failed. Otherwise the data won't be there.

Yes, in fact running the command on an unsuccessful node gives just "null"; on a successful one this is the output:

    [stack@macb8ca3a66dcd8 ~]$ curl -s -H "X-Auth-Token: $(openstack token issue -f value -c id)" http://127.0.0.1:5050/v1/introspection/41135c0b-2584-479c-9b10-226f94d4c41b/data | jq '.inventory.interfaces'
    [
      {
        "mac_address": "b8:ca:3a:66:e3:80",
        "ipv4_address": "10.1.241.7",
        "switch_chassis_descr": null,
        "switch_port_descr": null,
        "has_carrier": true,
        "name": "eth1"
      },
      {
        "mac_address": "b8:ca:3a:66:e3:84",
        "ipv4_address": null,
        "switch_chassis_descr": null,
        "switch_port_descr": null,
        "has_carrier": false,
        "name": "eth0"
      },
      {
        "mac_address": "b8:ca:3a:66:e3:85",
        "ipv4_address": null,
        "switch_chassis_descr": null,
        "switch_port_descr": null,
        "has_carrier": false,
        "name": "eth2"
      },
      {
        "mac_address": "b8:ca:3a:66:e3:82",
        "ipv4_address": "192.0.2.103",
        "switch_chassis_descr": null,
        "switch_port_descr": null,
        "has_carrier": true,
        "name": "eth3"
      }
    ]

And one more thing: please grab the inspector logs for the relevant time (sudo journalctl -u openstack-ironic-inspector).

Done, they are here: http://file.rdu.redhat.com/~rscarazz/BZ1327255/20160506_openstack-ironic-inspector.logs

Thanks, that seems to explain the problem. We rely on /sys/class/net/XXX/carrier to detect whether we have to wait for an IP address. In your case carrier=0 for all interfaces, even though that's obviously nonsense. We have to stop relying on it.

Hi Dmitry, I recently patched os-net-config to use operstate instead. Not sure if it's worth backporting that patch.
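
To illustrate the carrier discussion above, here is a rough sketch (not the actual ironic-python-agent code) of gating the wait for an IP address on `/sys/class/net/<name>/carrier`, alongside the variant that waits regardless of carrier. The function names, the `get_ip` callback, and the `sysfs_root` and `should_wait` parameters are assumptions made for illustration; the 60-second default mirrors the timeout mentioned for the upstream fix.

```python
import os
import time


def has_carrier(name, sysfs_root="/sys/class/net"):
    """Read the kernel's carrier flag; treat unreadable files as no carrier."""
    try:
        with open(os.path.join(sysfs_root, name, "carrier")) as f:
            return f.read().strip() == "1"
    except OSError:
        # The read itself can fail, for example while the interface is down.
        return False


def wait_for_ips(names, get_ip, timeout=60.0, interval=1.0, should_wait=None):
    """Poll get_ip(name) until every waited-on interface has an address.

    should_wait(name) selects the interfaces worth waiting for. Passing
    None waits for all of them, which is slower but does not depend on
    the carrier flag being trustworthy.
    """
    pending = [n for n in names if should_wait is None or should_wait(n)]
    deadline = time.monotonic() + timeout
    while pending and time.monotonic() < deadline:
        pending = [n for n in pending if get_ip(n) is None]
        if pending:
            time.sleep(interval)
    # Report whatever each interface has by now (None if still unassigned).
    return {n: get_ip(n) for n in names}
```

With `should_wait=has_carrier` and carrier reading 0 on every NIC, the call returns immediately with no addresses, which matches the failure seen in this report. With `should_wait=None` it keeps polling until every interface has an address or the timeout expires.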
I don't think operstate is any more reliable. It's also not strictly what I'm looking for: I wanted to optimize the case when the cable is not there at all, but I'll just drop the optimization in favor of a slower but more reliable code path.

Hi Chris,
Can you advise if the patching of os-net-config to use operstate worked and solved the problem?
Thanks
Raviv

*** Bug 1337659 has been marked as a duplicate of this bug. ***

(In reply to Raviv Bar-Tal from comment #13)
> Hi Chris,
> Can you advise if the patching of os-net-config to use operstate worked and
> solved the problem?

Not tried it in this scenario. My primary reason is that Lenovo and IBM hardware comes with USB-to-Ethernet adapters for out-of-band management, which current os-net-config evaluates as a valid interface and attempts to provision if you use the nic1, nic2, etc. naming scheme for network interfaces in the overcloud nic-config YAML files.

Hello, this is still not working despite updating to current-ci and rebuilding the images.

Hi! If it's about OSPd, please provide your images version. If it's about RDO, please move it to https://bugzilla.redhat.com/show_bug.cgi?id=1322892, and provide the version of IPA available in your repositories and the ramdisk logs as explained in https://bugzilla.redhat.com/show_bug.cgi?id=1322892#c1. This bug is about OSPd specifically.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2016:1229
|