Bug 1185823

Summary: When registering bare metals they stay in "discovering" status forever
Product: Red Hat OpenStack
Component: openstack-tripleo
Version: Director
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Udi Kalifon <ukalifon>
Assignee: James Slagle <jslagle>
QA Contact: Udi Kalifon <ukalifon>
CC: calfonso, dsneddon, mburns, rhel-osp-director-maint, yeylon
Target Milestone: ga
Target Release: Director
Keywords: TestOnly, Triaged, ZStream
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-08-05 13:50:09 UTC
Attachments:
Packet capture from discovery registration on bare metal (flags: none)

Description Udi Kalifon 2015-01-26 11:29:05 UTC
Description of problem:
When registering a new node with the IPMI driver, supplying the management IP, user, and password, and choosing "Discover missing attributes", the node stays stuck in "discovering" status forever.


Version-Release number of selected component (if applicable):
openstack-tripleo-0.0.5-2c3fb309727671130a32b4c19de48ec22c8530aa1.el7ost.noarch
openstack-tripleo-heat-templates-0.7.9-10.el7ost.noarch
openstack-tripleo-image-elements-0.8.10-19.el7ost.noarch
python-tuskarclient-0.1.15-3.el7ost.noarch
openstack-tuskar-0.4.15-4.el7ost.noarch
openstack-tuskar-ui-0.2.0-10.el7ost.noarch
openstack-tuskar-ui-extras-0.0.2-2.el7ost.noarch
openstack-ironic-api-2014.2-3.el7ost.noarch
openstack-ironic-conductor-2014.2-3.el7ost.noarch
openstack-ironic-common-2014.2-3.el7ost.noarch
openstack-ironic-discoverd-0.2.5-1.el7ost.noarch
python-ironicclient-0.3.1-1.el7ost.noarch


How reproducible:
100%


Steps to Reproduce:
1. From the "Nodes" screen click on "Register Nodes"
2. Provide the details of a real machine 
3. Choose the "Discover missing attributes" option 
4. Click the Register button


Actual results:
The power state of the machine is discovered correctly, and you can even turn the machine off, but the discovery process never ends.

Comment 2 James Slagle 2015-01-28 21:21:57 UTC
We haven't done significant bare metal testing for the TripleO Tech Preview. I'm not sure whether bare metal + discovery has been tried before, given time and resource constraints. I'm lowering the priority on this one to indicate that bare metal is not a high priority right now.

Comment 4 Dan Sneddon 2015-02-13 20:42:51 UTC
I am running into this bug on bare metal, and I decided to do a little bit of troubleshooting. I ran a tcpdump on the undercloud host and I see:

Server<-client
<- ARP request
-> ARP response
<- ICMP Echo Request
-> ICMP Echo Reply
<- TCP SYN port 5050
-> TCP SYN/ACK
<- TCP ACK
<- HTTP port 5050 POST /v1/continue HTTP/1.1
-> HTTP 202 accepted

So even though the discoverd HTTP server accepts the JSON sent by the discovery client and answers with 202, the node's state never changes.
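To illustrate why a 202 in the trace above does not by itself prove the node was processed, here is a minimal sketch using only the Python standard library. It is not ironic-discoverd's code: the handler, the `NODE_STATE` table, the node name, and the MAC address are all hypothetical. Only the port-independent path `/v1/continue` and the 202 status come from the capture. The stand-in server acknowledges the POST and then deliberately does nothing, so the client sees success while the recorded state stays "discovering".

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory node table; NOT the real discovery service's data model.
NODE_STATE = {"node-1": "discovering"}


class ContinueHandler(BaseHTTPRequestHandler):
    """Stand-in endpoint that accepts the client's JSON and replies 202."""

    def do_POST(self):
        if self.path != "/v1/continue":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        # Parse the body to show it is well-formed JSON, then drop it.
        json.loads(self.rfile.read(length) or b"{}")
        # 202 Accepted only acknowledges receipt. Any follow-up processing
        # would happen later; this sketch never performs it.
        self.send_response(202)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def post_continue(port, data):
    """POST JSON to the local stand-in server and return the HTTP status."""
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}/v1/continue",
        data=json.dumps(data).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def demo():
    # Port 0 lets the OS pick a free port, so the sketch does not need 5050.
    server = HTTPServer(("127.0.0.1", 0), ContinueHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        status = post_continue(port, {"macs": ["52:54:00:12:34:56"]})
    finally:
        server.shutdown()
        server.server_close()
    return status, NODE_STATE["node-1"]


if __name__ == "__main__":
    print(demo())
```

In other words, the capture narrows the problem to whatever happens after the request is accepted, not to the network path between the discovery image and the undercloud.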

Comment 5 Dan Sneddon 2015-02-13 20:45:38 UTC
Created attachment 991594
Packet capture from discovery registration on bare metal

This is a packet capture from the undercloud host that shows the discovery image connecting to and registering with the discoverd daemon. The HTTP response is 202 Accepted, but the host never gets properly registered.

Comment 8 errata-xmlrpc 2015-08-05 13:50:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549