Bug 1211352 - discovery hangs
Summary: discovery hangs
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RDO
Classification: Community
Component: rdo-manager
Version: trunk
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: Kilo
Assignee: John Trowbridge
QA Contact: yeylon@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-04-13 17:24 UTC by wes hayutin
Modified: 2016-04-18 06:49 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-08-25 14:53:31 UTC
Embargoed:


Attachments
list of rpms installed on the undercloud (33.05 KB, text/plain)
2015-04-13 17:24 UTC, wes hayutin
failed VM during node discovery (16.57 KB, image/png)
2015-04-14 11:28 UTC, Jan Provaznik

Description wes hayutin 2015-04-13 17:24:03 UTC
Created attachment 1014015
list of rpms installed on the undercloud

Description of problem:

Ironic node discovery hangs.

[stack@localhost ~]$ instack-ironic-deployment  --discover-nodes
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient.
  'python-keystoneclient.', DeprecationWarning)
Preparing for deployment...
  Discovering nodes.
    Sending node ID 767e7d51-a263-4242-8699-c877fa7a2ef5 to discoverd for discovery ... DONE.
    Sending node ID 05c306c1-4270-40dc-8f8b-878306065aa3 to discoverd for discovery ... DONE.
   Polling discoverd for discovery results ... 
       Result for node 767e7d51-a263-4242-8699-c877fa7a2ef5 is ... ^C
[stack@localhost ~]$

Comment 1 jliberma@redhat.com 2015-04-13 17:35:23 UTC
I have the same issue. If I go to another terminal and run instack-ironic-deployment --show-profile, it looks as though everything worked successfully:

[root@rhos0 jliberma]# instack-ironic-deployment --show-profile
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient.
  'python-keystoneclient.', DeprecationWarning)
Preparing for deployment...
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient.
  'python-keystoneclient.', DeprecationWarning)
  Querying assigned profiles ... 

    2933b0c6-9fd4-4db9-97ec-98ccbfaf298c
      "boot_option:local"

    fa3a75d7-af88-4bd9-a56e-b5b3dacbc90c
      "boot_option:local"

    6dc53380-56b8-4246-9df2-1d3cc7453244
      "boot_option:local"

    4e7129b3-404e-43a2-b4b0-48a3ba268917
      "boot_option:local"


  DONE.

Prepared.

Comment 2 John Trowbridge 2015-04-13 18:35:41 UTC
@jliberma,
Actually, in that "show-profile" output we do not have any matched profiles. We should see "profile:compute, boot_option:local" on 3 of the nodes and "profile:control, boot_option:local" on 1.

This was a packaging issue with the python-hardware library in our midstream repo and has been fixed.
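
The check described above (every node should carry a "profile:" capability next to "boot_option:local") can be sketched in Python. This is only an illustration under stated assumptions: it assumes the --show-profile output keeps the layout pasted in comment 1 (a node UUID on one line, then a quoted capabilities string), and the names unmatched_nodes and SAMPLE are hypothetical, not part of instack.

```python
import re

# A node ID line in the --show-profile output is a bare UUID.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def unmatched_nodes(show_profile_output):
    """Return node IDs whose capabilities contain no 'profile:' entry."""
    unmatched = []
    pending = None  # node ID still waiting for its capabilities line
    for raw in show_profile_output.splitlines():
        line = raw.strip()
        if UUID_RE.match(line):
            # Assumption: a node printed without any capabilities line
            # also counts as unmatched.
            if pending is not None:
                unmatched.append(pending)
            pending = line
        elif pending is not None and line.startswith('"'):
            caps = [c.strip() for c in line.strip('"').split(",")]
            if not any(c.startswith("profile:") for c in caps):
                unmatched.append(pending)
            pending = None
    if pending is not None:
        unmatched.append(pending)
    return unmatched


# Hypothetical sample: first node as seen in comment 1, second node as
# it should look once a profile has been matched.
SAMPLE = """\
  Querying assigned profiles ...

    2933b0c6-9fd4-4db9-97ec-98ccbfaf298c
      "boot_option:local"

    fa3a75d7-af88-4bd9-a56e-b5b3dacbc90c
      "profile:compute, boot_option:local"
"""

print(unmatched_nodes(SAMPLE))
```

Run against the output in comment 1, a check like this would list all four nodes, which is the symptom described here.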

Comment 4 jliberma@redhat.com 2015-04-13 18:48:14 UTC
Thanks, @trown.

I will give that a whirl.

Comment 5 jliberma@redhat.com 2015-04-13 19:22:26 UTC
@trown, can you post instructions for rebuilding the ramdisk so I can test?

Comment 6 jliberma@redhat.com 2015-04-14 03:00:56 UTC
1. Tried these steps:

 rm -rf discovery-ramdisk*
 instack-build-images discovery-ramdisk
 instack-prepare-for-overcloud

  Discovery stalled in the same place.


2. Rebuilt the server from scratch and tried again.

  Still stalled in the same place:

[root@rhos0 ~]# instack-ironic-deployment --discover-nodes
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient.
  'python-keystoneclient.', DeprecationWarning)

Preparing for deployment...
  Discovering nodes.
    Sending node ID c681b512-a904-415d-a146-3d24633fa4f5 to discoverd for discovery ... DONE.
    Sending node ID 6c54837b-99b0-48f4-947e-78320657cb18 to discoverd for discovery ... DONE.
    Sending node ID 04e6f2c7-9af5-49e6-ad2c-482e014e8c99 to discoverd for discovery ... DONE.
    Sending node ID 43ddbead-13bc-4bb7-a73e-ef8b4a6f9144 to discoverd for discovery ... DONE.
   Polling discoverd for discovery results ... 
       Result for node c681b512-a904-415d-a146-3d24633fa4f5 is ...

3. Here are my repos:
delorean                                        delorean-openstack-neutron-483de6313fab5913f9e68eb24afe65c36bd9b623          114+150
delorean-rdo-management                         delorean-rdo-management-openstack-instack-undercloud-9ebb1cd62c8871153e89eb    83+25
epel/x86_64                                     Extra Packages for Enterprise Linux 7 - x86_64                              7,627+11
openstack-juno                                  OpenStack Juno Repository                                                    177+788
openstack-kilo                                  Temporary OpenStack Kilo new deps                                              41+30
rhel-7-server-extras-rpms/x86_64                Red Hat Enterprise Linux 7 Server - Extras (RPMs)                                 39
rhel-7-server-openstack-6.0-rpms/7Server/x86_64 Red Hat OpenStack 6.0 for RHEL 7 (RPMs)                                      277+312
rhel-7-server-optional-rpms/7Server/x86_64      Red Hat Enterprise Linux 7 Server - Optional (RPMs)                            5,744
rhel-7-server-rpms/7Server/x86_64               Red Hat Enterprise Linux 7 Server (RPMs)                                       6,815
!rhel-x86_64-server-7                           Red Hat Enterprise Linux Server (v. 7 for 64-bit x86_64)                       6,815

Comment 7 Jan Provaznik 2015-04-14 11:28:43 UTC
Created attachment 1014279
failed VM during node discovery

I can confirm the issue with a RHEL 7.1 host and CentOS VMs/images. Attaching a screenshot from one of the stuck VMs.

Comment 8 wes hayutin 2015-04-14 11:52:01 UTC
This is still an issue on fresh installs. Bumped the priority and moved the bug back to ASSIGNED.

[stack@localhost ~]$ source stackrc 
[stack@localhost ~]$ nova list
+----+------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+----+------+--------+------------+-------------+----------+
+----+------+--------+------------+-------------+----------+
[stack@localhost ~]$ ps -ef | grep ironic
ironic    1349     1  1 03:39 ?        00:03:35 /usr/bin/python2 /usr/bin/ironic-conductor
nobody    2580     1  0 03:40 ?        00:00:14 /sbin/dnsmasq --conf-file=/etc/ironic-discoverd/dnsmasq.conf
root      2586     1  0 03:40 ?        00:00:29 /usr/bin/python2 /usr/bin/ironic-discoverd --config-file /etc/ironic-discoverd/discoverd.conf
stack    19789 19788  0 07:11 pts/0    00:00:00 /bin/sh -c source /home/stack/stackrc; instack-ironic-deployment --discover-nodes;
stack    19793 19789  0 07:11 pts/0    00:00:00 /bin/bash /usr/bin/instack-ironic-deployment --discover-nodes
stack    27893 27773  0 07:50 pts/1    00:00:00 grep --color=auto ironic
ironic   30114     1  1 03:37 ?        00:03:00 /usr/bin/python2 /usr/bin/ironic-api

[stack@localhost ~]$ 
[stack@localhost ~]$ rpm -q python-pbr
python-pbr-0.10.8-1.el7.noarch
[stack@localhost ~]$ rpm -qa | grep ironic
python-ironic-discoverd-1.1.0-0.99.20150413.1600git.el7.centos.noarch
openstack-ironic-discoverd-1.1.0-0.99.20150413.1600git.el7.centos.noarch
openstack-ironic-api-2015.1-dev548.g60ceade.el7.centos.noarch
python-ironicclient-0.4.1.25-g3b171c5.el7.centos.noarch
openstack-ironic-conductor-2015.1-dev548.g60ceade.el7.centos.noarch
openstack-ironic-common-2015.1-dev548.g60ceade.el7.centos.noarch

Comment 9 wes hayutin 2015-04-14 11:57:15 UTC
After 5 hours or so, you get:

stdout: Preparing for deployment...
  Discovering nodes.
    Sending node ID 1407cd1f-0ba5-4f28-8c1a-dd5473de5762 to discoverd for discovery ... DONE.
    Sending node ID 29d42ba7-56e4-49a9-9ff1-6cbee25578b0 to discoverd for discovery ... DONE.
   Polling discoverd for discovery results ... 
       Result for node 1407cd1f-0ba5-4f28-8c1a-dd5473de5762 is ... ERROR: Introspection timeout
       Result for node 29d42ba7-56e4-49a9-9ff1-6cbee25578b0 is ... ERROR: Introspection timeout
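
Since the timeout only shows up after hours, one option when reproducing is to bound the wait from the outside. The following is a minimal Python 3 sketch, not part of instack; run_with_deadline is a hypothetical helper built on the standard subprocess module.

```python
import subprocess
import sys


def run_with_deadline(cmd, seconds):
    """Run cmd; return its exit code, or None if the deadline passed.

    subprocess.run() kills the direct child when the timeout expires.
    Processes spawned by that child (for example by a shell script)
    may survive and need separate cleanup.
    """
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; on an undercloud the
    # command would be ["instack-ironic-deployment", "--discover-nodes"].
    code = run_with_deadline(
        [sys.executable, "-c", "import time; time.sleep(5)"], 1
    )
    print("timed out" if code is None else "exit code %d" % code)
```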

Comment 10 Ronelle Landy 2015-04-14 13:18:16 UTC
I hit the same issue running on baremetal:

> instack-ironic-deployment --discover-nodes
/usr/lib/python2.7/site-packages/keystoneclient/shell.py:65: DeprecationWarning: The keystone CLI is deprecated in favor of python-openstackclient. For a Python library, continue using python-keystoneclient.
  'python-keystoneclient.', DeprecationWarning)
Preparing for deployment...
  Discovering nodes.
    Sending node ID eca6c28d-c853-4083-8bfd-771d3225d1da to discoverd for discovery ... DONE.
    Sending node ID 0d28569b-2c81-4418-bcfb-08ee5fabf550 to discoverd for discovery ... DONE.
    Sending node ID 827d87dc-5d05-4301-9cc6-38a2e1025204 to discoverd for discovery ... DONE.
    Sending node ID 5b409224-5a27-4f51-bc67-dbd4a13a4b67 to discoverd for discovery ... DONE.
    Sending node ID 9084b41c-3a18-4371-b37b-465d7315839f to discoverd for discovery ... DONE.
    Sending node ID 93e3a5cf-aaf5-460c-bf85-40cac5810129 to discoverd for discovery ... DONE.
   Polling discoverd for discovery results ... 
       Result for node eca6c28d-c853-4083-8bfd-771d3225d1da is ... ERROR: Introspection timeout
       Result for node 0d28569b-2c81-4418-bcfb-08ee5fabf550 is ... ERROR: Introspection timeout
       Result for node 827d87dc-5d05-4301-9cc6-38a2e1025204 is ... ERROR: Introspection timeout
       Result for node 5b409224-5a27-4f51-bc67-dbd4a13a4b67 is ... ERROR: Introspection timeout
       Result for node 9084b41c-3a18-4371-b37b-465d7315839f is ... ERROR: Introspection timeout
       Result for node 93e3a5cf-aaf5-460c-bf85-40cac5810129 is ... ERROR: Introspection timeout
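
With several nodes, the per-node results can be tallied from a saved transcript. The sketch below is an illustration only: it assumes the "Result for node <id> is ... <result>" line format shown in this report, and discovery_results and failed_nodes are hypothetical names. The text printed on success does not appear in this report, so only results starting with "ERROR" are treated as failures, and a node with no result yet maps to None.

```python
import re

# Matches e.g. "Result for node <uuid> is ... ERROR: Introspection timeout"
RESULT_RE = re.compile(
    r"Result for node (?P<node>[0-9a-f-]{36}) is \.\.\.\s*(?P<result>.*)$"
)


def discovery_results(transcript):
    """Map node ID to the reported result, or None if none was printed."""
    results = {}
    for line in transcript.splitlines():
        match = RESULT_RE.search(line)
        if match:
            results[match.group("node")] = match.group("result").strip() or None
    return results


def failed_nodes(results):
    """Return node IDs whose result text starts with 'ERROR'."""
    return [n for n, r in results.items() if r and r.startswith("ERROR")]


# Sample lines copied from transcripts in this report.
SAMPLE = """\
   Polling discoverd for discovery results ...
       Result for node eca6c28d-c853-4083-8bfd-771d3225d1da is ... ERROR: Introspection timeout
       Result for node c681b512-a904-415d-a146-3d24633fa4f5 is ...
"""

results = discovery_results(SAMPLE)
print(failed_nodes(results))
```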

Comment 11 Jeff Peeler 2015-04-14 14:51:33 UTC
FWIW, I redeployed using virtual machines about an hour ago and the problem is resolved for me.

Comment 12 jliberma@redhat.com 2015-04-14 14:58:25 UTC
Just saw Jeff Peeler's comment. I will redeploy and try again on bare metal.

Comment 13 jliberma@redhat.com 2015-04-14 23:49:34 UTC
The problem has not been resolved for me on baremetal.

I'm willing to provide whatever logs you need.

Comment 14 John Trowbridge 2015-04-16 12:14:54 UTC
@jliberma,

Are you sure you are hitting this issue? I am 99% sure the issue with the hardware package is resolved.[1] This bug is unfortunately not very specific, so there could be other reasons discovery hangs. I would be happy to help troubleshoot your environment and file additional, more targeted bugs if needed. Let me know the status.

Respectfully,
John Trowbridge

[1] I am actually 100% sure, but nothing is 100%.

Comment 15 wes hayutin 2015-04-16 13:41:45 UTC
Works for me now.
Thanks, John.

Comment 16 jliberma@redhat.com 2015-04-16 13:47:12 UTC
@trown

Yes, I would certainly appreciate a second pair of eyes and any help you can offer.

I had to apply this hardware to another project but will circle back next week.

All of this worked for me after the sprint 3 release. I may have had to SIGHUP the hung command, but I deployed the overcloud successfully.

I am eager to get past this and test the unified CLI commands (the instack replacement).

Thanks, Jacob

Comment 17 John Trowbridge 2015-05-04 15:55:04 UTC
This was a packaging issue with the python-hardware library in our midstream repo and has been fixed.

