Bug 1240260 - introspection timed out for 2 VM nodes
Summary: introspection timed out for 2 VM nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tuskar-ui
Version: Director
Hardware: Unspecified
OS: Linux
high
urgent
Target Milestone: y1
: 7.0 (Kilo)
Assignee: Jiri Tomasek
QA Contact: Ola Pavlenko
URL:
Whiteboard:
Depends On:
Blocks: 1228132
TreeView+ depends on / blocked
 
Reported: 2015-07-06 11:32 UTC by Ola Pavlenko
Modified: 2023-02-22 23:02 UTC (History)
9 users (show)

Fixed In Version: openstack-tuskar-ui-0.4.0-2.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-08 12:10:29 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
nodes csv file was used (3.37 KB, text/plain)
2015-07-06 11:36 UTC, Ola Pavlenko
no flags Details
dnxmasq (4.55 KB, text/x-vhdl)
2015-07-09 10:11 UTC, Ola Pavlenko
no flags Details
journal openstack-ironic-discoverd (6.76 KB, text/x-vhdl)
2015-07-09 10:12 UTC, Ola Pavlenko
no flags Details


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 200045 0 None None None Never
Red Hat Product Errata RHSA-2015:1862 0 normal SHIPPED_LIVE Moderate: Red Hat Enterprise Linux OpenStack Platform 7 director update 2015-10-08 16:05:50 UTC

Description Ola Pavlenko 2015-07-06 11:32:39 UTC
Description of problem:
introspection process failed: "discovery failed"
actually it timed out


Version-Release number of selected component (if applicable):
---------------------------------------
# rpm -qa | grep tuskar
python-tuskarclient-0.1.18-3.el7ost.noarch
openstack-tuskar-0.4.18-3.el7ost.noarch
openstack-tuskar-ui-extras-0.0.4-1.el7ost.noarch
openstack-tuskar-ui-0.3.0-8.el7ost.noarch

# rpm -qa | grep ironic
openstack-ironic-common-2015.1.0-8.el7ost.noarch
openstack-ironic-discoverd-1.1.0-4.el7ost.noarch
python-ironic-discoverd-1.1.0-4.el7ost.noarch
openstack-ironic-conductor-2015.1.0-8.el7ost.noarch
openstack-ironic-api-2015.1.0-8.el7ost.noarch
python-ironicclient-0.5.1-9.el7ost.noarch

# rpm -qa | grep rhos
rhos-release-0.63-1.noarch

How reproducible:
--------------------------
100%

Steps to Reproduce:
-------------------------
1.register nodes via UI
2.introspection will implicitly start 

Actual results:
------------------------
introspection fails after an hour approximately
with "discovery failed"

Expected results:
-------------------------
introspection succeed


Additional info:
------------------------

sudo journalctl -u openstack-ironic-discoverd:

-- Logs begin at Sun 2015-07-05 15:08:12 EDT, end at Mon 2015-07-06 07:12:43 EDT. --
Jul 05 15:45:08 instack.localdomain systemd[1]: Starting Hardware introspection service for OpenStack Ironic...
Jul 05 15:45:08 instack.localdomain systemd[1]: Started Hardware introspection service for OpenStack Ironic.
Jul 05 15:45:11 instack.localdomain ironic-discoverd[31536]: INFO:ironic_discoverd.main:Enabled processing hooks: ['ramdisk_error'
, 'root_device_hint', 'scheduler', 'validate_interfaces', 'edeploy']
Jul 05 15:45:11 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug: * Running on http://0.0.0.0:5050/
Jul 06 04:11:50 instack.localdomain ironic-discoverd[31536]: INFO:ironic_discoverd.utils:Node 2233257b-ec0d-4caf-8d49-e45e0a46d527
 is in maintenance mode, skipping provision states check
Jul 06 04:11:50 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:11:50] "POST /v1/introspe
ction/2233257b-ec0d-4caf-8d49-e45e0a46d527 HTTP/1.1" 202 -
Jul 06 04:11:50 instack.localdomain ironic-discoverd[31536]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:4d:a2:4a:92
:4d'] for node 2233257b-ec0d-4caf-8d49-e45e0a46d527 on the firewall
Jul 06 04:11:51 instack.localdomain ironic-discoverd[31536]: INFO:ironic_discoverd.utils:Node b15fb49b-0d75-49b1-b3f1-78ceccb2d60d
 is in maintenance mode, skipping provision states check
Jul 06 04:11:51 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:11:51] "POST /v1/introspe
ction/b15fb49b-0d75-49b1-b3f1-78ceccb2d60d HTTP/1.1" 202 -
Jul 06 04:11:52 instack.localdomain ironic-discoverd[31536]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:cc:7e:44:c3
:b0'] for node b15fb49b-0d75-49b1-b3f1-78ceccb2d60d on the firewall
Jul 06 04:11:53 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:11:53] "GET /v1/introspec
tion/2233257b-ec0d-4caf-8d49-e45e0a46d527 HTTP/1.1" 200 -
Jul 06 04:11:53 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:11:53] "GET /v1/introspec
tion/b15fb49b-0d75-49b1-b3f1-78ceccb2d60d HTTP/1.1" 200 -
Jul 06 04:12:53 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:12:53] "GET /v1/introspec
tion/2233257b-ec0d-4caf-8d49-e45e0a46d527 HTTP/1.1" 200 -
Jul 06 04:12:53 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:12:53] "GET /v1/introspec
tion/b15fb49b-0d75-49b1-b3f1-78ceccb2d60d HTTP/1.1" 200 -
Jul 06 04:16:30 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:16:30] "GET /v1/introspec
tion/2233257b-ec0d-4caf-8d49-e45e0a46d527 HTTP/1.1" 200 -
Jul 06 04:16:30 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:16:30] "GET /v1/introspec
tion/b15fb49b-0d75-49b1-b3f1-78ceccb2d60d HTTP/1.1" 200 -
Jul 06 04:16:35 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:16:35] "GET /v1/introspec
tion/2233257b-ec0d-4caf-8d49-e45e0a46d527 HTTP/1.1" 200 -
Jul 06 04:16:35 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:16:35] "GET /v1/introspec
tion/b15fb49b-0d75-49b1-b3f1-78ceccb2d60d HTTP/1.1" 200 -
Jul 06 04:28:36 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:28:36] "GET /v1/introspec
tion/2233257b-ec0d-4caf-8d49-e45e0a46d527 HTTP/1.1" 200 -
Jul 06 04:28:36 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:28:36] "GET /v1/introspec
tion/b15fb49b-0d75-49b1-b3f1-78ceccb2d60d HTTP/1.1" 200 -
Jul 06 04:50:21 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:50:21] "GET /v1/introspec
tion/2233257b-ec0d-4caf-8d49-e45e0a46d527 HTTP/1.1" 200 -
Jul 06 04:50:21 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 04:50:21] "GET /v1/introspec
tion/b15fb49b-0d75-49b1-b3f1-78ceccb2d60d HTTP/1.1" 200 -
Jul 06 05:12:13 instack.localdomain ironic-discoverd[31536]: ERROR:ironic_discoverd.node_cache:Introspection for nodes [u'2233257b-ec0d-4caf-8d49-e45e0a46d527', u'b15fb49b-0d75-49b1-b3f1-78ceccb2d60d'] has timed out
Jul 06 05:19:18 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 05:19:18] "GET /v1/introspection/2233257b-ec0d-4caf-8d49-e45e0a46d527 HTTP/1.1" 200 -
Jul 06 05:19:18 instack.localdomain ironic-discoverd[31536]: INFO:werkzeug:192.0.2.1 - - [06/Jul/2015 05:19:18] "GET /v1/introspection/b15fb49b-0d75-49b1-b3f1-78ceccb2d60d HTTP/1.1" 200 -
~

Comment 3 Ola Pavlenko 2015-07-06 11:36:26 UTC
Created attachment 1048798 [details]
nodes csv file was used

Comment 5 chris alfonso 2015-07-06 16:18:03 UTC
Does this happen on baremetal?

Comment 7 Dmitry Tantsur 2015-07-07 10:49:41 UTC
Please also provide output of $ sudo journalctl -u openstack-ironic-discoverd-dnsmasq

Also the remaining steps of https://github.com/openstack/ironic-inspector#introspection-times-out

Comment 8 Dmitry Tantsur 2015-07-08 15:05:52 UTC
Were several NIC's used? If so, might be case of https://bugzilla.redhat.com/show_bug.cgi?id=1234601

Comment 9 Ola Pavlenko 2015-07-08 15:52:25 UTC
(In reply to Dmitry Tantsur from comment #7)
> Please also provide output of $ sudo journalctl -u
> openstack-ironic-discoverd-dnsmasq
> 
> Also the remaining steps of
> https://github.com/openstack/ironic-inspector#introspection-times-out

Unfortunately I've used this env for the test day (e.g reprovisioned)
I'll reproduce it and provide all requested info.

Comment 12 Ola Pavlenko 2015-07-09 10:11:21 UTC
Created attachment 1050209 [details]
dnxmasq

Reproduce on lynx17.qa.lab.tlv.redhat.com / qum10net with 2 nodes when 1 was successfully discovered and the second one failed.

Comment 13 Ola Pavlenko 2015-07-09 10:12:08 UTC
Created attachment 1050210 [details]
journal openstack-ironic-discoverd

Comment 14 Dmitry Tantsur 2015-07-09 11:12:43 UTC
Ah, missing sleep again. Meh, I'm going to do something about it upstream, but for current release we need sleeps or similar measure in UI.

UI folks, could you please make sure we wait say 5 seconds between sending nodes on introspection?

Comment 15 Jiri Tomasek 2015-07-09 12:31:30 UTC
Upstream patch is available here: https://review.openstack.org/#/c/200045/

Comment 17 Ola Pavlenko 2015-09-08 15:35:41 UTC
verified  openstack-tuskar-ui-0.4.0-3.el7ost.noarch

Comment 19 errata-xmlrpc 2015-10-08 12:10:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:1862


Note You need to log in before you can comment on or make changes to this bug.