Bug 1209110 - Introspection times out after more than an hour
Summary: Introspection times out after more than an hour
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: RDO
Classification: Community
Component: openstack-ironic-discoverd
Version: Juno
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: Kilo
Assignee: Dmitry Tantsur
QA Contact: Toure Dunnon
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-04-06 08:27 UTC by Udi Kalifon
Modified: 2016-01-04 15:34 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2016-01-04 15:34:26 UTC
Embargoed:


Attachments
instackenv.json (1.09 KB, text/plain)
2015-04-06 08:27 UTC, Udi Kalifon
journal log (33.98 KB, text/x-vhdl)
2015-04-07 07:43 UTC, Udi Kalifon
screenshot (275.32 KB, image/png)
2015-04-08 14:49 UTC, Ola Pavlenko
journalctl (5.74 KB, text/x-vhdl)
2015-04-08 14:49 UTC, Ola Pavlenko

Description Udi Kalifon 2015-04-06 08:27:57 UTC
Created attachment 1011282
instackenv.json

Description of problem:
I installed the latest delorean packages (beginning of sprint 5) on a bare metal server and tried to detect the other bare metal nodes on my network. Introspection fails and there is nothing in /var/log/messages to help me find out what the problem is.


Version-Release number of selected component (if applicable):
python-rdomanager-oscplugin-0.0.1-c2c9653.el7.centos.noarch


How reproducible:
100%


Steps to Reproduce:
1. instack-ironic-deployment --nodes-json instackenv.json --register-nodes
2. openstack baremetal introspection all start
3. openstack baremetal introspection all status
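
Step 3 can also be approximated by asking the discoverd service directly. The sketch below is only an illustration: the URL path and port 5050 are taken from the journal quoted further down in this report, while the JSON keys `finished` and `error`, the `X-Auth-Token` header, and the helper names `status_url`, `describe`, and `fetch_status` are assumptions, not something confirmed in this bug.

```python
import json
import urllib.request


def status_url(node_uuid, base="http://127.0.0.1:5050"):
    """Build the status URL seen in the journal (GET /v1/introspection/<uuid>)."""
    return f"{base}/v1/introspection/{node_uuid}"


def describe(status):
    """Summarize a status dict; the 'finished'/'error' keys are assumed."""
    if not status.get("finished"):
        return "in progress"
    if status.get("error"):
        return f"failed: {status['error']}"
    return "finished"


def fetch_status(node_uuid, token=None, base="http://127.0.0.1:5050"):
    """Fetch the status for one node. Not called here; needs a running service."""
    req = urllib.request.Request(status_url(node_uuid, base))
    if token:
        # Assumption: the service accepts a Keystone token in this header.
        req.add_header("X-Auth-Token", token)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Offline example using a hand-written status dict (not real service output).
example = {"finished": True, "error": "Introspection timeout"}
summary = describe(example)
```

With a reachable service, `describe(fetch_status("<node uuid>"))` would give a one-line summary per node.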


Actual results:
Introspection takes a very long time (I left it running over the weekend, so I don't know exactly how long) and then times out:

openstack baremetal introspection all status
/usr/lib/python2.7/site-packages/novaclient/v1_1/__init__.py:30: UserWarning: Module novaclient.v1_1 is deprecated (taken as a basis for novaclient.v2). The preferable way to get client class or object you can find in novaclient.client module.
  warnings.warn("Module novaclient.v1_1 is deprecated (taken as a basis for "
+--------------------------------------+----------+-----------------------+
| Node UUID                            | Finished | Error                 |
+--------------------------------------+----------+-----------------------+
| 9419b513-3ac6-4b86-a236-81a67f94c1f1 | True     | Introspection timeout |
| 128db386-50e7-40ac-802e-ea6d3d674e98 | True     | Introspection timeout |
| d517ecfb-2656-4a82-816a-454e94aa2518 | True     | Introspection timeout |
| 3806c1b7-eec6-4c8c-b4dc-bbe4ba393255 | True     | Introspection timeout |
+--------------------------------------+----------+-----------------------+
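
When there are many nodes, the table above can be reduced to just the failures. This is a minimal sketch that parses the printed table as plain text; the helper names `parse_status_table` and `failed_nodes` are hypothetical, and treating an empty or `None` Error cell as success is an assumption.

```python
def parse_status_table(text):
    """Parse the ASCII table printed by the status command into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines, warnings, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line holds the column names
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def failed_nodes(rows):
    """Map node UUID to error text for rows with a non-empty Error cell."""
    # Assumption: a successful node shows an empty cell or the text "None".
    return {r["Node UUID"]: r["Error"] for r in rows if r["Error"] not in ("", "None")}


sample = """
+--------------------------------------+----------+-----------------------+
| Node UUID                            | Finished | Error                 |
+--------------------------------------+----------+-----------------------+
| 9419b513-3ac6-4b86-a236-81a67f94c1f1 | True     | Introspection timeout |
| 128db386-50e7-40ac-802e-ea6d3d674e98 | True     | Introspection timeout |
| d517ecfb-2656-4a82-816a-454e94aa2518 | True     | Introspection timeout |
| 3806c1b7-eec6-4c8c-b4dc-bbe4ba393255 | True     | Introspection timeout |
+--------------------------------------+----------+-----------------------+
"""

rows = parse_status_table(sample)
failed = failed_nodes(rows)
for uuid, error in failed.items():
    print(uuid, error)
```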


Additional info:
My instackenv.json file is attached.

Comment 1 James Slagle 2015-04-06 15:58:31 UTC
Dmitry, can you work with Udi on this one to figure out what the issue is?

Comment 2 James Slagle 2015-04-06 15:58:50 UTC
Dmitry, can you work with Udi on this one to figure out what the issue is?

Comment 3 Dmitry Tantsur 2015-04-07 07:13:06 UTC
Hi! Could you connect to the bare metal machines with the vendor remote console and take a screenshot of what is happening there? Also, all logs are in journald, so please provide the output of $ sudo journalctl -u openstack-ironic-discoverd. Thanks.

Comment 4 Udi Kalifon 2015-04-07 07:43:41 UTC
Created attachment 1011635
journal log

Attaching the journal log. I am still looking for a way to open a console to the machines.

Comment 5 Ola Pavlenko 2015-04-08 10:23:57 UTC
I see the same behavior with VMs.

$ sudo journalctl -u openstack-ironic-discoverd
-- Logs begin at Thu 2015-04-02 09:50:28 EDT, end at Wed 2015-04-08 06:22:40 EDT. --
Apr 07 08:31:58 localhost.localdomain systemd[1]: Stopping Hardware introspection service for OpenStack Ironic...
Apr 07 08:31:58 localhost.localdomain systemd[1]: Starting Hardware introspection service for OpenStack Ironic...
Apr 07 08:31:58 localhost.localdomain systemd[1]: Started Hardware introspection service for OpenStack Ironic.
Apr 07 08:31:59 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug: * Running on http://0.0.0.0:5050/
Apr 07 08:35:51 localhost.localdomain ironic-discoverd[15616]: ERROR:ironic_discoverd.utils:Could not find node 904e28ad-40a0-45b9-befc-21388f615bc8 in cache
Apr 07 08:35:51 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:35:51] "GET /v1/introspection/904e28ad-40a0-45b9-befc-21388f615bc8 HTTP/1.1" 404 -
Apr 07 08:35:59 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:35:59] "POST /v1/introspection/904e28ad-40a0-45b9-befc-21388f615bc8 HTTP/1.1" 202 -
Apr 07 08:35:59 localhost.localdomain ironic-discoverd[15616]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:14:1c:9f:93:94'] for node 904e28ad-40a0-45b9-befc-21388f615bc8 on the firewall
Apr 07 08:36:00 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:36:00] "POST /v1/introspection/f40a3388-c337-4083-bb5b-ca7b7a877728 HTTP/1.1" 202 -
Apr 07 08:36:00 localhost.localdomain ironic-discoverd[15616]: INFO:ironic_discoverd.introspect:Whitelisting MAC's [u'00:ef:d8:a0:58:80'] for node f40a3388-c337-4083-bb5b-ca7b7a877728 on the firewall
Apr 07 08:38:10 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:38:10] "GET /v1/introspection/904e28ad-40a0-45b9-befc-21388f615bc8 HTTP/1.1" 200 -
Apr 07 08:38:10 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:38:10] "GET /v1/introspection/f40a3388-c337-4083-bb5b-ca7b7a877728 HTTP/1.1" 200 -
Apr 07 08:41:34 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:41:34] "GET /v1/introspection/904e28ad-40a0-45b9-befc-21388f615bc8 HTTP/1.1" 200 -
Apr 07 08:41:34 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:41:34] "GET /v1/introspection/f40a3388-c337-4083-bb5b-ca7b7a877728 HTTP/1.1" 200 -
Apr 07 08:43:05 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:43:05] "GET /v1/introspection/904e28ad-40a0-45b9-befc-21388f615bc8 HTTP/1.1" 200 -
Apr 07 08:43:05 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:43:05] "GET /v1/introspection/f40a3388-c337-4083-bb5b-ca7b7a877728 HTTP/1.1" 200 -
Apr 07 09:35:59 localhost.localdomain ironic-discoverd[15616]: ERROR:ironic_discoverd.node_cache:Introspection for nodes [u'904e28ad-40a0-45b9-befc-21388f615bc8'] has timed out
Apr 07 09:36:59 localhost.localdomain ironic-discoverd[15616]: ERROR:ironic_discoverd.node_cache:Introspection for nodes [u'f40a3388-c337-4083-bb5b-ca7b7a877728'] has timed out
Apr 08 06:20:33 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [08/Apr/2015 06:20:33] "GET /v1/introspection/904e28ad-40a0-45b9-befc-21388f615bc8 HTTP/1.1" 200 -
Apr 08 06:20:33 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [08/Apr/2015 06:20:33] "GET /v1/introspection/f40a3388-c337-4083-bb5b-ca7b7a877728 HTTP/1.1" 200 -
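
The journal above can be used to measure how long each node waited before timing out, by pairing the POST that starts introspection with the matching "has timed out" error. The following is a rough sketch with hypothetical helper names; the year is passed in because journal short timestamps omit it, and the regular expressions are written only against the lines shown here.

```python
import re
from datetime import datetime

STAMP = r"(\w{3} \d{2} \d{2}:\d{2}:\d{2})"
START_RE = re.compile("^" + STAMP + r' .*"POST /v1/introspection/([0-9a-f-]{36}) ')
TIMEOUT_RE = re.compile("^" + STAMP + r" .*Introspection for nodes \[(.*?)\] has timed out")
UUID_RE = re.compile(r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")


def parse_stamp(stamp, year=2015):
    """Parse a journal short timestamp; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def timeout_durations(lines, year=2015):
    """Return {node uuid: timedelta} between the start POST and the timeout error."""
    started, durations = {}, {}
    for line in lines:
        m = START_RE.search(line)
        if m:
            started[m.group(2)] = parse_stamp(m.group(1), year)
            continue
        m = TIMEOUT_RE.search(line)
        if m:
            when = parse_stamp(m.group(1), year)
            for uuid in UUID_RE.findall(m.group(2)):
                if uuid in started:
                    durations[uuid] = when - started[uuid]
    return durations


sample = """
Apr 07 08:35:59 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:35:59] "POST /v1/introspection/904e28ad-40a0-45b9-befc-21388f615bc8 HTTP/1.1" 202 -
Apr 07 08:36:00 localhost.localdomain ironic-discoverd[15616]: INFO:werkzeug:127.0.0.1 - - [07/Apr/2015 08:36:00] "POST /v1/introspection/f40a3388-c337-4083-bb5b-ca7b7a877728 HTTP/1.1" 202 -
Apr 07 09:35:59 localhost.localdomain ironic-discoverd[15616]: ERROR:ironic_discoverd.node_cache:Introspection for nodes [u'904e28ad-40a0-45b9-befc-21388f615bc8'] has timed out
Apr 07 09:36:59 localhost.localdomain ironic-discoverd[15616]: ERROR:ironic_discoverd.node_cache:Introspection for nodes [u'f40a3388-c337-4083-bb5b-ca7b7a877728'] has timed out
"""

durations = timeout_durations(sample.splitlines())
for uuid, delta in durations.items():
    print(uuid, delta)
```

For the two nodes in this journal, the gap between the POST and the timeout error is about one hour, which matches the bug title.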

Comment 6 Dmitry Tantsur 2015-04-08 11:32:09 UTC
Hi Ola! As you're on VMs, it should be pretty easy to take a screenshot using virt-manager. Otherwise we can't figure out if it's the same problem.

Also, Ola and Udi, please provide the output of $ sudo journalctl -u openstack-ironic-discoverd-dnsmasq

Comment 7 Ola Pavlenko 2015-04-08 14:49:18 UTC
Created attachment 1012264
screenshot

Comment 8 Ola Pavlenko 2015-04-08 14:49:48 UTC
Created attachment 1012265
journalctl

Comment 9 Dmitry Tantsur 2015-04-08 15:02:07 UTC
The screenshot is not a screenshot of the machine booting; it's a screenshot of virt-manager itself. You can open a machine and get access to its virtual screen.

However, the log does give some clues. What's in your /tftpboot directory? Did you follow https://repos.fedorapeople.org/repos/openstack-m/instack-undercloud/internal-html/build-images.html before starting discovery?

Comment 10 Udi Kalifon 2015-04-09 07:29:29 UTC
No vendor remote console for these machines, unfortunately. However, our IT guy looked at the machine when it booted and saw the error: "no DHCP offers were received". We are currently investigating whether the switch's ports are configured properly for the right VLAN, but we need to wait until next week because all the relevant people are on Easter/Passover vacation...

Comment 11 Udi Kalifon 2016-01-04 15:34:26 UTC
This is a really ancient bug and it doesn't happen any more. Closing.

