Description of problem: During overcloud deployment stage of OSP14 VM's are stuck in SPAWNING state indefinitely: nova list +--------------------------------------+--------------+--------+------------+-------------+------------------------+ | ID | Name | Status | Task State | Power State | Networks | +--------------------------------------+--------------+--------+------------+-------------+------------------------+ | 4a338c02-8243-41d5-9f68-306a7ab6077f | compute-0 | BUILD | spawning | NOSTATE | ctlplane=192.168.24.7 | | 7a77b781-d853-4aee-9868-cf774096732a | controller-0 | BUILD | spawning | NOSTATE | ctlplane=192.168.24.10 | +--------------------------------------+--------------+--------+------------+-------------+------------------------+ Looking at VM console, looks like problem with ironic-inspector not being able to find certain parameters for node(s) (screenshots attached): The following failures happened during running pre-processing hooks: Look up error: Could not find a node for attributes {'bmc_address': None, 'mac': [u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f']} ... stopping inspection, as inspector returned an error ... and /var/log/ironic-inspector/ironic-inspector.log on undercloud reports similar issue: 2018-04-27 05:00:04.951 17111 DEBUG ironic_inspector.node_cache [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] Empty value for attribute bmc_address find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:803 2018-04-27 05:00:04.951 17111 DEBUG ironic_inspector.node_cache [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] Trying to use mac of value [u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f'] for node look up find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:809 2018-04-27 05:00:04.955 17111 INFO ironic_inspector.utils [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] Could not find a node for attributes {'bmc_address': None, 'mac': [u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f']} 2018-04-27 05:00:04.956 17111 INFO ironic_inspector.process [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] [node: MAC 52:54:00:7e:9b:6f] Ramdisk logs were stored in file unknown_20180427-090004.956026.tar.gz 2018-04-27 05:00:04.957 17111 ERROR ironic_inspector.utils [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] [node: MAC 52:54:00:7e:9b:6f] The following failures happened during running pre-processing hooks: Look up error: Could not find a node for attributes {'bmc_address': None, 'mac': [u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f']} 2018-04-27 05:00:04.957 17111 DEBUG ironic_inspector.main [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] Returning error to client: The following failures happened during running pre-processing hooks: Look up error: Could not find a node for attributes {'bmc_address': None, 'mac': [u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f']} error_response /usr/lib/python2.7/site-packages/ironic_inspector/main.py:68 2018-04-27 05:00:04.958 17111 INFO werkzeug [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] 192.168.24.10 - - [27/Apr/2018 05:00:04] "POST /v1/continue HTTP/1.1" 400 - 2018-04-27 05:00:10.297 17111 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic_inspector.pxe_filter.base.periodic_sync_task' _process_scheduled /usr/lib/python2.7/site-packages/futurist/periodics.py:639 2018-04-27 05:00:10.298 17111 DEBUG ironic_inspector.pxe_filter.base [-] The PXE filter driver DnsmasqFilter, state=initialized enters the fsm_reset_on_error context fsm_reset_on_error /usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/base.py:137 2018-04-27 05:00:10.299 17111 DEBUG ironic_inspector.pxe_filter.dnsmasq [-] Syncing the driver _sync /usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/dnsmasq.py:65 2018-04-27 05:00:10.339 17111 DEBUG ironic_inspector.pxe_filter.dnsmasq [-] The dnsmasq PXE filter was synchronized (took 0:00:00.039804) _sync /usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/dnsmasq.py:83 2018-04-27 05:00:10.339 17111 DEBUG ironic_inspector.pxe_filter.base [-] The PXE filter driver DnsmasqFilter, state=initialized left the fsm_reset_on_error context fsm_reset_on_error /usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/base.py:153 Version-Release number of selected component (if applicable): Puddle 2018-04-18.1 How reproducible: 100% Steps to Reproduce: 1. Deploy OSP14 virthost topology with sized as uc:1,contr:1,comp:1,ceph:1 Actual results: OC deployment is stuck until killed (or possibly timed out) Additional info: $ rpm -qa | grep ironic openstack-ironic-common-10.2.0-0.20180328191656.34e9d7d.el7.noarch python-ironic-inspector-client-3.1.1-1.el7ost.noarch puppet-ironic-12.4.0-0.20180329034302.8285d85.el7ost.noarch python2-ironic-neutron-agent-1.0.1-0.20180313084054.dcf4081.el7.noarch openstack-ironic-conductor-10.2.0-0.20180328191656.34e9d7d.el7.noarch python-ironic-lib-2.12.0-1.el7ost.noarch openstack-ironic-staging-drivers-0.9.1-0.20180326182014.d27bcf0.el7.noarch openstack-ironic-api-10.2.0-0.20180328191656.34e9d7d.el7.noarch openstack-ironic-inspector-7.2.1-0.20180409163359.2435d97.el7ost.noarch python2-ironicclient-2.3.0-0.20180315153520.964556b.el7.noarch $ rpm -qa | grep tripleo puppet-tripleo-8.3.2-0.20180404194930.af2c34c.el7ost.noarch openstack-tripleo-common-8.6.1-0.20180410041244.7d3c20d.el7ost.noarch openstack-tripleo-puppet-elements-8.0.0-1.el7ost.noarch python-tripleoclient-9.2.0-2.el7ost.noarch openstack-tripleo-image-elements-8.0.0-3.el7ost.noarch openstack-tripleo-ui-8.3.1-2.el7ost.noarch ansible-tripleo-ipsec-8.1.1-0.20180319102338.7a98f66.el7.noarch openstack-tripleo-heat-templates-8.0.2-0.20180410061339.b937f35.el7ost.noarch openstack-tripleo-validations-8.4.0-1.el7ost.noarch openstack-tripleo-common-containers-8.6.1-0.20180410041244.7d3c20d.el7ost.noarch
Created attachment 1427596 [details] console output of controller VM
Created attachment 1427597 [details] console output of compute VM
Created attachment 1427598 [details] ironic-inspector.log on undercloud
From the inspector logs: Lookup initialization: 2018-04-27 04:39:54.834 17111 INFO ironic_inspector.introspect [-] [node: e244f74b-ddad-453c-bcb0-586da05bc647 state starting] The following attributes will be used for look up: {u'bmc_address': [u'172.16.0.1'], u'mac': [u'52:54:00:7a:c1:8a']} 2018-04-27 04:39:59.824 17111 INFO ironic_inspector.introspect [-] [node: 9347f429-0932-4899-b711-b7b5ce96b1b2 state starting] The following attributes will be used for look up: {u'bmc_address': [u'172.16.0.1'], u'mac': [u'52:54:00:e0:48:39']} 2018-04-27 04:40:04.825 17111 INFO ironic_inspector.introspect [-] [node: b5694fa4-15a1-4714-a329-3d68f066ce99 state starting] The following attributes will be used for look up: {u'bmc_address': [u'172.16.0.1'], u'mac': [u'52:54:00:7e:9b:6f']} First node lookup: 2018-04-27 04:41:39.922 17111 DEBUG ironic_inspector.node_cache [req-53e62732-d5e8-4952-b3ea-eece7ad47fd5 - - - - -] Trying to use mac of value [u'52:54:00:8c:a3:8a', u'52:54:00:ff:01:f6', u'52:54:00:7a:c1:8a'] for node look up find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:809 2018-04-27 04:41:39.925 17111 DEBUG ironic_inspector.node_cache [req-53e62732-d5e8-4952-b3ea-eece7ad47fd5 - - - - -] The following nodes match the attributes: bmc_address=None, mac=[u'52:54:00:8c:a3:8a', u'52:54:00:ff:01:f6', u'52:54:00:7a:c1:8a'], scoring: e244f74b-ddad-453c-bcb0-586da05bc647: 1 find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:828 Second node lookup: 2018-04-27 04:41:43.857 17111 DEBUG ironic_inspector.node_cache [req-5dd8fd63-b8c2-49a8-8be2-7e3baed87193 - - - - -] Trying to use mac of value [u'52:54:00:74:ff:66', u'52:54:00:7c:e8:2b', u'52:54:00:e0:48:39'] for node look up find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:809 2018-04-27 04:41:43.861 17111 DEBUG ironic_inspector.node_cache [req-5dd8fd63-b8c2-49a8-8be2-7e3baed87193 - - - - -] The following nodes match the attributes: bmc_address=None, mac=[u'52:54:00:74:ff:66', u'52:54:00:7c:e8:2b', u'52:54:00:e0:48:39'], scoring: 9347f429-0932-4899-b711-b7b5ce96b1b2: 1 find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:828 Third node lookup: 2018-04-27 04:41:49.904 17111 DEBUG ironic_inspector.node_cache [req-0b43bdac-6866-477c-b634-039885f70d30 - - - - -] Trying to use mac of value [u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f'] for node look up find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:809 2018-04-27 04:41:49.907 17111 DEBUG ironic_inspector.node_cache [req-0b43bdac-6866-477c-b634-039885f70d30 - - - - -] The following nodes match the attributes: bmc_address=None, mac=[u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f'], scoring: b5694fa4-15a1-4714-a329-3d68f066ce99: 1 find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:828 Then in 20 mins we have failed lookup for two nodes that had already been introspected: 2018-04-27 05:00:04.951 17111 DEBUG ironic_inspector.node_cache [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] Trying to use mac of value [u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f'] for node look up find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:809 2018-04-27 05:00:04.955 17111 INFO ironic_inspector.utils [req-c16d82be-84ff-4a1f-83fa-f0327fdee365 - - - - -] Could not find a node for attributes {'bmc_address': None, 'mac': [u'52:54:00:38:14:f0', u'52:54:00:bb:df:15', u'52:54:00:7e:9b:6f']} 2018-04-27 05:02:40.048 17111 DEBUG ironic_inspector.node_cache [req-fd035ec9-477b-432e-a649-922f43fd067e - - - - -] Trying to use mac of value [u'52:54:00:74:ff:66', u'52:54:00:7c:e8:2b', u'52:54:00:e0:48:39'] for node look up find_node /usr/lib/python2.7/site-packages/ironic_inspector/node_cache.py:809 2018-04-27 05:02:40.052 17111 INFO ironic_inspector.utils [req-fd035ec9-477b-432e-a649-922f43fd067e - - - - -] Could not find a node for attributes {'bmc_address': None, 'mac': [u'52:54:00:74:ff:66', u'52:54:00:7c:e8:2b', u'52:54:00:e0:48:39']} It seems that for some reason the nodes ended up going through introspection instead of deployment
Can we get logs from the openstack-ironic-inspector-dnsmasq systemd unit please?
Sure, attached tarball of ironic-related logs from /var/log and *ironic* systemd units osp14-bz-1572537-logs.tar.gz .
Created attachment 1427607 [details] ironic-related journalctl and var/log logs
Note: osp14-bz-1572537-logs.tar.gz is collected from different deployment than original ironic-inspector.log was collected, but this issue reproduced there and original deployment is gone.
type=AVC msg=audit(1524708610.595:1897): avc: denied { read } for pid=16813 comm="dnsmasq" name="52:54:00:8c:f3:1c" dev="vda1" ino=92323765 scontext=system_u:system_r:dnsmasq_t:s0 tcontext=system_u:object_r:var_lib_t:s0 tclass=file type=SYSCALL msg=audit(1524708610.595:1897): arch=c000003e syscall=2 success=no exit=-13 a0=55c9c98e7920 a1=0 a2=1b6 a3=24 items=0 ppid=1 pid=16813 auid=4294967295 uid=99 gid=99 euid=99 suid=99 fsuid=99 egid=99 sgid=99 fsgid=99 tty=(none) ses=4294967295 comm="dnsmasq" exe="/usr/sbin/dnsmasq" subj=system_u:system_r:dnsmasq_t:s0 key=(null) This is probably a duplicate of: https://bugzilla.redhat.com/show_bug.cgi?id=1568993
Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq[17193]: inotify, new or changed file /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:e0:48:39 Apr 27 04:39:40 undercloud-0.redhat.local ironic-inspector[17111]: 2018-04-27 04:39:40.222 17111 DEBUG ironic_inspector.pxe_filter.dnsmasq [-] Blacklisted 52:54:00:e0:48:39 _blacklist_mac /usr/lib/python2.7/site Apr 27 04:39:40 undercloud-0.redhat.local ironic-inspector[17111]: 2018-04-27 04:39:40.222 17111 DEBUG ironic_inspector.pxe_filter.dnsmasq [-] Blacklisted 52:54:00:7a:c1:8a _blacklist_mac /usr/lib/python2.7/site Apr 27 04:39:40 undercloud-0.redhat.local ironic-inspector[17111]: 2018-04-27 04:39:40.223 17111 DEBUG ironic_inspector.pxe_filter.dnsmasq [-] Blacklisted 52:54:00:7e:9b:6f _blacklist_mac /usr/lib/python2.7/site Apr 27 04:39:40 undercloud-0.redhat.local ironic-inspector[17111]: 2018-04-27 04:39:40.223 17111 DEBUG ironic_inspector.pxe_filter.dnsmasq [-] The dnsmasq PXE filter was synchronized (took 0:00:00.039320) _sync Apr 27 04:39:40 undercloud-0.redhat.local ironic-inspector[17111]: 2018-04-27 04:39:40.223 17111 DEBUG ironic_inspector.pxe_filter.base [-] The PXE filter driver DnsmasqFilter, state=initialized left the fsm_res Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq-dhcp[17193]: read /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:e0:48:39 Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq[17193]: cannot read /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:e0:48:39: Permission denied Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq[17193]: inotify, new or changed file /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:7a:c1:8a Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq-dhcp[17193]: read /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:7a:c1:8a Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq[17193]: cannot read /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:7a:c1:8a: Permission denied Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq[17193]: inotify, new or changed file /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:7e:9b:6f Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq-dhcp[17193]: read /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:7e:9b:6f Apr 27 04:39:40 undercloud-0.redhat.local dnsmasq[17193]: cannot read /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:7e:9b:6f: Permission denied
Marking this as duplicate. Inspector Dnsmasq is denied permissions to read the pxe_filter files because of selinux policy. Since dnsmasq cannot read the files, it does not filter the nodes and they end up being introspected again. workaround: setenforce Permissive *** This bug has been marked as a duplicate of bug 1568993 ***
Note that the fix for this issue is in openstack-selinux-0.8.14-2.el7ost, which is in Puddle 2018-04-18.2. It looks like you are using Puddle 2018-04-18.1, which is why you are hitting this problem.
Note that last comment refers to OSP-13, and this bug is for OSP-14.
Looks like this BZ was closed accidentally. As possibly this might be clone (not duplicate) of https://bugzilla.redhat.com/show_bug.cgi?id=1568993 targeted for OSP13, it doesn't change fact that we don't have this issue tracked for 14. Reopening.
I can confirm that deleting overcloud, setting selinux to Permissive manually, restart of ironic/nova/heat services and OC redeployment workarounds this problem (old openstack-selinux-0.8.14-1.el7ost.noarch is present in system) and OC deployment passes (also, no ironic denials in audit.log and issue mentioned above in ironic-inspector.log).
I verified manually that upgrading undercloud (and node restart) to openstack-selinux-0.8.14-2.el7ost resolved this issue and OC deployment is able to pass.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045