Cleaning nodes in the overcloud boots the discovery image.

Environment:
python2-ironicclient-2.2.0-1.el7ost.noarch
openstack-neutron-openvswitch-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
python-ironic-lib-2.12.1-1.el7ost.noarch
puppet-neutron-12.4.1-0.20180412211913.el7ost.noarch
puppet-ironic-12.4.0-0.20180329034302.8285d85.el7ost.noarch
openstack-neutron-ml2-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
openstack-ironic-common-10.1.2-3.el7ost.noarch
openstack-ironic-staging-drivers-0.9.0-4.el7ost.noarch
python-ironic-inspector-client-3.1.1-1.el7ost.noarch
instack-undercloud-8.4.1-3.el7ost.noarch
python-neutron-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
openstack-neutron-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
openstack-ironic-api-10.1.2-3.el7ost.noarch
openstack-ironic-inspector-7.2.1-0.20180409163359.2435d97.el7ost.noarch
openstack-neutron-common-12.0.2-0.20180421011358.0ec54fd.el7ost.noarch
python2-ironic-neutron-agent-1.0.0-1.el7ost.noarch
openstack-ironic-conductor-10.1.2-3.el7ost.noarch
python2-neutronclient-6.7.0-1.el7ost.noarch
python2-neutron-lib-1.13.0-1.el7ost.noarch

Steps to reproduce:
1. Deploy the overcloud with Ironic enabled.
2. Attempt to clean bare metal nodes in the overcloud.

Result:
The nodes boot the discovery (introspection) image instead of the proper cleaning image. If you reboot a node right away, it picks up the right image.
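For reference, step 2 above can be exercised like this (a sketch; the node name matches the overcloud listing below, and the clean step is only an example):

(overcloud) [stack@undercloud-0 ~]$ openstack baremetal node manage ironic-0
(overcloud) [stack@undercloud-0 ~]$ openstack baremetal node clean ironic-0 --clean-steps '[{"interface": "deploy", "step": "erase_devices_metadata"}]'
(overcloud) [stack@undercloud-0 ~]$ openstack baremetal node provide ironic-0

Moving the node back to "available" with "provide" also triggers automated cleaning when it is enabled.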
Note:
ctlplane is 192.168.24.0
the network used for cleaning is also 192.168.24.0
The NIC used for the cleaning network is bridged to the provisioning network.
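A quick way to confirm the overlap (an illustrative check; the cleaning subnet should show the same 192.168.24.0 range as the undercloud ctlplane subnet):

(undercloud) [stack@undercloud-0 ~]$ openstack subnet list -c Name -c Subnet
(overcloud) [stack@undercloud-0 ~]$ openstack subnet list -c Name -c Subnet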
[stack@undercloud-0 ~]$ sudo iptables -S | grep ironic
-A INPUT -p tcp -m multiport --dports 6385,13385 -m state --state NEW -m comment --comment "135 ironic ipv4" -j ACCEPT
-A INPUT -p tcp -m multiport --dports 5050 -m state --state NEW -m comment --comment "137 ironic-inspector ipv4" -j ACCEPT
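To see which DHCP server answers a node on the shared segment, one option is to watch DHCP traffic on the undercloud's provisioning bridge (a diagnostic sketch; br-ctlplane is the usual instack-undercloud bridge name):

[stack@undercloud-0 ~]$ sudo tcpdump -i br-ctlplane -nn 'port 67 or port 68'

Offers coming from the undercloud inspector's dnsmasq, rather than from the overcloud's DHCP service, would explain why the nodes pull the discovery image.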
[root@undercloud-0 ~]# ls /var/lib/ironic-inspector/dhcp-hostsdir/
52:54:00:2a:67:76  52:54:00:5e:c7:b5  52:54:00:74:27:c2  52:54:00:7f:f8:b8  52:54:00:8f:af:fe
52:54:00:be:59:d7  52:54:00:cb:c5:52  52:54:00:cc:57:1b  52:54:00:e1:04:a1  52:54:00:f6:1f:4f

(undercloud) [stack@undercloud-0 ~]$ openstack baremetal node list
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| UUID                                 | Name         | Instance UUID                        | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+
| abc92400-280e-42e4-a28a-effd26a302e1 | ceph-0       | d928abbe-7621-4aac-9fc3-5faadfc1d784 | power on    | active             | False       |
| 1e8ad13f-3a27-4bd6-bfdd-508a85ce1515 | ceph-1       | 1c57e49d-07d7-4b54-bc59-71e29a6aa702 | power on    | active             | False       |
| ecaa81b8-425b-487c-b4ab-ff2adad791a6 | ceph-2       | 2804207e-b746-4b32-ab20-525c1e114d21 | power on    | active             | False       |
| d4fd3e45-b9b2-4658-8aec-bc5f0d59c1f7 | compute-0    | 18e868b4-c5c0-4d94-ae90-5b968b307adc | power on    | active             | False       |
| df51d847-dbd2-40ba-a72e-76456de2ac93 | compute-1    | 3d8bc413-ecda-416d-abc8-ad0ede16774c | power on    | active             | False       |
| 02c044b9-1c10-46ab-b402-9a54f01661a0 | controller-0 | bd7dd5a3-38b8-4016-a952-77f1ebc45c39 | power on    | active             | False       |
| 00ffb38e-3901-4560-a8e3-4d45e6abc547 | controller-1 | ea92ebd2-5aab-4bb2-8c0f-57dd3bb7b596 | power on    | active             | False       |
| d701d7a1-8b74-4701-a1a2-31698f5f8afc | controller-2 | 20e8d6a5-2f65-4636-9c82-e677502c9c70 | power on    | active             | False       |
+--------------------------------------+--------------+--------------------------------------+-------------+--------------------+-------------+

Sourced stackrc:
(undercloud) [stack@undercloud-0 ~]$ for node in `openstack baremetal node list -f value -c Name`; do echo $node; openstack baremetal port list --node $node -f value -c Address; done
ceph-0
52:54:00:cb:c5:52
ceph-1
52:54:00:be:59:d7
ceph-2
52:54:00:cc:57:1b
compute-0
52:54:00:5e:c7:b5
compute-1
52:54:00:e1:04:a1
controller-0
52:54:00:8f:af:fe
controller-1
52:54:00:7f:f8:b8
controller-2
52:54:00:f6:1f:4f

Sourced overcloudrc:
(overcloud) [stack@undercloud-0 ~]$ for node in `openstack baremetal node list -f value -c Name`; do echo $node; openstack baremetal port list --node $node -f value -c Address; done
ironic-0
52:54:00:2a:67:76
ironic-1
52:54:00:74:27:c2
(overcloud) [stack@undercloud-0 ~]$
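With the dnsmasq PXE filter, each file under dhcp-hostsdir is named after a port's MAC address and controls whether the inspector's dnsmasq answers that MAC: an allowed MAC's file contains just the MAC, a filtered one the MAC followed by ",ignore". For example (illustrative contents, not captured from this system):

[root@undercloud-0 ~]# cat /var/lib/ironic-inspector/dhcp-hostsdir/52:54:00:2a:67:76
52:54:00:2a:67:76,ignore

Note that the MACs of the overcloud-managed nodes (ironic-0, ironic-1) also appear in the undercloud inspector's hostsdir above.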
(In reply to Alexander Chuzhoy from comment #1)
> Note:
> ctlplane is 192.168.24.0
> the network used for cleaning is also 192.168.24.0
> The nic used for cleaning network is bridged to provisioning network.

So what we are seeing is that the overcloud's bare metal nodes are booting an image served by the inspector service on the undercloud? The interfaces used for Ironic in the overcloud cannot be on the same L2 network as the undercloud.

It is possible this got worse with the dnsmasq PXE filter driver, but even with the iptables driver this setup would not be without issues.

With the iptables driver:
* The undercloud operator initiates introspection.
* An overcloud tenant initiates an Ironic operation while undercloud introspection is running.
__Result: We have a race. If the undercloud DHCP server responds first, the overcloud tenant operation will fail.

With the dnsmasq driver:
* We no longer filter DHCP requests by default, no matter whether the cloud operator has initiated introspection or not.
* An overcloud tenant initiates an Ironic operation.
__Result: We have a race. If the undercloud DHCP server responds first, the overcloud tenant operation will fail.
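Until the dnsmasq filter behaviour is fixed, one interim mitigation (a sketch, assuming the inspector 7.x [pxe_filter] option group; on an instack-undercloud this file is puppet-managed, so a hand edit may be overwritten on the next undercloud install) is to switch the undercloud's inspector back to the iptables PXE filter:

# /etc/ironic-inspector/inspector.conf (illustrative)
[pxe_filter]
driver = iptables

[stack@undercloud-0 ~]$ sudo systemctl restart openstack-ironic-inspector openstack-ironic-inspector-dnsmasq

The structural fix remains keeping the overcloud's Ironic provisioning/cleaning networks off the undercloud's L2 segment, as noted above.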
Verified:

Version: openstack-ironic-inspector-7.2.1-0.20180409163360.el7ost.noarch

Was able to clean the nodes in the overcloud.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2086