Bug 1575782 - deleted node kept booting the introspection image
Summary: deleted node kept booting the introspection image
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-ironic
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z3
Target Release: 13.0 (Queens)
Assignee: Harald Jensås
QA Contact: mlammon
URL:
Whiteboard:
Depends On:
Blocks: 1633746
Reported: 2018-05-07 23:45 UTC by Alexander Chuzhoy
Modified: 2018-11-13 22:15 UTC (History)

Fixed In Version: openstack-ironic-10.1.6-1.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1633746
Environment:
Last Closed: 2018-11-13 22:14:54 UTC
Target Upstream Version:


Links
System                 | ID             | Private | Priority | Status | Summary                                              | Last Updated
OpenStack gerrit       | 566757         | 0       | None     | MERGED | PXE Filter dnsmasq: manage macs not in ironic        | 2020-12-18 22:55:26 UTC
OpenStack gerrit       | 580931         | 0       | None     | MERGED | PXE Filter dnsmasq: manage macs not in ironic        | 2020-12-18 22:55:28 UTC
OpenStack gerrit       | 598229         | 0       | None     | MERGED | Add start/stop command for ironic-inspector-dnsmasq  | 2020-12-18 22:55:26 UTC
OpenStack gerrit       | 598433         | 0       | None     | MERGED | Add rootwrap filter for systemctl control of dnsmasq | 2020-12-18 22:55:28 UTC
Red Hat Product Errata | RHBA-2018:3605 | 0       | None     | None   | None                                                 | 2018-11-13 22:15:31 UTC

Description Alexander Chuzhoy 2018-05-07 23:45:21 UTC
Deleting introspected nodes from ironic does not clean entries under /var/lib/ironic-inspector/dhcp-hostsdir/

Environment:
python2-ironicclient-2.2.0-1.el7ost.noarch
python-ironic-lib-2.12.1-1.el7ost.noarch
puppet-ironic-12.4.0-0.20180329034302.8285d85.el7ost.noarch
openstack-ironic-common-10.1.2-3.el7ost.noarch
openstack-ironic-staging-drivers-0.9.0-4.el7ost.noarch
python-ironic-inspector-client-3.1.1-1.el7ost.noarch
instack-undercloud-8.4.1-4.el7ost.noarch
openstack-ironic-api-10.1.2-3.el7ost.noarch
python2-ironic-neutron-agent-1.0.0-1.el7ost.noarch
openstack-ironic-conductor-10.1.2-3.el7ost.noarch
openstack-ironic-inspector-7.2.1-0.20180409163359.2435d97.el7ost.noarch

Steps to reproduce:
1. Import nodes in undercloud and introspect them
2. Delete one or more nodes from ironic.
3. List files under /var/lib/ironic-inspector/dhcp-hostsdir/

Result:
The files named after macs of removed nodes are still there.

Expected result:
The files named after macs of removed nodes should NOT be there.
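
The check in step 3 can be scripted. The following is a minimal sketch and not part of ironic-inspector: `stale_mac_files` is a hypothetical helper, and the caller is assumed to supply the set of MAC addresses that ironic still knows about (for example, collected from the baremetal port list).

```python
import os
import re
import tempfile

# Matches file names that look like a MAC address, e.g. 52:54:00:23:23:a5.
MAC_RE = re.compile(r"^(?:[0-9a-f]{2}:){5}[0-9a-f]{2}$", re.IGNORECASE)


def stale_mac_files(hostsdir, known_macs):
    """Return MAC-named files in hostsdir whose MAC is not in known_macs.

    Hypothetical helper: known_macs is whatever the caller considers
    "still enrolled in ironic". Non-MAC files are ignored.
    """
    known = {mac.lower() for mac in known_macs}
    return sorted(
        name for name in os.listdir(hostsdir)
        if MAC_RE.match(name) and name.lower() not in known
    )


if __name__ == "__main__":
    # Demo against a throwaway directory rather than the real
    # /var/lib/ironic-inspector/dhcp-hostsdir/.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("52:54:00:23:23:a5", "52:54:00:38:79:1b",
                     "unknown_hosts_filter"):
            open(os.path.join(demo_dir, name), "w").close()
        print(stale_mac_files(demo_dir, {"52:54:00:23:23:A5"}))
```

Pointing `hostsdir` at the real directory after step 2 would list the leftover files this report describes.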

Comment 1 Harald Jensås 2018-05-09 18:17:19 UTC
I have abandoned the proposed upstream fix, as it does not work.

Abandoned

*facepalm* Host records are only added dynamically. We would have to SIGHUP the process for it to pick up the removal of an entry.

Since ironic-inspector does not spawn the dnsmasq service, sending SIGHUP to reload it does not seem feasible.


I think we have to close this with WONTFIX or CANTFIX?
The workaround is to purge the dhcp-hostsdir and reload the dnsmasq service.
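
A minimal sketch of that workaround, under stated assumptions: `purge_hostsdir` is a hypothetical helper, and the reload step is passed in as a callable because how dnsmasq gets reloaded depends on the deployment.

```python
import os
import tempfile


def purge_hostsdir(hostsdir, reload_dnsmasq):
    """Delete every regular file in hostsdir, then reload dnsmasq.

    reload_dnsmasq is a caller-supplied callable (an assumption of this
    sketch). The reload matters: removing files alone does not make
    dnsmasq forget the records it has already loaded.
    """
    removed = []
    for name in sorted(os.listdir(hostsdir)):
        path = os.path.join(hostsdir, name)
        if os.path.isfile(path):
            os.remove(path)
            removed.append(name)
    reload_dnsmasq()
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        open(os.path.join(demo_dir, "52:54:00:23:23:a5"), "w").close()
        print(purge_hostsdir(demo_dir, lambda: print("reload requested")))
```

On an undercloud the callable would typically wrap a service restart run with sufficient privileges; that part is deliberately left out of the sketch.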

Comment 3 Harald Jensås 2018-05-09 18:47:54 UTC
So the scenario:

- We add a node to ironic and introspect it.
- The baremetal port (MAC) is created for the node.
- The MAC is added to the blacklist.
- We remove the node from ironic.

ISSUE: We cannot simply delete or truncate the filter file, because dnsmasq only adds records dynamically and does not keep any state about which record came from which file. Truncating the file would therefore leave the filter in place in dnsmasq until it is reloaded or restarted.
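
Because records can only be added, the practical way to change a host's state without a reload is to overwrite its file with a new record. The sketch below assumes each per-MAC file holds a single line, `<mac>,ignore` for a blacklisted host and a bare `<mac>` for a whitelisted one. That format is an assumption (it is consistent with the 25-byte per-MAC files in the directory listing in Comment 11), and `write_filter_record` is a hypothetical name.

```python
import os
import tempfile


def write_filter_record(hostsdir, mac, blacklist):
    """Overwrite the per-MAC file with a blacklist or whitelist record.

    Assumed record format: "<mac>,ignore" tells dnsmasq to ignore the
    host, a bare "<mac>" lets it be served again.
    """
    record = "%s,ignore\n" % mac if blacklist else "%s\n" % mac
    with open(os.path.join(hostsdir, mac), "w") as f:
        f.write(record)
    return record


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        mac = "52:54:00:23:23:a5"
        print(repr(write_filter_record(demo_dir, mac, blacklist=True)))
        print(repr(write_filter_record(demo_dir, mac, blacklist=False)))
```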

Options:

 a) We keep the nodes blacklisted.

Pro: Removed nodes would not boot the introspection image.

Con: The MAC address would have to be added when re-enrolling the node in ironic. Enrolling a node without a MAC address, and having it discovered during introspection instead, is a supported workflow.

 b) We whitelist nodes that are no longer in ironic. (*Current behaviour*)

Pro: The node can be re-enrolled and introspected.
Con: Any node that was enrolled and later removed from ironic will boot the inspection image.


Neither a) nor b) is a good solution, so we need to come up with an option c) that actually works.

Comment 4 Harald Jensås 2018-05-09 23:44:54 UTC
I have restored the proposed patch upstream and re-worked it.

 * Blacklist all MACs no longer in Ironic when introspection is not active
 * Whitelist all MACs no longer in Ironic when introspection is active
 * Whitelist all MACs no longer in Ironic when node_not_found_hook is set
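
The three rules above reduce to a single decision. This is an illustrative sketch, not the code from the upstream patch; `removed_mac_action` and its parameter names are hypothetical.

```python
def removed_mac_action(introspection_active, node_not_found_hook_set):
    """Decide what to do with a MAC that is no longer in ironic.

    Whitelist it if introspection is active or a node_not_found_hook is
    set (so the host can be introspected or discovered), otherwise
    blacklist it so the host does not boot the introspection image.
    """
    if introspection_active or node_not_found_hook_set:
        return "whitelist"
    return "blacklist"


if __name__ == "__main__":
    for active in (False, True):
        for hook in (False, True):
            print(active, hook, removed_mac_action(active, hook))
```

Only the combination with no active introspection and no hook set ends in a blacklist entry.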

Comment 11 mlammon 2018-08-08 14:29:17 UTC
Installed latest osp13 (2018-08-07.4)

1. Import nodes in undercloud and introspect them
2. Delete one or more nodes from ironic.
3. List files under /var/lib/ironic-inspector/dhcp-hostsdir/

openstack baremetal introspection list
+--------------------------------------+---------------------+---------------------+-------+
| UUID                                 | Started at          | Finished at         | Error |
+--------------------------------------+---------------------+---------------------+-------+
| 766d0d26-ddf7-4285-be37-e066fee4f019 | 2018-08-07T23:32:56 | 2018-08-07T23:35:03 | None  |
| c3ded92e-2b31-4f89-aeb3-1c9c65835a33 | 2018-08-07T23:32:55 | 2018-08-07T23:34:59 | None  |
| 89bd7034-8e63-4391-abad-cad736741417 | 2018-08-07T23:32:55 | 2018-08-07T23:34:54 | None  |
| b3c98711-a4f3-4c43-b5a2-c11556296d25 | 2018-08-07T23:32:54 | 2018-08-07T23:34:49 | None  |
| f9348f4d-126f-4246-a076-367f301ab56f | 2018-08-07T23:32:53 | 2018-08-07T23:34:44 | None  |
| 68457ac3-c502-4539-8ae4-5180d3696ae0 | 2018-08-07T23:32:52 | 2018-08-07T23:34:38 | None  |
+--------------------------------------+---------------------+---------------------+-------+

deleted all nodes using openstack baremetal node delete <uuid>
 


I still see all the MAC files as reported, so this is FailedQA.
ll /var/lib/ironic-inspector/dhcp-hostsdir/
total 36
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  8 10:11 00:25:b5:02:a1:2f
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  8 10:11 00:25:b5:02:a1:4f
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:34 52:54:00:23:23:a5
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:35 52:54:00:38:79:1b
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:34 52:54:00:83:43:2d
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:34 52:54:00:89:f1:e8
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:35 52:54:00:d6:6b:1f
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:34 52:54:00:f4:52:a0
-rw-r--r--. 1 ironic-inspector ironic-inspector 19 Aug  7 19:35 unknown_hosts_filter

Comment 12 Bob Fournier 2018-08-08 14:45:21 UTC
Note: I believe the fix for this bug is not to remove the entries under /var/lib/ironic-inspector/dhcp-hostsdir/ (they will stay), but to properly handle nodes that are removed and re-added. I will let Harald comment, but I believe this should be retested.

Comment 13 Harald Jensås 2018-08-13 10:13:01 UTC
Bob is correct. We do not delete the files from dhcp-hostsdir. (Deleting the files would not cause the actual dhcp configuration in dnsmasq to change.)

What we do is blacklist the MACs of deleted nodes, unless introspection is active or discovery is enabled.

I believe the correct steps to test this are:

1. Import nodes in undercloud and introspect them
2. Delete one or more nodes from ironic.
3. Boot one or more of the deleted nodes, and ensure they do not boot the inspection image.

Additionally:

4. Enable discovery
5. Ensure one of the deleted nodes can be discovered

and:

6. Re-Import one or more of the deleted nodes
7. Ensure the node is successfully inspected again.

Comment 14 Bob Fournier 2018-08-13 11:26:43 UTC
Moving this back to ON_QA so it can be retested per Harald's Comment 13.

Comment 15 Joanne O'Flynn 2018-08-15 13:49:53 UTC
This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible.

If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".


To add draft documentation text:

* Select the documentation type from the "Doc Type" drop down field.

* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.

Comment 16 Alexander Chuzhoy 2018-08-16 13:47:57 UTC
FailedQA
Environment:
openstack-ironic-inspector-7.2.1-2.el7ost.noarch


After adding 'enable_node_discovery = true' to undercloud.conf and re-running 'openstack undercloud install', the inspector's dnsmasq service was not restarted.

Auto discovery doesn't work.

Comment 17 Harald Jensås 2018-08-16 13:52:35 UTC
Essentially we hit:
  https://storyboard.openstack.org/#!/story/2002818

By default the start/stop commands are not configured,
but the option to purge the directory on start/stop of ironic-inspector is set.

Workarounds:

Option A: Set the start/stop commands
Option B: Set the purge option to False
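
As a rough illustration of the two workarounds, the sketch below renders the corresponding ironic-inspector config section. The `[dnsmasq_pxe_filter]` group, the `dnsmasq_start_command` option, and the `systemctl start` command line are taken from the traceback in Comment 18. The `dnsmasq_stop_command` and `purge_dhcp_hostsdir` option names are assumptions, and `workaround_config` is a hypothetical helper.

```python
import configparser
import io

SERVICE = "openstack-ironic-inspector-dnsmasq.service"


def workaround_config(option):
    """Render the [dnsmasq_pxe_filter] section for workaround A or B."""
    cfg = configparser.ConfigParser()
    cfg["dnsmasq_pxe_filter"] = {}
    section = cfg["dnsmasq_pxe_filter"]
    if option == "A":
        # Let ironic-inspector start/stop dnsmasq itself.
        section["dnsmasq_start_command"] = "systemctl start " + SERVICE
        # Assumed option name, mirroring the start command.
        section["dnsmasq_stop_command"] = "systemctl stop " + SERVICE
    elif option == "B":
        # Assumed option name: keep the hostsdir across restarts.
        section["purge_dhcp_hostsdir"] = "false"
    else:
        raise ValueError("option must be 'A' or 'B'")
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(workaround_config("A"))
    print(workaround_config("B"))
```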


We should change the defaults deployed for both undercloud and overcloud.



Note: Upstream story https://storyboard.openstack.org/#!/story/2002819 is also related. (If we had that, we could have purged/deleted files immediately when deleting nodes.)

Comment 18 Harald Jensås 2018-08-16 14:40:44 UTC
Looks like we also need to set some sudo rules ...

Stderr: u'/usr/bin/ironic-inspector-rootwrap: Unauthorized command: systemctl start openstack-ironic-inspector-dnsmasq.service (no filter matched)\n'
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Traceback (most recent call last):
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/bin/ironic-inspector", line 10, in <module>
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     sys.exit(main())
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/cmd/all.py", line 26, in main
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     server.run()
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/wsgi_service.py", line 185, in run
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     self._init_host()
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/wsgi_service.py", line 120, in _init_host
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     driver.init_filter()
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/base.py", line 81, in inner
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     return method(self, *args, **kwargs)
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/dnsmasq.py", line 141, in init_filter
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     _execute(CONF.dnsmasq_pxe_filter.dnsmasq_start_command)
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/dnsmasq.py", line 313, in _execute
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     check_exit_code=not ignore_errors)
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 424, in execute
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     cmd=sanitized_cmd)
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector ProcessExecutionError: Unexpected error while running command.
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Command: sudo ironic-inspector-rootwrap /etc/ironic-inspector/rootwrap.conf systemctl start openstack-ironic-inspector-dnsmasq.service
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Exit code: 99
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Stdout: u''
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Stderr: u'/usr/bin/ironic-inspector-rootwrap: Unauthorized command: systemctl start openstack-ironic-inspector-dnsmasq.service (no filter matched)\n'
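
When triaging logs like the one above, it can help to pull out which command rootwrap refused. The sketch below is a hypothetical log-triage helper, not part of ironic-inspector or oslo.rootwrap. It relies only on the exit code (99) and the message text visible in the traceback above.

```python
import re

# Exit code reported in the traceback for the rejected command.
RC_UNAUTHORIZED = 99

_UNAUTHORIZED_RE = re.compile(
    r"Unauthorized command: (.+?) \(no filter matched\)")


def unauthorized_command(exit_code, stderr):
    """Return the command rootwrap rejected, or None if not applicable."""
    if exit_code != RC_UNAUTHORIZED:
        return None
    match = _UNAUTHORIZED_RE.search(stderr)
    return match.group(1) if match else None


if __name__ == "__main__":
    stderr = ("/usr/bin/ironic-inspector-rootwrap: Unauthorized command: "
              "systemctl start openstack-ironic-inspector-dnsmasq.service "
              "(no filter matched)\n")
    print(unauthorized_command(99, stderr))
```

The returned string is the command that needs a matching rootwrap filter before ironic-inspector can run it.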

Comment 20 Bob Fournier 2018-09-27 15:32:39 UTC
Looks like the 2nd set of patches has merged to stable/queens for both instack-undercloud and ironic-inspector; moving this back to POST.

Comment 27 mlammon 2018-11-05 17:34:04 UTC
(undercloud) [stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed
13   -p 2018-10-24.1

Performed all steps in comment #13 and successfully verified the fix. None of the deleted nodes booted the introspection image in any of the variations.

Environment:
python-ironic-lib-2.12.1-2.el7ost.noarch
openstack-ironic-api-10.1.6-1.el7ost.noarch
openstack-ironic-inspector-7.2.1-4.el7ost.noarch
python2-ironicclient-2.2.1-1.el7ost.noarch
puppet-ironic-12.4.0-3.el7ost.noarch
openstack-ironic-common-10.1.6-1.el7ost.noarch
openstack-ironic-staging-drivers-0.9.1-1.el7ost.noarch
python-ironic-inspector-client-3.1.1-1.el7ost.noarch
python2-ironic-neutron-agent-1.0.0-1.el7ost.noarch
openstack-ironic-conductor-10.1.6-1.el7ost.noarch

Comment 31 errata-xmlrpc 2018-11-13 22:14:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3605

