Bug 1575782
Summary: | deleted node kept booting the introspection image | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Alexander Chuzhoy <sasha> |
Component: | openstack-ironic | Assignee: | Harald Jensås <hjensas> |
Status: | CLOSED ERRATA | QA Contact: | mlammon |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 13.0 (Queens) | CC: | bfournie, hjensas, joflynn, jschluet, mburns, slinaber, srevivo |
Target Milestone: | z3 | Keywords: | Rebase, Triaged, ZStream |
Target Release: | 13.0 (Queens) | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | openstack-ironic-10.1.6-1.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1633746 | Environment: | |
Last Closed: | 2018-11-13 22:14:54 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1633746 | | |
Description
Alexander Chuzhoy
2018-05-07 23:45:21 UTC
I have abandoned the proposed upstream fix, as it does not work.

Abandoned *facepalm*

Host records are only added dynamically. We would have to SIGHUP the process to capture the removal of an entry. Since Ironic-Inspector does not spawn the dnsmasq service, doing a SIGHUP to reload does not seem feasible.

I think we have to close this with WONTFIX or CANTFIX?

The workaround is to purge the dhcp-hostsdir and reload the dnsmasq service.

So the scenario:
- We add a node to ironic and introspect it.
- The baremetal port (MAC) is created for the node.
- The MAC is added to the blacklist.
- We remove the node from ironic.

ISSUE: We cannot simply delete or truncate the filter file, because dnsmasq only adds records dynamically and does not keep any state for which record is in each file. Truncating the file would therefore leave the filter in dnsmasq until it is reloaded/restarted.

Options:
a) We keep the nodes blacklisted.
   Pro: Removed nodes would not boot the introspection image.
   Con: Adding the MAC address would be required when re-enrolling the node in ironic. It is supported not to add the MAC address and have it discovered during introspection instead.
b) We whitelist nodes not in ironic anymore. (*Current behaviour*)
   Pro: The node can be re-enrolled and introspected.
   Con: Any node that was enrolled and later removed from Ironic will boot the inspection image.

Neither a) nor b) is a good solution, so we need to come up with an option c) that actually works.

I have restored the proposed patch upstream and re-worked it.
* Blacklist all MACs no longer in Ironic when introspection is not active
* Whitelist all MACs no longer in Ironic when introspection is active
* Whitelist all MACs no longer in Ironic when node_not_found_hook is set

Installed latest osp13 (2018-08-07.4)
1. Import nodes in undercloud and introspect them
2. Delete one or more nodes from ironic.
3. List files under /var/lib/ironic-inspector/dhcp-hostsdir/

openstack baremetal introspection list
+--------------------------------------+---------------------+---------------------+-------+
| UUID                                 | Started at          | Finished at         | Error |
+--------------------------------------+---------------------+---------------------+-------+
| 766d0d26-ddf7-4285-be37-e066fee4f019 | 2018-08-07T23:32:56 | 2018-08-07T23:35:03 | None  |
| c3ded92e-2b31-4f89-aeb3-1c9c65835a33 | 2018-08-07T23:32:55 | 2018-08-07T23:34:59 | None  |
| 89bd7034-8e63-4391-abad-cad736741417 | 2018-08-07T23:32:55 | 2018-08-07T23:34:54 | None  |
| b3c98711-a4f3-4c43-b5a2-c11556296d25 | 2018-08-07T23:32:54 | 2018-08-07T23:34:49 | None  |
| f9348f4d-126f-4246-a076-367f301ab56f | 2018-08-07T23:32:53 | 2018-08-07T23:34:44 | None  |
| 68457ac3-c502-4539-8ae4-5180d3696ae0 | 2018-08-07T23:32:52 | 2018-08-07T23:34:38 | None  |
+--------------------------------------+---------------------+---------------------+-------+

Deleted all nodes using openstack baremetal node delete <uuid>.
I still see all the MAC files as reported. failedqa

ll /var/lib/ironic-inspector/dhcp-hostsdir/
total 36
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  8 10:11 00:25:b5:02:a1:2f
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  8 10:11 00:25:b5:02:a1:4f
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:34 52:54:00:23:23:a5
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:35 52:54:00:38:79:1b
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:34 52:54:00:83:43:2d
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:34 52:54:00:89:f1:e8
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:35 52:54:00:d6:6b:1f
-rw-r--r--. 1 ironic-inspector ironic-inspector 25 Aug  7 19:34 52:54:00:f4:52:a0
-rw-r--r--. 1 ironic-inspector ironic-inspector 19 Aug  7 19:35 unknown_hosts_filter

Note: I believe that the fix for this bug is not to remove the entries under /var/lib/ironic-inspector/dhcp-hostsdir/ (they will stay), but to properly handle nodes that are removed and re-added. I will let Harald comment, but I believe this should be retested.

Bob is correct. We do not delete the files from dhcp-hostsdir. (Deleting the files would not cause the actual DHCP configuration in dnsmasq to change.)

What we do is blacklist the MACs of deleted nodes, unless introspection is active or discovery is enabled.

I believe the correct steps to test this are:
1. Import nodes in undercloud and introspect them
2. Delete one or more nodes from ironic.
3. Boot one or more of the deleted nodes, and ensure they do not boot the inspection image.

Additionally:
4. Enable discovery
5. Ensure one of the deleted nodes can be discovered

and:
6. Re-import one or more of the deleted nodes
7. Ensure the node is successfully inspected again.

Moving this back to ON_QA so it can be retested per Harald's Comment 13.

This bug is marked for inclusion in the errata but does not currently contain draft documentation text. To ensure the timely release of this advisory please provide draft documentation text for this bug as soon as possible.

If you do not think this bug requires errata documentation, set the requires_doc_text flag to "-".

To add draft documentation text:
* Select the documentation type from the "Doc Type" drop down field.
* A template will be provided in the "Doc Text" field based on the "Doc Type" value selected. Enter draft text in the "Doc Text" field.

FailedQA

Environment:
openstack-ironic-inspector-7.2.1-2.el7ost.noarch

After adding 'enable_node_discovery = true' to undercloud.conf and re-running 'openstack undercloud install', the inspector's dnsmasq service was not restarted. Auto discovery doesn't work.
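
For readers following the thread, the reworked filter policy described earlier (blacklist MACs no longer in Ironic unless introspection is active or node_not_found_hook is set) can be sketched roughly as below. This is an illustrative sketch, not the ironic-inspector code: the function names are hypothetical, and the entry format ("<mac>,ignore" to deny, a bare MAC to allow) is an assumption, although it is consistent with the 25-byte per-MAC files in the listing above.

```python
from typing import Optional

BLACKLIST = "blacklist"
WHITELIST = "whitelist"


def policy_for_removed_mac(introspection_active: bool,
                           node_not_found_hook: Optional[str]) -> str:
    """Decide how to treat a MAC that is no longer known to Ironic.

    Hypothetical helper mirroring the three rules from the thread:
    whitelist while introspection is active or when a
    node_not_found_hook (discovery) is set, otherwise blacklist.
    """
    if introspection_active or node_not_found_hook:
        return WHITELIST
    return BLACKLIST


def hostsdir_entry(mac: str, policy: str) -> str:
    """Render the content of one dhcp-hostsdir file.

    Assumption: a denied host is written as "<mac>,ignore" and an
    allowed host as the bare MAC, each terminated by a newline.
    """
    if policy == BLACKLIST:
        return "{},ignore\n".format(mac)
    return "{}\n".format(mac)
```

With discovery disabled and no introspection running, a deleted node's MAC stays denied, which is why the files remain in the directory after `openstack baremetal node delete`.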
Essentially we hit: https://storyboard.openstack.org/#!/story/2002818

By default the start/stop commands are not configured, but the option to purge the directory on start/stop of ironic-inspector is enabled.

Workarounds:
Option A: Set the start/stop commands
Option B: Set the purge option to False

We should change the defaults deployed for both undercloud and overcloud.

Note: Upstream story https://storyboard.openstack.org/#!/story/2002819 is also related. (If we had that, we could have purged/deleted files immediately when deleting nodes.)

Looks like we also need to set some sudo rules ...

Stderr: u'/usr/bin/ironic-inspector-rootwrap: Unauthorized command: systemctl start openstack-ironic-inspector-dnsmasq.service (no filter matched)\n'

2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Traceback (most recent call last):
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/bin/ironic-inspector", line 10, in <module>
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     sys.exit(main())
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/cmd/all.py", line 26, in main
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     server.run()
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/wsgi_service.py", line 185, in run
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     self._init_host()
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/wsgi_service.py", line 120, in _init_host
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     driver.init_filter()
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/base.py", line 81, in inner
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     return method(self, *args, **kwargs)
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/dnsmasq.py", line 141, in init_filter
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     _execute(CONF.dnsmasq_pxe_filter.dnsmasq_start_command)
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/ironic_inspector/pxe_filter/dnsmasq.py", line 313, in _execute
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     check_exit_code=not ignore_errors)
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector   File "/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py", line 424, in execute
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector     cmd=sanitized_cmd)
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector ProcessExecutionError: Unexpected error while running command.
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Command: sudo ironic-inspector-rootwrap /etc/ironic-inspector/rootwrap.conf systemctl start openstack-ironic-inspector-dnsmasq.service
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Exit code: 99
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Stdout: u''
2018-08-16 10:39:27.204 25148 ERROR ironic_inspector Stderr: u'/usr/bin/ironic-inspector-rootwrap: Unauthorized command: systemctl start openstack-ironic-inspector-dnsmasq.service (no filter matched)\n'

Looks like the 2nd set of patches has merged to stable/queens for both instack-undercloud and ironic-inspector, moving this back to POST.
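
The manual workaround mentioned earlier in the thread (purge the dhcp-hostsdir, then reload or restart dnsmasq) could be scripted along the following lines. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the restart step is passed in as a callable so that nothing here hard-codes how dnsmasq is managed on a given host.

```python
import os
from typing import Callable, List


def purge_hostsdir(hostsdir: str) -> List[str]:
    """Delete every regular file in hostsdir and return the removed names."""
    removed = []
    for name in sorted(os.listdir(hostsdir)):
        path = os.path.join(hostsdir, name)
        if os.path.isfile(path):
            os.unlink(path)
            removed.append(name)
    return removed


def purge_and_restart(hostsdir: str, restart: Callable[[], None]) -> List[str]:
    """Purge the directory, then restart dnsmasq.

    The order matters: per the thread, dnsmasq keeps entries it has
    already loaded until it is reloaded/restarted, so removing the
    files alone does not change the active filter.
    """
    removed = purge_hostsdir(hostsdir)
    restart()
    return removed
```

On an undercloud the `restart` callable might wrap something like `subprocess.check_call(["systemctl", "restart", "openstack-ironic-inspector-dnsmasq.service"])`, using the unit name that appears in the traceback above; treat that invocation as an assumption about the local setup rather than a documented procedure.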
(undercloud) [stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed
13 -p 2018-10-24.1

Performed all steps in comment#13 and can successfully verify the bug. Did not see the introspection image boot after deletion in any of the variations.

Environment:
python-ironic-lib-2.12.1-2.el7ost.noarch
openstack-ironic-api-10.1.6-1.el7ost.noarch
openstack-ironic-inspector-7.2.1-4.el7ost.noarch
python2-ironicclient-2.2.1-1.el7ost.noarch
puppet-ironic-12.4.0-3.el7ost.noarch
openstack-ironic-common-10.1.6-1.el7ost.noarch
openstack-ironic-staging-drivers-0.9.1-1.el7ost.noarch
python-ironic-inspector-client-3.1.1-1.el7ost.noarch
python2-ironic-neutron-agent-1.0.0-1.el7ost.noarch
openstack-ironic-conductor-10.1.6-1.el7ost.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3605