Description of problem:
Unable to delete 'active' namespaces via cleanup utility with "force" attribute
(active namespace - It has additional devices other than the loopback device)

neutron-netns-cleanup --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/dhcp_agent.ini --force

Version-Release number of selected component (if applicable):
openstack-neutron-2014.1.3-7.el7ost.noarch
openstack-neutron-openvswitch-2014.1.3-7.el7ost.noarch
python-neutronclient-2.3.4-3.el7ost.noarch
python-neutron-2014.1.3-7.el7ost.noarch
openstack-neutron-ml2-2014.1.3-7.el7ost.noarch
openstack-neutron-metering-agent-2014.1.3-7.el7ost.noarch

How reproducible:
100%

Actual results:
neutron-netns-cleanup --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/dhcp_agent.ini --force
2014-11-09 16:03:04.108 107268 INFO neutron.common.config [-] Logging enabled!
2014-11-09 16:03:06.168 107268 ERROR neutron.agent.netns_cleanup_util [-] Error unable to destroy namespace: qrouter-a66d28c0-04ba-4b7d-9a56-ebdb0c4e0278
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util Traceback (most recent call last):
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util   File "/usr/lib/python2.7/site-packages/neutron/agent/netns_cleanup_util.py", line 131, in destroy_namespace
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util     kill_dhcp(conf, namespace)
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util   File "/usr/lib/python2.7/site-packages/neutron/agent/netns_cleanup_util.py", line 85, in kill_dhcp
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util     if dhcp_driver.active:
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 253, in active
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util     return self.process_monitor.is_active(self.network.id,
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util AttributeError: 'NoneType' object has no attribute 'is_active'
2014-11-09 16:03:06.168 107268 TRACE neutron.agent.netns_cleanup_util
2014-11-09 16:03:06.169 107268 ERROR neutron.agent.netns_cleanup_util [-] Error unable to destroy namespace: qdhcp-b876a9d3-9c65-4ca2-95e9-6eb35bdc70c0
2014-11-09 16:03:06.169 107268 TRACE neutron.agent.netns_cleanup_util Traceback (most recent call last):
2014-11-09 16:03:06.169 107268 TRACE neutron.agent.netns_cleanup_util   File "/usr/lib/python2.7/site-packages/neutron/agent/netns_cleanup_util.py", line 131, in destroy_namespace
2014-11-09 16:03:06.169 107268 TRACE neutron.agent.netns_cleanup_util     kill_dhcp(conf, namespace)
2014-11-09 16:03:06.169 107268 TRACE neutron.agent.netns_cleanup_util   File "/usr/lib/python2.7/site-packages/neutron/agent/netns_cleanup_util.py", line 85, in kill_dhcp
2014-11-09 16:03:06.169 107268 TRACE neutron.agent.netns_cleanup_util     if dhcp_driver.active:
2014-11-09 16:03:06.169 107268 TRACE neutron.agent.netns_cleanup_util   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py", line 253, in active
2014-11-09 16:03:06.169 107268 TRACE neutron.agent.netns_cleanup_util     return self.process_monitor.is_active(self.network.id,
2014-11-09 16:03:06.169 107268 TRACE neutron.agent.netns_cleanup_util AttributeError: 'NoneType' object has no attribute 'is_active'

Expected results:

Additional info:
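For triage, the namespaces that failed to be destroyed can be pulled out of output like the above. The following is only a rough sketch: the helper name failed_namespaces and the regular expression are illustrative assumptions based on the ERROR lines shown in this report, not part of neutron.

```python
import re

# Assumption: ERROR lines follow the format seen in this report, i.e.
# "... ERROR neutron.agent.netns_cleanup_util [-] Error unable to destroy namespace: <name>"
FAILED_NS = re.compile(
    r"ERROR neutron\.agent\.netns_cleanup_util \[-\] "
    r"Error unable to destroy namespace: (\S+)"
)


def failed_namespaces(log_text):
    """Return the namespace names from 'unable to destroy namespace' lines, in order."""
    return FAILED_NS.findall(log_text)
```

Feeding it the captured output of the cleanup run would list the qrouter- and qdhcp- namespaces that were left behind.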
Copying from email:

Miguel has backported the process monitor patches to RHOSP5, and it seems that the cleanup utils were broken as a result. The failure is as follows: DhcpLocalProcess.active() was modified to use the process monitor to track process state. For this to work, a process monitor must be passed when the class is instantiated. The problem is that the cleanup tools were not updated to do so. The problematic part is:

neutron/agent/netns_cleanup_util.py

    dhcp_driver = importutils.import_object(
        conf.dhcp_driver,
        conf=conf,
        network=dhcp.NetModel(conf.use_namespaces, {'id': network_id}),
        root_helper=root_helper,
        plugin=FakeDhcpPlugin())

As you can see, no process monitor argument is passed, hence the failure. I think we should:
1) pass the argument in the cleanup tools;
2) make it required (at the moment it defaults to None, which lets such failures sneak in).

The fix itself is easy, though we may need to reshuffle some patches (or just band-aid the existing backports).
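The failure mode and the second proposal can be illustrated in isolation. This is a minimal sketch under stated assumptions: ProcessMonitor, OptionalMonitorDriver and RequiredMonitorDriver are hypothetical stand-ins written for this comment, not neutron classes, and the single-argument is_active() is a simplification of the real call.

```python
class ProcessMonitor:
    """Hypothetical stand-in for a process monitor (not neutron's class)."""

    def __init__(self):
        self._active = set()

    def register(self, uuid):
        self._active.add(uuid)

    def is_active(self, uuid):
        return uuid in self._active


class OptionalMonitorDriver:
    """Mirrors the failing pattern: the monitor silently defaults to None."""

    def __init__(self, network_id, process_monitor=None):
        self.network_id = network_id
        self.process_monitor = process_monitor

    @property
    def active(self):
        # With process_monitor=None this raises
        # AttributeError: 'NoneType' object has no attribute 'is_active'
        return self.process_monitor.is_active(self.network_id)


class RequiredMonitorDriver:
    """Proposed shape: the monitor is a required constructor argument."""

    def __init__(self, network_id, process_monitor):
        self.network_id = network_id
        self.process_monitor = process_monitor

    @property
    def active(self):
        return self.process_monitor.is_active(self.network_id)
```

With the required variant, a caller that forgets the monitor fails at construction time with a TypeError, rather than later when active is first read during cleanup.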
Fixing this upstream in Kilo. The fix will be included downstream as the last patch in the series.
Created attachment 956753
Netns cleanup fix tracking upstream changes in a single patch for later releases.
The fix is not complete: the dnsmasq processes won't be deleted. I also found that the backport broke something else (reloading the dnsmasq config file when we restart the agent); filing a new blocker for that.
Tested on RHEL 7, AIO + compute node
OpenStack/5.0-RHEL-7/2014-11-14.1

openstack-neutron-2014.1.3-10.el7ost.noarch
openstack-neutron-ml2-2014.1.3-10.el7ost.noarch
openstack-neutron-metering-agent-2014.1.3-10.el7ost.noarch
openstack-neutron-openvswitch-2014.1.3-10.el7ost.noarch

These two cleanup options work as expected, without error messages:
1) neutron-netns-cleanup --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/dhcp_agent.ini --force
2) neutron-netns-cleanup --config-file=/etc/neutron/neutron.conf --config-file=/etc/neutron/dhcp_agent.ini
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2014-1942.html