Bug 1051028
Summary: | neutron-dhcp-agent doesn't clean after itself when service is shut down | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Miguel Angel Ajo <majopela> | |
Component: | openstack-neutron | Assignee: | Miguel Angel Ajo <mangelajo> | |
Status: | CLOSED ERRATA | QA Contact: | Ofer Blaut <oblaut> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 4.0 | CC: | breeler, chrisw, fdinitto, lpeer, majopela, yeylon | |
Target Milestone: | z4 | Keywords: | ZStream | |
Target Release: | 4.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | openstack-neutron-2013.2.2-9.el6ost | Doc Type: | Bug Fix | |
Doc Text: |
Cause: The neutron DHCP agent does not clean up its resources (network namespaces, dnsmasq processes, etc.) when the service is stopped. This is by design, so that the agent can be upgraded without service disruption.
Consequence: When a node is removed from the cluster and its services are stopped, the DHCP services and resources remain active, but will get updated as soon as there are changes to the served tenant networks.
Fix: Added the neutron-netns-cleanup init script to allow cleanup of the dhcp service resources as needed.
Result: The resources can be cleaned up now by running the script.
|
Story Points: | --- | |
Clone Of: | ||||
: | 1051036 | Environment: | |
Last Closed: | 2014-05-29 20:18:15 UTC | Type: | Bug | |
Bug Depends On: | 1062685, 1173435 | |||
Bug Blocks: | 1051036, 1080561 |
Description
Miguel Angel Ajo
2014-01-09 15:10:20 UTC
This is related: it seems that the settings are left in place by design, and we should run neutron-netns-cleanup from the neutron-l3-agent init.d script at stop, or right after the l3-agent stops. But it takes parameters; I'm checking it.

https://bugs.launchpad.net/neutron/+bug/1115999

Launchpad bug #1115999 prevents the metadata proxies in namespaces (qdhcp or qrouter) from being cleaned up properly; it needs to be fixed to have a workaround here.

Launchpad bug #1273095 prevents properly selecting which kind of namespace we want to clean up (dhcp or l3-agent).

It seems that neutron-netns-cleanup is broken:

1) it doesn't have an /etc/init.d/neutron-netns-cleanup script, as ovs has
2) it fails on invocation

    # neutron-netns-cleanup --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/dhcp_agent.ini --debug --force
    2014-02-03 11:31:36.046 4848 INFO neutron.common.config [-] Logging enabled!
    2014-02-03 11:31:36.046 4848 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'list'] execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:43
    2014-02-03 11:31:36.193 4848 DEBUG neutron.agent.linux.utils [-]
    Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'list']
    Exit code: 0
    Stdout: 'qdhcp-f742e733-672e-4e76-8003-034185564a90\nqdhcp-b38c071a-508d-44bb-8359-c2e694bb6f9b\nqdhcp-7bfe4972-8c59-4f0b-8634-3420acd75844\nqdhcp-e1fde457-5fb8-4965-b64b-248da99f7a8d\nqrouter-7c82790b-e1e6-4451-a772-fb4f39117d5a\n'
    Stderr: '' execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:60
    Error importing interface driver 'neutron.agent.linux.interface.OVSInterfaceDriver': no such option: ovs_use_veth

There is another bug in iproute, rhbz#1062685, that prevents netns deletion from working. It has a fix, and the fix is tested.
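The Stdout captured in the log above lists qdhcp- and qrouter- namespaces together. As a rough sketch of the per-agent selection that Launchpad bug #1273095 is about, the list can be filtered by name prefix. The function name `filter_netns` and its `dhcp`/`l3` arguments are hypothetical and are not part of neutron-netns-cleanup; the function only reads names from standard input and deletes nothing.

```shell
#!/bin/bash
# Hypothetical helper, not part of neutron: filter namespace names
# (one per line on stdin) by the agent that owns them.
filter_netns() {
    case "$1" in
        dhcp) grep '^qdhcp-' ;;      # namespaces created by the DHCP agent
        l3)   grep '^qrouter-' ;;    # namespaces created by the L3 agent
        *)    echo "usage: filter_netns dhcp|l3" >&2; return 2 ;;
    esac
}
```

On a network node this could be used as `ip netns list | filter_netns dhcp` to see only the DHCP namespaces before deciding what to clean up.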
Using /etc/init.d/neutron-netns-forced-cleanup start cleans up the network namespaces and all internal iptables rules and interfaces. The fix is provided in this repo: http://file.rdu.redhat.com/~majopela/neutron-ha-fixes-bz-1051028-and-36-cleanup/

neutron needs to be patched (netns_cleanup script).

For patches and scripts, please refer to https://bugzilla.redhat.com/attachment.cgi?bugid=1051036

When the neutron-netns-cleanup init.d script is installed (via pacemaker, or via normal init.d script installation), it will, during startup, clean up any empty namespaces (those with no resources inside: processes, ports, etc.), and when stopped it will force the cleanup of all resources.

Stop should happen under three conditions:

1) When the node is taken out of the cluster.
2) When the neutron-agent resources are taken off the node.
3) If the neutron-netns-cleanup script is installed as a service, it will clean up all netns namespaces during reboot/poweroff/halt, or when leaving the configured runlevels.

I have tested that when service neutron-netns-cleanup stop is run, the netns in use are cleaned up. The stop conditions are related to HA, not to the script.
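The start/stop behaviour described above can be sketched as a small dispatcher. This is an illustration only, not the shipped init script: the function name `netns_cleanup_cmd` is hypothetical, and the --config-file paths and the --force flag are taken from the invocation quoted earlier in this bug. It assumes that a run without --force removes only empty namespaces. The function only prints the command that would be run, so it can be tried on a machine without neutron installed.

```shell
#!/bin/bash
# Hypothetical sketch: print the cleanup command for an init.d action.
# Config paths and --force come from the invocation quoted in this bug.
netns_cleanup_cmd() {
    local base="neutron-netns-cleanup"
    base="$base --config-file /usr/share/neutron/neutron-dist.conf"
    base="$base --config-file /etc/neutron/neutron.conf"
    base="$base --config-file /etc/neutron/dhcp_agent.ini"
    case "$1" in
        start) echo "$base" ;;           # start: assumed to remove empty namespaces only
        stop)  echo "$base --force" ;;   # stop: forced cleanup of all resources
        *)     echo "usage: netns_cleanup_cmd start|stop" >&2; return 2 ;;
    esac
}
```

For example, `netns_cleanup_cmd stop` prints the forced invocation, which a real init script would execute rather than echo.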
openstack-neutron-2013.2.3-4.el6ost.noarch

    [root@puma05 ~]# ip netns
    qdhcp-a76e98a5-7ae3-4f91-b721-4f81cebcfa6f
    qdhcp-6dcaa203-e61a-4003-a1fe-95d60853516f
    qrouter-15ef1247-b52a-43fc-bfa2-27478dbfe1f3
    [root@puma05 ~]# service neutron-netns-cleanup stop
    [root@puma05 ~]# ip netns
    [root@puma05 ~]#
    [root@puma05 ~]#
    [root@puma05 ~]# service neutron-netns-cleanup start
    [root@puma05 ~]# ip netns
    [root@puma05 ~]# openstack-status
    == neutron services ==
    neutron-server: inactive (disabled on boot)
    neutron-dhcp-agent: active
    neutron-l3-agent: active
    neutron-metadata-agent: active
    neutron-lbaas-agent: inactive (disabled on boot)
    neutron-openvswitch-agent: active
    == Support services ==
    openvswitch: active
    messagebus: active
    [root@puma05 ~]# service neutron-dhcp-agent restart
    Stopping neutron-dhcp-agent: [ OK ]
    Starting neutron-dhcp-agent: [ OK ]
    [root@puma05 ~]# service neutron-l3-agent restart
    Stopping neutron-l3-agent: [ OK ]
    Starting neutron-l3-agent: [ OK ]
    [root@puma05 ~]# ip netns
    qdhcp-a76e98a5-7ae3-4f91-b721-4f81cebcfa6f
    qdhcp-6dcaa203-e61a-4003-a1fe-95d60853516f
    [root@puma05 ~]# ip netns
    qdhcp-a76e98a5-7ae3-4f91-b721-4f81cebcfa6f
    qdhcp-6dcaa203-e61a-4003-a1fe-95d60853516f
    qrouter-15ef1247-b52a-43fc-bfa2-27478dbfe1f3

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-0516.html