Bug 1103461
| Summary: | When deleting router/networks, netns should be deleted by default. | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Toni Freger <tfreger> |
| Component: | kernel | Assignee: | Denys Vlasenko <dvlasenk> |
| Status: | CLOSED WORKSFORME | QA Contact: | Ofer Blaut <oblaut> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 5.0 (RHEL 7) | CC: | amuller, chrisw, dvlasenk, jliberma, jshortt, kimi.zhang, lhh, lwang, majopela, mangelajo, mschuppe, nyechiel, oblaut, psimerda, rhos-maint, rkhan, rsussman, sputhenp, tfreger, yeylon |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | 5.0 (RHEL 7) | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-10-14 18:58:19 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 743661, 1198800 | ||
Description
Toni Freger
2014-06-01 05:01:13 UTC
openstack-netns-cleanup is not currently packaged as a service (ova-cleanup is though).

Sorry, ovs-cleanup, not ova-cleanup.

@Miguel: I thought we use netns-cleanup for our downstream HA solutions? Directly via the CLI then?

Actually, scratch that, Ihar confused me :) We don't need netns-cleanup, we just need to see if everything works properly when changing these two configuration values:
    l3_agent.ini: router_delete_namespaces = True
    dhcp_agent.ini: dhcp_delete_namespaces = True

@Ofer: Instead of waiting for a downstream-only patch that changes these two values to True by default, I think the best course of action is for QA/you to change these two values on the network node in your deployment and see if everything still works. If so, I'll send the patch.

Note that I suspect that RHELOSP 5 on RHEL 6.5 will fail the verification, and RHELOSP 5 on RHEL 7 will pass! Could I ask you to test the proposed change on both platforms? It would influence the branch(es) that we'd push the patch to.

With RHOS5 on RHEL7 the bug still reproduces, even though the options in l3_agent.ini and dhcp_agent.ini are set to True.
The error trace is attached:
/var/log/neutron/l3_agent.log
2014-06-10 14:05:33.291 23580 ERROR neutron.agent.l3_agent [-] Failed trying to delete namespace: qrouter-4937b089-5a8a-4766-a6e6-86bac1554d07
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent Traceback (most recent call last):
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/site-packages/neutron/agent/l3_agent.py", line 301, in _destroy_router_namespace
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent ns_ip.netns.delete(namespace)
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 450, in delete
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent self._as_root('delete', name, use_root_namespace=True)
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 217, in _as_root
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent kwargs.get('use_root_namespace', False))
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 70, in _as_root
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent namespace)
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 81, in _execute
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent root_helper=root_helper)
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 76, in execute
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent raise RuntimeError(m)
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent RuntimeError:
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'delete', 'qrouter-4937b089-5a8a-4766-a6e6-86bac1554d07']
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent Exit code: 1
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent Stdout: ''
2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent Stderr: 'Cannot remove namespace file "/var/run/netns/qrouter-4937b089-5a8a-4766-a6e6-86bac1554d07": Device or resource busy\n'
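Before attributing the failure above to anything outside neutron, it may be worth confirming that both options are really set in the files the agents read. The following is a minimal sketch, not part of neutron: the function name is hypothetical, and it assumes both options live in the `[DEFAULT]` section of `/etc/neutron/l3_agent.ini` and `/etc/neutron/dhcp_agent.ini`.

```python
import configparser


def namespace_deletion_enabled(ini_text, option):
    """Return True if `option` is set to a true value in [DEFAULT].

    Hypothetical helper for illustration; `ini_text` is the content of
    l3_agent.ini or dhcp_agent.ini.
    """
    # strict=False tolerates duplicate keys; interpolation=None keeps
    # literal '%' characters in other options from raising errors.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    return parser.getboolean("DEFAULT", option, fallback=False)


def check_file(path, option):
    """Read one agent config file and report whether the option is on."""
    with open(path) as handle:
        return namespace_deletion_enabled(handle.read(), option)
```

On a network node this could be called as `check_file("/etc/neutron/l3_agent.ini", "router_delete_namespaces")` and `check_file("/etc/neutron/dhcp_agent.ini", "dhcp_delete_namespaces")`; both paths are assumed, not taken from this report. A commented-out or missing option is reported as False.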
Ok, this error

2014-06-10 14:05:33.291 23580 TRACE neutron.agent.l3_agent Stderr: 'Cannot remove namespace file "/var/run/netns/qrouter-4937b089-5a8a-4766-a6e6-86bac1554d07": Device or resource busy\n'

means that we're missing some of the netns patches in the RHEL 7 kernel or iproute. I believe this is not neutron specific, but rather a missing iproute or kernel backport.

*** Bug 1060340 has been marked as a duplicate of this bug. ***

We are still sometimes seeing "Device or resource busy" when trying to delete a namespace. We suspect iproute itself, or a netns-related fix missing from the kernel.

What conditions (other than failures) could trigger a "Device or resource busy" for an ip netns delete?

The iproute tool is very simple and any error like that is almost certainly coming from the kernel.

Hi, Any updates regarding this bug? Thanks, Toni

This should be happening when "ip netns delete NAME" is run. The iproute2 code which handles this operation is straightforward:
static int on_netns_del(char *nsname, void *arg)
{
	char netns_path[PATH_MAX];

	snprintf(netns_path, sizeof(netns_path), "%s/%s", NETNS_RUN_DIR, nsname);
	umount2(netns_path, MNT_DETACH);
	if (unlink(netns_path) < 0) {
		fprintf(stderr, "Cannot remove namespace file \"%s\": %s\n",
			netns_path, strerror(errno));
		return -1;
	}
	return 0;
}
So, indeed, "The iproute tool is very simple and any error like that is almost certainly coming from the kernel" is true.
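To make that point concrete, here is a rough Python paraphrase of the unlink step only. It is a sketch for illustration, not iproute2 code: the function name is made up, and the `umount2(..., MNT_DETACH)` call is omitted because the Python standard library has no wrapper for it. The message is simply the text for the errno that `unlink()` returned, so "Device or resource busy" is the kernel's EBUSY passed straight through.

```python
import os
import sys

NETNS_RUN_DIR = "/var/run/netns"  # same directory the C snippet uses


def remove_namespace_file(nsname, run_dir=NETNS_RUN_DIR):
    """Unlink <run_dir>/<nsname>; return 0 on success, -1 on failure.

    Illustration only: the lazy unmount done by the C code is skipped.
    """
    path = os.path.join(run_dir, nsname)
    try:
        os.unlink(path)
    except OSError as exc:
        # The tool adds no logic of its own here; it only prints the
        # error text for whatever errno the kernel handed back.
        sys.stderr.write('Cannot remove namespace file "%s": %s\n'
                         % (path, os.strerror(exc.errno)))
        return -1
    return 0
```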
Ok. Closing this as WORKSFORME per Assaf's suggestion. Please reopen if you have a reproducer.

Please specify the exact kernel version.
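For anyone collecting data for a reproducer: the unlink(2) man page lists "it is a mount point" as one reason for EBUSY, so it may help to record which processes still see the namespace file as a mount point at the moment the delete fails. Whether that was the cause in this bug was never established. The sketch below uses hypothetical function names and assumes the standard `/proc/<pid>/mountinfo` layout, where the fifth field is the mount point.

```python
import os


def namespace_mount_points(mountinfo_text, ns_name):
    """Return mount points in one mountinfo dump ending in netns/<ns_name>."""
    # Match on the suffix so both /run/netns/... and /var/run/netns/...
    # are found.
    suffix = "/netns/" + ns_name
    matches = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Field 5 (index 4) of a mountinfo line is the mount point.
        if len(fields) > 4 and fields[4].endswith(suffix):
            matches.append(fields[4])
    return matches


def pids_holding_namespace(ns_name, proc_root="/proc"):
    """Map PID -> mount points for processes that still see the mount."""
    holders = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        mountinfo = os.path.join(proc_root, entry, "mountinfo")
        try:
            with open(mountinfo, errors="replace") as handle:
                text = handle.read()
        except OSError:
            # The process exited or is not readable; skip it.
            continue
        points = namespace_mount_points(text, ns_name)
        if points:
            holders[int(entry)] = points
    return holders
```

Run as root on the network node, for example `pids_holding_namespace("qrouter-4937b089-5a8a-4766-a6e6-86bac1554d07")`, and attach the result together with the output of `uname -r`.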