Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1095015

Summary: "seting the network namespace failed: Invalid argument" from ip netns commands
Product: Red Hat OpenStack Reporter: Ian Wienand <iwienand>
Component: iproute Assignee: Phil Sutter <psutter>
Status: CLOSED WORKSFORME QA Contact: Ofer Blaut <oblaut>
Severity: medium Docs Contact:
Priority: medium    
Version: 4.0 CC: aortega, iwienand, kzhang, lhh, lwang, majopela, rhos-maint, rkhan, sclewis, wfoster, yeylon
Target Milestone: --- Keywords: ZStream
Target Release: 5.0 (RHEL 7)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1320578 (view as bug list) Environment:
Last Closed: 2016-03-31 10:14:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1320578    

Description Ian Wienand 2014-05-07 00:31:01 UTC
Description of problem:

neutron started to misbehave quite badly today in oslab.  One of the symptoms was a seeming inability to work with network namespaces; for example the l3-agent.log was filling up with things like

---
2014-05-06 21:17:44.387 18955 ERROR neutron.agent.l3_agent [-] Failed synchronizing routers
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent Traceback (most recent call last):
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent   File "/usr/lib/python2.6/site-packages/neutron/agent/l3_agent.py", line 765, in _sync_routers_task
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent     self._process_routers(routers, all_routers=True)
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent   File "/usr/lib/python2.6/site-packages/neutron/agent/l3_agent.py", line 708, in _process_routers
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent     self._router_added(r['id'], r)
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent   File "/usr/lib/python2.6/site-packages/neutron/agent/l3_agent.py", line 337, in _router_added
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent     self._create_router_namespace(ri)
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent   File "/usr/lib/python2.6/site-packages/neutron/agent/l3_agent.py", line 307, in _create_router_namespace
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent     ip_wrapper.netns.execute(['sysctl', '-w', 'net.ipv4.ip_forward=1'])
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent   File "/usr/lib/python2.6/site-packages/neutron/agent/linux/ip_lib.py", line 467, in execute
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent     check_exit_code=check_exit_code)
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent   File "/usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py", line 75, in execute
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent     raise RuntimeError(m)
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent RuntimeError: 
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent Command: ['sudo', 'ip', 'netns', 'exec', 'qrouter-857330af-e8c1-4cc5-b900-086596210244', 'sysctl', '-w', 'net.ipv4.ip_forward=1']
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent Exit code: 255
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent Stdout: ''
2014-05-06 21:17:44.387 18955 TRACE neutron.agent.l3_agent Stderr: 'seting the network namespace failed: Invalid argument\n'
---

looking a little closer at one network, it exists but can't be used

---
[root@host03 log]# ip netns list | grep dhcp-20803109-5bea-4638-8b02-bcdbace3f0b2 
qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2

[root@host03 log]#  /sbin/ip netns exec qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2 ip -o link show 
seting the network namespace failed: Invalid argument
---

strace of this shows

---
[root@host03 log]# strace  /sbin/ip netns exec qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2 ip -o link show 
execve("/sbin/ip", ["/sbin/ip", "netns", "exec", "qdhcp-20803109-5bea-4638-8b02-bc"..., "ip", "-o", "link", "show"], [/* 25 vars */]) = 0
open("/var/run/netns/qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2", O_RDONLY) = 4
syscall_308(0x4, 0x40000000, 0xffffffffffffffff, 0, 0x622d323062382d38, 0x7fff2bbc1771, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de, 0x42b7de) = -1 (errno 22)
write(2, "seting the network namespace fai"..., 54seting the network namespace failed: Invalid argument
) = 54
exit_group(-1)                          = ?
---

unfortunately strace in rhel isn't built against the openstack kernel so it doesn't know about the setns call (syscall 308 on x86_64), but we can see it's calling setns() on the fd just opened for /var/run/netns/qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2 with CLONE_NEWNET (0x40000000), which seems about right.
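
The raw strace record can be decoded by hand. The following is a minimal Python sketch, assuming the x86_64 syscall table (where 308 is setns) and the CLONE_NEWNET value from <linux/sched.h>; the helper name `decode_setns` is hypothetical and only exists for illustration.

```python
import errno
import os

# Hard-coded because the strace build on the affected host did not decode
# them: setns is syscall 308 on x86_64, CLONE_NEWNET is 0x40000000.
SYS_SETNS_X86_64 = 308
CLONE_NEWNET = 0x40000000


def decode_setns(syscall_nr, fd, nstype, err):
    """Render a raw, failed strace record for setns in readable form."""
    if syscall_nr == SYS_SETNS_X86_64:
        name = "setns"
    else:
        name = "syscall_%d" % syscall_nr
    flag = "CLONE_NEWNET" if nstype == CLONE_NEWNET else hex(nstype)
    err_name = errno.errorcode.get(err, str(err))
    return "%s(%d, %s) = -1 %s (%s)" % (name, fd, flag, err_name, os.strerror(err))


# First two arguments and errno taken from the strace line in the report.
print(decode_setns(308, 0x4, 0x40000000, 22))
```

Only the first two arguments are meaningful here; the long tail of repeated values in the strace line is just strace printing stale registers for a syscall it does not recognise.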

EINVAL looks like it could come from two places; proc_ns_fget() or the type check in setns()

---
struct file *proc_ns_fget(int fd)
{
        struct file *file;

        file = fget(fd);
        if (!file)
                return ERR_PTR(-EBADF);

        if (file->f_op != &ns_file_operations)
                goto out_invalid;

        return file;

out_invalid:
        fput(file);
        return ERR_PTR(-EINVAL);
}

...

SYSCALL_DEFINE2(setns, int, fd, int, nstype)
{
        const struct proc_ns_operations *ops;
        struct task_struct *tsk = current;
        struct nsproxy *new_nsproxy;
        struct proc_inode *ei;
        struct file *file;
        int err;

        if (!capable(CAP_SYS_ADMIN))
                return -EPERM;

        file = proc_ns_fget(fd);
        if (IS_ERR(file))
                return PTR_ERR(file);

        err = -EINVAL;
        ei = PROC_I(file->f_dentry->d_inode);
        ops = ei->ns_ops;
        if (nstype && (ops->type != nstype))
                goto out;

...
}
---

The permissions on the file at the time of the problem were blank

---
[root@host03 netns]# ls -l /var/run/netns/qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2
----------. 1 root root 0 Mar 27 03:03 /var/run/netns/qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2
---

after the system was rebooted, it came back to 444

---
[root@host03 netns]# ls -l /var/run/netns/qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2
-r--r--r--. 1 root root 0 May  6 21:25 /var/run/netns/qdhcp-20803109-5bea-4638-8b02-bcdbace3f0b2
---
Does this suggest the namespace somehow became detached?
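
One possible reading of the two listings: `ip netns add` appears to create the file under /var/run/netns with no permission bits and then bind-mount the namespace over it, so a regular file showing `----------` is plausibly the bare placeholder with nothing mounted on top. The Python sketch below encodes that heuristic. The function name is hypothetical, and the mode-0 rule is an assumption drawn from the listings, not a guarantee from the kernel or from iproute.

```python
import os
import stat


def looks_like_bare_placeholder(path):
    """Heuristic: True if path is a regular file with no permission bits.

    Assumption: a netns file in this state has lost its bind mount, as in
    the `----------` listing, while a live one shows `-r--r--r--`.
    """
    st = os.stat(path)
    return stat.S_ISREG(st.st_mode) and stat.S_IMODE(st.st_mode) == 0
```

A monitoring script could run this over every entry in /var/run/netns and flag the ones that return True before an agent tries `ip netns exec` on them.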

wfoster and I went through the logs pretty carefully; there is nothing to suggest what would have changed the permissions, or how.  The only network-related thing in dmesg (which unfortunately isn't timestamped on this system) was

---
lo: Disabled Privacy Extensions
__ratelimit: 376 callbacks suppressed
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
tapa3f71897-7b: no IPv6 routers present
qr-7dbfe943-f4: no IPv6 routers present
tap6b31d415-c7: no IPv6 routers present
device qg-56760887-0c entered promiscuous mode
__ratelimit: 275 callbacks suppressed
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
Neighbour table overflow.
tap312416e7-51: no IPv6 routers present
qg-56760887-0c: no IPv6 routers present
---

versions:

---
[root@host03 netns]# rpm -qa | grep neutron
openstack-neutron-2013.2.3-7.el6ost.noarch
python-neutron-2013.2.3-7.el6ost.noarch
openstack-neutron-openvswitch-2013.2.3-7.el6ost.noarch
python-neutronclient-2.3.4-1.el6ost.noarch

[root@host03 netns]# rpm -qa | grep kernel
kernel-headers-2.6.32-431.11.2.el6.x86_64
kernel-2.6.32-431.11.2.el6.x86_64
libreport-plugin-kerneloops-2.0.9-19.el6.x86_64
dracut-kernel-004-336.el6_5.2.noarch
kernel-2.6.32-431.3.1.el6.x86_64
kernel-2.6.32-431.5.1.el6.x86_64
abrt-addon-kerneloops-2.0.8-21.el6.x86_64
kernel-firmware-2.6.32-431.11.2.el6.noarch
kernel-devel-2.6.32-431.11.2.el6.x86_64
---

Comment 1 Ian Wienand 2014-05-07 00:37:33 UTC
Another data point from the neutron side ... all of the TAP devices on the neutron server had disappeared (not showing in ip link).  However, all of the dnsmasq processes for the various networks were still running and listening on these non-existent tap devices.

Comment 2 Ian Wienand 2014-05-07 04:53:07 UTC
Ok, I managed to recreate this problem by deleting an attached ns

e.g., in one window do

---
[root@rhel ~]# ip netns add testing
[root@rhel ~]# ls -l /var/run/netns/testing 
-r--r--r--. 1 root root 0 May  7 14:44 /var/run/netns/testing
[root@rhel ~]# ip netns exec testing bash -c "while [ 1 ]; do echo "hi"; sleep 5; done"
---

in another window, remove it

---
[root@rhel ~]# ip netns del testing
Cannot remove /var/run/netns/testing: Device or resource busy
[root@rhel ~]# ls -l /var/run/netns/testing 
----------. 1 root root 0 May  7 14:44 /var/run/netns/testing
---

note it has disappeared from /proc/mounts

---
[root@rhel ~]# cat /proc/mounts 
rootfs / rootfs rw 0 0
proc /proc proc rw,relatime 0 0
sysfs /sys sysfs rw,seclabel,relatime 0 0
devtmpfs /dev devtmpfs rw,seclabel,relatime,size=951024k,nr_inodes=237756,mode=755 0 0
devpts /dev/pts devpts rw,seclabel,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,seclabel,relatime 0 0
/dev/mapper/vg_rhel-lv_root / ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
none /selinux selinuxfs rw,relatime 0 0
devtmpfs /dev devtmpfs rw,seclabel,relatime,size=951024k,nr_inodes=237756,mode=755 0 0
/proc/bus/usb /proc/bus/usb usbfs rw,relatime 0 0
/dev/sda1 /boot ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
none /proc/sys/fs/binfmt_misc binfmt_misc rw,relatime 0 0
cgroup /cgroup/cpuset cgroup rw,relatime,cpuset 0 0
cgroup /cgroup/cpu cgroup rw,relatime,cpu 0 0
cgroup /cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
cgroup /cgroup/memory cgroup rw,relatime,memory 0 0
cgroup /cgroup/devices cgroup rw,relatime,devices 0 0
cgroup /cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /cgroup/net_cls cgroup rw,relatime,net_cls 0 0
cgroup /cgroup/blkio cgroup rw,relatime,blkio 0 0
/etc/auto.misc /misc autofs rw,relatime,fd=7,pgrp=1314,timeout=300,minproto=5,maxproto=5,indirect 0 0
-hosts /net autofs rw,relatime,fd=13,pgrp=1314,timeout=300,minproto=5,maxproto=5,indirect 0 0
---
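
The same check can be scripted rather than read off by eye. This is a minimal Python sketch that looks for a path in the second field of /proc/mounts records; the function name is hypothetical, and the sample text reuses two lines from the listing above. It assumes the path contains no whitespace (which /proc/mounts would escape) and that /var/run is not a symlink to /run on the system being checked.

```python
def is_mount_target(path, mounts_text):
    """Return True if path is the mount target of any /proc/mounts record."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field 2 of each record is the mount target.
        if len(fields) >= 2 and fields[1] == path:
            return True
    return False


# Two records copied from the /proc/mounts listing in this comment.
SAMPLE = """\
proc /proc proc rw,relatime 0 0
/dev/sda1 /boot ext4 rw,seclabel,relatime,barrier=1,data=ordered 0 0
"""

print(is_mount_target("/boot", SAMPLE))                   # True
print(is_mount_target("/var/run/netns/testing", SAMPLE))  # False
```

On a live host the second argument would come from reading /proc/mounts itself.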

looking at the patch for iproute netns support


---
+static int netns_delete(int argc, char **argv)
+{
+       const char *name;
+       char netns_path[MAXPATHLEN];
+
+       if (argc < 1) {
+               fprintf(stderr, "No netns name specified\n");
+               return -1;
+       }
+
+       name = argv[0];
+       snprintf(netns_path, sizeof(netns_path), "%s/%s", NETNS_RUN_DIR, name);
+       umount2(netns_path, MNT_DETACH);
+       if (unlink(netns_path) < 0) {
+               fprintf(stderr, "Cannot remove %s: %s\n",
+                       netns_path, strerror(errno));
+               return -1;
+       }
+       return 0;
+}
---

I really think that umount2 call should check its return value before it goes and does the unlink
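
The suggested ordering can be sketched as follows. This is a Python analogue for illustration only, not the iproute code: `delete_netns` and `libc_umount2` are hypothetical names, and MNT_DETACH is hard-coded to 2 as in <sys/mount.h>. Note that in the reproduction above the mount had already vanished from /proc/mounts and it was the unlink that failed, so it is not clear that checking the unmount result would have changed the outcome.

```python
import ctypes
import os

MNT_DETACH = 2  # value from <sys/mount.h>


def libc_umount2(path, flags):
    """Call umount2(2) through ctypes and raise OSError if it fails."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.umount2(path.encode(), flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)


def delete_netns(path, umount=libc_umount2, unlink=os.unlink):
    """Unmount first; only unlink the file if the unmount succeeded.

    The umount and unlink callables are injectable so the ordering can be
    exercised without root privileges.
    """
    umount(path, MNT_DETACH)  # raises OSError, so unlink is never reached
    unlink(path)
```

Running it for real needs root and a path under /var/run/netns; with the defaults, a failed unmount surfaces as an OSError and the file is left in place.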

Comment 3 Ian Wienand 2014-05-07 05:10:09 UTC
I see upstream is the same, so maybe this highlights an assumption in the upstream code that the rhel kernel breaks [1]?

I couldn't find a lot of discussion, [2] was the only relevant thread

[1] http://git.kernel.org/cgit/linux/kernel/git/shemminger/iproute2.git/tree/ip/ipnetns.c#n377
[2] http://marc.info/?l=linux-netdev&m=137962865905031&w=2

Comment 4 Petr Šabata 2014-05-12 23:42:30 UTC
Hmm, what build are you using?

There was a recent fix related to netns for RHOS4 in bug #1062685.  Could that possibly resolve your issue too?

Comment 5 Ian Wienand 2014-05-13 19:33:06 UTC
Sorry, forgot to paste the route version

---
[root@host03 ~]# rpm -qa | grep iproute
iproute-2.6.32-130.el6ost.netns.3.x86_64
---

so we should have the fix for bug#1062685.

I guess the question is if the behaviour in comment#2 is a bug or a feature...

Comment 6 Pavel Šimerda (pavlix) 2015-04-20 13:15:35 UTC
(In reply to Ian Wienand from comment #5)
> Sorry, forgot to paste the route version
> 
> ---
> [root@host03 ~]# rpm -qa | grep iproute
> iproute-2.6.32-130.el6ost.netns.3.x86_64
> ---
> 
> so we should have the fix for bug#1062685.

So what's the current status of the bug report from your point of view?

> I guess the question is if the behaviour in comment#2 is a bug or a
> feature...

I don't have details on desired behavior of busy network namespaces but I don't think iproute can affect that. If you're pursuing a fix/change in kernel behavior, please switch the bug to kernel or start a new bug report.

Comment 7 Phil Sutter 2016-03-24 14:02:28 UTC
Hi,

To me the behaviour seems intentional. Upon 'ip netns del <name>', the mounted netns at /var/run/netns/<name> is first umounted (using MNT_DETACH flag), then unlinked. Both operations will succeed even if another process still runs inside that namespace. In fact, MNT_DETACH explicitly requests to delay the umount in case the mount point is busy until the last user is gone.

After evaluating a few ways to improve the situation, the only way I see for OpenStack is to behave nicely and kill all PIDs returned by 'ip netns pids <name>' before deleting the namespace.

Sadly, one can't get the list of PIDs after deleting the namespace, so there's still a slight chance for a race condition (if a new process is spawned inside the NS in between killing old ones and deleting it). Though I also couldn't find a way to keep the PIDs list available after NS removal since on RHEL7 at least even a regular umount (without MNT_DETACH) succeeds if there are still processes running inside.

Is this a possible workaround to the observed behaviour?

Thanks, Phil
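
The workaround described in this comment could look roughly like the Python sketch below. It assumes an iproute build whose `ip` supports `netns pids`, which may not be true of the older package named in this report, and it assumes the caller is root. The function names are hypothetical; the command runner and kill function are injectable so the logic can be exercised without touching real namespaces. It sends SIGTERM and does not wait for the processes to exit, and the race mentioned above (a new process spawned between the kill and the delete) still applies.

```python
import os
import signal
import subprocess


def netns_pids(name, run=subprocess.check_output):
    """Return the PIDs that 'ip netns pids <name>' reports."""
    out = run(["ip", "netns", "pids", name])
    return [int(tok) for tok in out.decode().split()]


def kill_then_delete(name, run=subprocess.check_output, kill=os.kill):
    """Signal every process in the namespace, then delete the namespace."""
    for pid in netns_pids(name, run):
        try:
            kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            pass  # process exited between listing and signalling
    run(["ip", "netns", "delete", name])
```

A caller would invoke `kill_then_delete("testing")` in place of a bare `ip netns del testing`.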

Comment 8 Ian Wienand 2016-03-30 22:34:52 UTC
Honestly, in the (almost) 2 years since I filed this, Neutron has changed so much I have no idea.

I just tried this on a F23 box and i guess it works as you would expect now ... if you create a ns in one window and run something, then delete it in another, there's nothing left in /var/run/netns but the other process seems to keep running.

Comment 9 Phil Sutter 2016-03-31 10:14:34 UTC
Hi Ian,

(In reply to Ian Wienand from comment #8)
> Honestly, in the (almost) 2 years since I filed this, Neutron has changed so
> much I have no idea.
> 
> I just tried this on a F23 box and i guess it works as you would expect now
> ... if you create a ns in one window and run something, then delete it in
> another, there's nothing left in /var/run/netns but the other process seems
> to keep running.

Thanks for your reply. Assuming the problem is not relevant anymore, I'm closing this ticket now. Feel free to reopen in case you stumble upon it again.

Thanks, Phil