Bug 882047 - "ip netns exec" destroys the /sys mounting and causes systemd problem
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: iproute
Version: 18
Assigned To: Petr Šabata
Reported: 2012-11-29 20:57 EST by Etsuji Nakai
Modified: 2013-03-21 20:36 EDT (History)
Doc Type: Bug Fix
Last Closed: 2013-02-20 16:16:20 EST
Type: Bug


Description Etsuji Nakai 2012-11-29 20:57:09 EST
Description of problem:
"ip netns exec" destroys the existing /sys mount tree, after which systemctl can no longer connect to systemd.

Version-Release number of selected component (if applicable):
# rpm -q iproute
iproute-3.6.0-2.fc18.x86_64
# rpm -q systemd
systemd-195-8.fc18.x86_64


How reproducible:
Steps to Reproduce:
1. Check that systemctl works and note the current /sys mounts.

# systemctl
(No error)
# mount | grep /sys
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime,seclabel)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
selinuxfs on /sys/fs/selinux type selinuxfs (rw,relatime)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,seclabel,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpuacct,cpu)
cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/net_cls type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=29,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
configfs on /sys/kernel/config type configfs (rw,relatime)

2. Execute "ip netns exec"
# ip netns add test; ip netns exec test ip link
5: lo: <LOOPBACK> mtu 16436 qdisc noop state DOWN mode DEFAULT 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00

3. The /sys mounts are now broken. Most of the entries listed in step 1 are gone.
# mount | grep /sys
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=29,pgrp=1,timeout=300,minproto=5,maxproto=5,direct)
test on /sys type sysfs (rw,relatime,seclabel)

4. As a result, systemctl fails with the following error:
# systemctl 
Failed to get D-Bus connection: No connection to service manager.
Comment 1 Etsuji Nakai 2012-11-30 20:50:35 EST
By the way, I'm not sure why, but running systemd and then stopping it, as below, makes systemctl work again.
----
$ systemd
Failed to open private bus connection: Unable to autolaunch a dbus-daemon without a $DISPLAY for X11
(Stop it with Ctrl+C)
----
Comment 2 Etsuji Nakai 2012-12-03 23:45:20 EST
I looked into iproute2's source code.

I'm afraid that the "netns exec" part of "ip" is broken.

iproute2-3.6.0/ip/ipnetns.c.orig
============
static int netns_exec(int argc, char **argv)
{
...
        /* Mount a version of /sys that describes the network namespace */
        if (umount2("/sys", MNT_DETACH) < 0) {
                fprintf(stderr, "umount of /sys failed: %s\n", strerror(errno));
                return -1;
        }
        if (mount(name, "/sys", "sysfs", 0, NULL) < 0) {
                fprintf(stderr, "mount of /sys failed: %s\n",strerror(errno));
                return -1;
        }

        /* Setup bind mounts for config files in /etc */
        bind_etc(name);

        if (execvp(cmd, argv + 1)  < 0)
                fprintf(stderr, "exec of %s failed: %s\n",
                        cmd, strerror(errno));
        exit(-1);
}
============

1. This unmounts and remounts "/sys" unconditionally, which breaks the original mount tree under /sys.
2. The remounted /sys, which describes the child network namespace, is left in place after the command exits.

The second point can be confirmed as below:
--------------
Add "parent" device in the parent namespace.
[root@localhost ~]# ls /sys/devices/virtual/net/
lo
[root@localhost ~]# ip link add parent type dummy
[root@localhost ~]# ls /sys/devices/virtual/net/
lo  parent

Add "child" device in the child (test) namespace.
[root@localhost ~]# ip netns add test
[root@localhost ~]# ip netns exec test ip link add child type dummy

In the parent namespace, /sys still shows the child's view. You cannot see the "parent" device there.
[root@localhost ~]# ls /sys/devices/virtual/net/
child  lo
--------------

IMHO, it's better not to remount /sys from netns_exec and to let the executed command take care of it. "ip netns exec" is just a convenient way of tweaking the namespace; if you need more consistent namespace management, you'd be better off using the LXC container tools.
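
As a rough illustration of this proposal (a sketch under stated assumptions, not the code that was eventually shipped), entering the namespace and running the command without touching /sys might look like the following. `netns_path` and `netns_exec_no_remount` are hypothetical names; the sketch assumes that named namespaces are exposed as files under /var/run/netns, and it deliberately leaves out the bind mounts for /etc that the real netns_exec sets up.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumption: directory where named network namespaces are kept. */
#define NETNS_RUN_DIR "/var/run/netns"

/* Build the path of a named namespace; returns -1 if it does not fit. */
int netns_path(char *buf, size_t len, const char *name)
{
    int n = snprintf(buf, len, "%s/%s", NETNS_RUN_DIR, name);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical variant of netns_exec: switch the network namespace and
 * exec the command, leaving every mount (including /sys) untouched.
 * Returns -1 on failure; does not return on success.
 */
int netns_exec_no_remount(const char *name, char **cmd_argv)
{
    char path[4096];
    int fd;

    if (netns_path(path, sizeof(path), name) < 0) {
        fprintf(stderr, "namespace name too long: %s\n", name);
        return -1;
    }
    fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0) {
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
        return -1;
    }
    if (setns(fd, CLONE_NEWNET) < 0) {
        fprintf(stderr, "setns failed: %s\n", strerror(errno));
        close(fd);
        return -1;
    }
    close(fd);

    execvp(cmd_argv[0], cmd_argv);
    fprintf(stderr, "exec of %s failed: %s\n", cmd_argv[0], strerror(errno));
    return -1;
}
```

The trade-off is that the executed command then still sees the parent's /sys, so anything that reads network devices from sysfs would report the parent namespace's interfaces unless it mounts its own sysfs.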
Comment 3 Petr Šabata 2012-12-04 11:06:11 EST
Removing the remount fixes the issue (obviously).

Given the ip-netns(8) manpage, I'd say the current behaviour is intentional.  However, I agree with you that this should probably be handled elsewhere, as 'netns exec' users probably won't want their systemd mounts broken every time...
Comment 4 Petr Šabata 2013-01-10 07:51:01 EST
*** Bug 892927 has been marked as a duplicate of this bug. ***
Comment 5 Petr Šabata 2013-01-10 07:52:21 EST
I'm just checking whether removing this bit might affect anything else.
Comment 6 Petr Šabata 2013-01-16 11:09:18 EST
This was caused by mount changes made in the new mount namespace propagating to the parent, because the /sys mounts were explicitly marked as shared.

A fix that remounts the whole cloned tree as private should be upstream soon.
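
To illustrate the mechanism described in this comment, here is a minimal sketch; it is not the upstream patch. Both function names are hypothetical. `mountinfo_line_is_shared` checks whether one line of /proc/self/mountinfo carries a `shared:N` tag in its optional fields, and `make_mount_tree_private` shows one way to stop propagation: unshare the mount namespace, then recursively mark the cloned tree as private before any umount or mount of /sys. The second function needs CAP_SYS_ADMIN.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/*
 * Return 1 if a /proc/self/mountinfo line describes a shared mount.
 * The optional fields sit between the sixth field and the "-" separator.
 * Lines longer than the local buffer are truncated.
 */
int mountinfo_line_is_shared(const char *line)
{
    char buf[4096];
    char *save = NULL;
    char *tok;
    int field = 0;

    snprintf(buf, sizeof(buf), "%s", line);
    for (tok = strtok_r(buf, " \n", &save); tok != NULL;
         tok = strtok_r(NULL, " \n", &save)) {
        if (++field <= 6)
            continue;               /* fixed fields, not tags */
        if (strcmp(tok, "-") == 0)
            break;                  /* end of optional fields */
        if (strncmp(tok, "shared:", 7) == 0)
            return 1;
    }
    return 0;
}

/*
 * Detach from the parent's mount namespace and mark every mount in the
 * cloned tree as private, so later mount changes stay local.
 * Returns 0 on success, -1 on failure.
 */
int make_mount_tree_private(void)
{
    if (unshare(CLONE_NEWNS) < 0) {
        fprintf(stderr, "unshare failed: %s\n", strerror(errno));
        return -1;
    }
    /* For a propagation change, source and fstype are ignored. */
    if (mount("none", "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0) {
        fprintf(stderr, "making / private failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

Using MS_SLAVE instead of MS_PRIVATE would also keep changes from reaching the parent while still receiving the parent's later mounts; which flag the accepted patch uses is not stated in this report.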
Comment 7 Steve Baker 2013-02-07 18:17:32 EST
I'm blocked on other tasks due to this bug. Is there anything I can help test to move things along?
Comment 8 Petr Šabata 2013-02-08 03:14:25 EST
The patch was accepted upstream; I'll submit an update today.
Comment 9 Fedora Update System 2013-02-08 09:05:26 EST
iproute-3.6.0-6.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/iproute-3.6.0-6.fc18
Comment 10 Etsuji Nakai 2013-02-10 01:09:54 EST
iproute-3.6.0-6.fc18 worked for me. Thanks.
Comment 11 Fedora Update System 2013-02-12 00:07:00 EST
iproute-3.6.0-6.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 12 Steve Baker 2013-02-13 15:45:05 EST
This has worked for me too, thanks very much.
Comment 13 Alan Pevec 2013-02-20 16:16:20 EST
(In reply to comment #11)
> iproute-3.6.0-6.fc18 has been pushed to the Fedora 18 stable repository.  If
> problems still persist, please make note of it in this bug report.

Not sure why Bodhi hasn't changed the status to CLOSED ERRATA?
Comment 14 Fedora Update System 2013-03-06 09:20:51 EST
iproute-3.3.0-6.fc17 has been submitted as an update for Fedora 17.
https://admin.fedoraproject.org/updates/iproute-3.3.0-6.fc17
Comment 15 Fedora Update System 2013-03-21 20:36:32 EDT
iproute-3.3.0-6.fc17 has been pushed to the Fedora 17 stable repository.  If problems still persist, please make note of it in this bug report.
