Bug 745782

Summary: Unable to unmount autofs filesystems inside a container
Product: [Fedora] Fedora
Reporter: Daniel Berrangé <berrange>
Component: kernel
Assignee: Ian Kent <ikent>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 16
CC: dhowells, gansalmon, ikent, itamar, jonathan, kernel-maint, madhu.chinakonda
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2011-11-21 11:22:52 UTC
Attachments: Demo unmount inside a container (flags: none)

Description Daniel Berrangé 2011-10-13 12:18:49 UTC
Description of problem:
One of the first things required when setting up a container with a
private root filesystem is to unmount all filesystems inherited from the host
OS.

For most filesystems this works without trouble, but for autofs this is not the
case.

For demonstration, consider an autofs map with one direct mount and one
indirect mount:

# cat /etc/auto.master 
/net -hosts
/-        /etc/auto.marrow

# cat /etc/auto.marrow 
/mnt/demo marrow.example.com:/var/lib/libvirt/images

When autofs initially starts, these mount points are visible:

# grep -E ' (/net|/mnt)' /proc/mounts 
-hosts /net autofs rw,relatime,fd=6,pgrp=2938,timeout=300,minproto=5,maxproto=5,indirect 0 0
/etc/auto.marrow /mnt/demo autofs rw,relatime,fd=12,pgrp=2938,timeout=300,minproto=5,maxproto=5,direct 0 0


The attached demo program creates a container and attempts to unmount the
requested filesystem inside the container.


With a direct mount that has not yet been triggered, it fails:

# ./autofsdemo /mnt/demo
We are the parent
We are the container!
Found mount point 1 /mnt/demo auto.marrow autofs rw,relatime,fd=13,pgrp=1363,timeout=300,minproto=5,maxproto=5,direct
Umount point 1 /mnt/demo
Could not umount /mnt/demo: Operation not permitted
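
The attachment's source is not reproduced in this report. As a rough sketch of how the "Found mount point" lines could be produced, the following walks a mounts file with the glibc getmntent() API and reports every entry stacked on a given mount point. The function name list_mounts_at() is hypothetical and is not taken from the attachment.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Print every entry in `mounts_file` (normally /proc/mounts) whose mount
 * point equals `target`. Returns the number of matches, or -1 if the file
 * cannot be opened. Hypothetical helper, not the attachment's code.
 */
int list_mounts_at(const char *mounts_file, const char *target)
{
    FILE *fp = setmntent(mounts_file, "r");
    struct mntent *ent;
    int count = 0;

    if (fp == NULL)
        return -1;

    while ((ent = getmntent(fp)) != NULL) {
        if (strcmp(ent->mnt_dir, target) != 0)
            continue;
        count++;
        printf("Found mount point %d %s %s %s %s\n", count,
               ent->mnt_dir, ent->mnt_fsname, ent->mnt_type, ent->mnt_opts);
    }

    endmntent(fp);
    return count;
}
```

Calling list_mounts_at("/proc/mounts", "/mnt/demo") would typically list the autofs entry first and any filesystem mounted on top of it afterwards, so the last match is the one to unmount first.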


If I then trigger the mount:

# ls /mnt/demo
debian-6.0.2.1-amd64-netinst.iso  debian6-x86_64.img  f16_x86_64.img  migtest.img  rhel6x86_64.img  sanlock  tck

# grep /mnt/demo /proc/mounts 
auto.marrow /mnt/demo autofs rw,relatime,fd=13,pgrp=1363,timeout=300,minproto=5,maxproto=5,direct 0 0
marrow.gsslab.fab.redhat.com:/var/lib/libvirt/images/ /mnt/demo nfs rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.33.8.114,mountvers=3,mountport=35386,mountproto=udp,local_lock=none,addr=10.33.8.114 0 0

# ./autofsdemo /mnt/demo
We are the parent
We are the container!
Found mount point 1 /mnt/demo auto.marrow autofs rw,relatime,fd=13,pgrp=1363,timeout=300,minproto=5,maxproto=5,direct
Found mount point 2 /mnt/demo marrow.gsslab.fab.redhat.com:/var/lib/libvirt/images/ nfs rw,relatime,vers=3,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.33.8.114,mountvers=3,mountport=35386,mountproto=udp,local_lock=none,addr=10.33.8.114
Umount point 2 /mnt/demo
Could not umount /mnt/demo: Operation not permitted

As shown, I can't even unmount the NFS mount, let alone the autofs mount.

I believe indirect mounts will have similar problems, but I am currently unable to check that, pending resolution of bug 745781.

Attempting to use umount2() with MNT_DETACH also fails, so I can't simply hide the mounts from the container either.
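
For reference, a minimal sketch of the fallback described above: try a plain umount() first, then a lazy detach via umount2() with MNT_DETACH. The helper name umount_or_detach() is an assumption for illustration; the attachment may structure this differently.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/*
 * Try a normal unmount of `path`; if that fails, fall back to a lazy
 * detach with MNT_DETACH. Returns 0 on success, or the errno value of
 * the last failed attempt. Hypothetical helper name.
 */
int umount_or_detach(const char *path)
{
    int err;

    if (umount(path) == 0)
        return 0;
    err = errno;    /* save errno before fprintf() can overwrite it */
    fprintf(stderr, "Could not umount %s: %s\n", path, strerror(err));

    if (umount2(path, MNT_DETACH) == 0)
        return 0;
    err = errno;
    fprintf(stderr, "Could not detach %s: %s\n", path, strerror(err));
    return err;
}
```

If the caller is refused permission to unmount at all, both attempts would be expected to fail with EPERM, which matches the behaviour reported here.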


Version-Release number of selected component (if applicable):
kernel-3.1.0-0.rc9.git0.0.fc16.x86_64
autofs-5.0.6-2.fc16.x86_64

How reproducible:
Always

Steps to Reproduce:
1. See above + attached demo program

Comment 1 Daniel Berrangé 2011-10-13 12:19:40 UTC
Created attachment 527963
Demo unmount inside a container

Comment 3 Josh Boyer 2011-10-13 12:59:40 UTC
Have you sent a query on this upstream at all?

Comment 4 Daniel Berrangé 2011-10-13 13:15:13 UTC
I have mentioned this in private email to Ian Kent a few weeks back.

Comment 5 Josh Boyer 2011-10-13 14:27:21 UTC
Adding Ian and David to CC.

Comment 6 Ian Kent 2011-11-21 03:45:30 UTC
A little more investigation shows that this is where the umount
failure occurs:

SYSCALL_DEFINE2(umount, char __user *, name, int, flags)
{
        .....
        retval = -EPERM;
        if (!capable(CAP_SYS_ADMIN)) {
                printk(KERN_INFO "umount: CAP_SYS_ADMIN check failed\n");
                goto dput_and_out;
        }
        ...
}

The printk() is mine and triggers when I run the test.

It seems to me that root user does not inherit CAP_SYS_ADMIN over
the clone() call. The same test is done during mount so that must
be forbidden as well.

I don't doubt there will be other challenges but we can't even
start to work out what they are if umount and mount are denied
by the CAP_SYS_ADMIN check.
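
A userspace way to look at the same question, without patching the kernel, is to read the effective capability mask from /proc/<pid>/status. The sketch below is an illustration only: the helper name and the DEMO_CAP_SYS_ADMIN macro are made up here, although the value 21 is the CAP_SYS_ADMIN number from <linux/capability.h>.

```c
#include <stdio.h>

/* CAP_SYS_ADMIN is capability number 21 in <linux/capability.h>. */
#define DEMO_CAP_SYS_ADMIN 21

/*
 * Read the "CapEff:" line from a /proc/<pid>/status style file and report
 * whether the CAP_SYS_ADMIN bit is set in the effective set.
 * Returns 1 if set, 0 if clear, -1 if the file or the line is missing.
 */
int has_cap_sys_admin(const char *status_path)
{
    FILE *fp = fopen(status_path, "r");
    char line[256];
    unsigned long long eff;
    int result = -1;

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        /* The mask is printed in hexadecimal after a tab. */
        if (sscanf(line, "CapEff: %llx", &eff) == 1) {
            result = (int)((eff >> DEMO_CAP_SYS_ADMIN) & 1ULL);
            break;
        }
    }

    fclose(fp);
    return result;
}
```

Calling has_cap_sys_admin("/proc/self/status") from inside the cloned child would show what the process holds. Where user namespaces are involved, capabilities are relative to a namespace, so a set bit here does not by itself guarantee that the kernel's check for a particular mount will pass.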

A quick Google search shows that this difficulty is well known,
but I didn't see any sensible way of overcoming it. Mind you, the
posts were dated some months ago, so that may have changed since.

Ian

Comment 7 Daniel Berrangé 2011-11-21 10:09:59 UTC
> It seems to me that root user does not inherit CAP_SYS_ADMIN over
> the clone() call.

This is very odd, capabilities are untouched across clone(), and containers definitely have the ability to mount()/umount() filesystems because in libvirt, by the time we hit the autofs problem, we've already mounted and unmounted many other filesystems since the clone() call.

Comment 8 Daniel Berrangé 2011-11-21 10:18:24 UTC
> This is very odd, capabilities are untouched across clone(), 


Arrrrrrrrrrrggggh, my fault :-(

My demo program has the CLONE_NEWUSER flag set. Prior to 3.x kernels, this was effectively a no-op, but it now has special meaning wrt containers.

Please modify the demo program in this BZ to remove the CLONE_NEWUSER flag, and you should then see the real autofs problem.
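
To make the change concrete, here is a small sketch of building the clone() flags with the user namespace made optional. The demo's actual flag set is not shown in this report, so the helper name and the choice of CLONE_NEWNS | SIGCHLD as the base are assumptions for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <signal.h>

/* Fallback values from the Linux UAPI headers, in case <sched.h> hides them. */
#ifndef CLONE_NEWNS
#define CLONE_NEWNS 0x00020000
#endif
#ifndef CLONE_NEWUSER
#define CLONE_NEWUSER 0x10000000
#endif

/*
 * Build a flags argument for clone(). A private mount namespace is always
 * requested; a new user namespace only when `with_userns` is non-zero.
 * Hypothetical helper: the attachment's real flag set is not shown here.
 */
int container_clone_flags(int with_userns)
{
    int flags = CLONE_NEWNS | SIGCHLD;

    if (with_userns)
        flags |= CLONE_NEWUSER;
    return flags;
}
```

Passing container_clone_flags(0) to clone() corresponds to the corrected demo, where the child stays in the parent's user namespace.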

Comment 9 Daniel Berrangé 2011-11-21 11:22:52 UTC
Ok, after fixing the demo program, it appears that there is *not* any autofs problem in the kernel 3.1 or later.  It must have been fixed sometime between 3.0 (where I originally saw the problem) and 3.1. Sorry for the noise.

Comment 10 Ian Kent 2011-11-21 12:42:51 UTC
(In reply to comment #9)
> Ok, after fixing the demo program, it appears that there is *not* any autofs
> problem in the kernel 3.1 or later.  It must have been fixed sometime between
> 3.0 (where I originally saw the problem) and 3.1. Sorry for the noise.

Yes, there was a change that I believe went into 3.1.0-rc9.

This is only the second reported problem (we had one other) caused
by the initial vfs-automount implementation. That means I'll need
to remember this if (when) I lobby to re-introduce that semantic
behaviour.

Thanks, is there anything else we need to investigate?

Comment 11 Daniel Berrangé 2011-11-21 13:16:12 UTC
WRT F16/rawhide, I believe we're all OK with autofs + LXC now.