Bug 1569146

Summary: Processes in mount namespaces hang or fail when accessing automount directories
Product: Red Hat Enterprise Linux 7 Reporter: Frank Sorenson <fsorenso>
Component: kernel Assignee: Ian Kent <ikent>
kernel sub component: AutoFS QA Contact: Kun Wang <kunwan>
Status: CLOSED INSUFFICIENT_DATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: dwysocha, jbyrd, riehecky, tbecker, xzhou
Version: 7.4   
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-01-07 17:17:53 UTC Type: Bug

Description Frank Sorenson 2018-04-18 17:06:17 UTC
Description of problem:

The kernel implementation of autofs does not support mount namespaces, so when a process in a mount namespace accesses an automount directory, it will either fail or hang.

autofs-managed directories do trigger the mount; however, processes inside the mount namespace fail with ELOOP when they access the directory.  Filesystems already mounted (and available to the process) when the mount namespace is created continue to work.

Accesses to systemd-managed directories hang, without mounting or returning an error.


Version-Release number of selected component (if applicable):

seen with multiple kernels... most likely all RHEL 7

details in this bz from kernel 3.10.0-693.11.6.el7.x86_64


How reproducible:

always


Steps to Reproduce:

/etc/auto.master:
/mnt/auto       /etc/auto.test -t 60
 
 
/etc/auto.test:
vm2     vm2:/exports


# unshare --mount /bin/bash

# ls -l /mnt/auto/vm2 | wc -l
ls: cannot open directory /mnt/auto/vm2: Too many levels of symbolic links
0

(** the filesystem is actually mounted, but is not available or visible within the namespace **)

# grep vm2 /proc/$$/mounts /proc/1/mounts | cut -f -3 -d' '
/proc/1/mounts:vm2:/exports /mnt/auto/vm2 nfs4




exit the namespace, and re-enter, and the mounted filesystem is available:
# exit
# unshare --mount /bin/bash

# ls -l /mnt/auto/vm2 | wc -l
50

# grep vm2 /proc/$$/mounts /proc/1/mounts | cut -f -3 -d' '
/proc/6895/mounts:vm2:/exports /mnt/auto/vm2 nfs4
/proc/1/mounts:vm2:/exports /mnt/auto/vm2 nfs4


for systemd-managed autofs:

# umount /proc/sys/fs/binfmt_misc
# ls /proc/sys/fs/binfmt_misc|wc -l
2
# umount /proc/sys/fs/binfmt_misc

# unshare --mount /bin/bash
# ls /proc/sys/fs/binfmt_misc|wc -l
(process hangs)

in another terminal:
# cat /proc/$(pidof ls)/stack
[<ffffffff812948b1>] autofs4_wait+0x341/0x910
[<ffffffff81292bca>] autofs4_mount_wait+0x4a/0xe0
[<ffffffff812935a0>] autofs4_d_automount+0x1a0/0x240
[<ffffffff8120d94c>] follow_managed+0x13c/0x300
[<ffffffff8120e670>] lookup_fast+0x1c0/0x300
[<ffffffff8121194e>] do_last+0x3de/0x12c0
[<ffffffff812128f2>] path_openat+0xc2/0x490
[<ffffffff81214e8b>] do_filp_open+0x4b/0xb0
[<ffffffff812019c3>] do_sys_open+0xf3/0x1f0
[<ffffffff81201af4>] SyS_openat+0x14/0x20
[<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

the filesystem is mounted, but unavailable within the namespace:

# grep binfmt_misc /proc/$(pidof ls)/mounts /proc/1/mounts | cut -f -3 -d' '
/proc/6492/mounts:systemd-1 /proc/sys/fs/binfmt_misc autofs
/proc/1/mounts:systemd-1 /proc/sys/fs/binfmt_misc autofs
/proc/1/mounts:binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc


Actual results:

the filesystem is mounted, but only available outside the namespace.  Within the namespace, attempts to access the filesystem either fail or hang


Expected results:



Additional info:

this is frequently seen from within a docker container (or similar); if the binfmt_misc directory is not mounted when the container is started, processes within the container (or the container itself) can hang

Comment 2 Ian Kent 2018-04-19 02:14:16 UTC
(In reply to Frank Sorenson from comment #0)
> Description of problem:
> 
> The kernel implementation of autofs does not support mount namespaces, so
> when a process in a mount namespace accesses an automount directory, it will
> either fail or hang.

That's not quite right since RHEL-7.4 kernel revision 658.
See bug https://bugzilla.redhat.com/show_bug.cgi?id=1320588.

Comment #65 refers to an article at:
https://access.redhat.com/articles/3104671
which you should read.

> 
> autofs-managed directories do trigger the mount, however inside the mount
> namespace, processes will result in a failure with ELOOP.  Filesystems
> already mounted (and available to the process) when the mount namespace is
> created will continue to work.

Correct, this is what will happen if the user does not set the
appropriate mount propagation, or clean up mounts it doesn't
require, at mount name space creation.

autofs doesn't know that a user wants a mount to propagate to
other mount name spaces, and even if it did it isn't ok for it
to change this.

This is something users or sub-systems that use mount name
spaces must consider as part of mount name space creation. Some
don't, some do, and docker does it a bit differently as well.
The RHEL docker package was broken in this respect the last time
I checked.

> 
> systemd-managed directories hang, without mounting or returning an error

I think that is a systemd problem since systemd uses the kernel
module and is responsible for handling requests it receives from
the autofs module.
 
> 
> 
> Version-Release number of selected component (if applicable):
> 
> seen with multiple kernels... most likely all RHEL 7
> 
> details in this bz from kernel 3.10.0-693.11.6.el7.x86_64
> 
> 
> How reproducible:
> 
> always
> 
> 
> Steps to Reproduce:
> 
> /etc/auto.master:
> /mnt/auto       /etc/auto.test -t 60
>  
>  
> /etc/auto.test:
> vm2     vm2:/exports
> 
> 
> # unshare --mount /bin/bash

This does sound like the created mount name space is propagation
private.

For autofs indirect mounts I believe:
"mount --make-shared /autofs/indirect/mount-point"

done at this point should be enough to permit the automounts
to propagate to this namespace.
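
To confirm what propagation a mount point actually has, the optional
fields in /proc/<pid>/mountinfo can be read directly. The following is
a minimal sketch, not something shipped with autofs or util-linux:
propagation_of is a hypothetical helper name, and it assumes mount
paths without spaces or other escaped characters.

```shell
# Hypothetical helper: print the propagation type of a mount point as
# recorded in a mountinfo file (default: the calling process's view).
# Usage: propagation_of <mount-point> [mountinfo-file]
propagation_of() {
    awk -v mp="$1" '
        $5 == mp {
            found = 1
            state = ""
            # Optional fields sit between field 6 and the "-" separator.
            for (i = 7; i <= NF && $i != "-"; i++) {
                tag = $i
                sub(/:.*/, "", tag)               # "shared:5" -> "shared"
                if (tag == "master") tag = "slave"
                state = (state == "" ? tag : state "," tag)
            }
            # No optional fields means the mount is private.
            if (state == "") state = "private"
        }
        END {
            if (!found) exit 1                    # mount point not listed
            print state
        }
    ' "${2:-/proc/self/mountinfo}"
}
```

For example, "propagation_of /mnt/auto" run inside the new namespace
reports how the autofs mount point is marked there, and
"propagation_of /mnt/auto /proc/1/mountinfo" reports the same for the
init namespace.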

autofs direct mounts are more difficult, we can discuss that
too if there is a need to do so but the short story is you
probably want to use autofs indirect mounts to keep it simple
enough to be maintainable.

> 
> # ls -l /mnt/auto/vm2 | wc -l
> ls: cannot open directory /mnt/auto/vm2: Too many levels of symbolic links
> 0

Assuming /mnt/auto/vm2 is an autofs indirect mount:
mount --make-shared /mnt/auto/vm2
should be enough to make this work.

> 
> (** the filesystem is actually mounted, but is not available or visible
> within the namespace **)

That's what I would expect if the mount name space does not
allow mounts to propagate to it.
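
The grep comparison quoted above can be generalised to list every mount
that exists in one process's view but not in another's. This is only a
sketch: missing_in_namespace is a hypothetical helper name, and it
compares entries by mount point and filesystem type only.

```shell
# Hypothetical helper: list mounts present in a reference mounts file
# but absent from a namespaced process's mounts file.
# Usage: missing_in_namespace <reference-mounts> <namespace-mounts>
missing_in_namespace() {
    awk '
        # First file read is the namespace view: remember what it has.
        FILENAME == ARGV[1] { seen[$2 " " $3] = 1; next }
        # Second file is the reference view: print what was not seen.
        !(($2 " " $3) in seen) { print $1, $2, $3 }
    ' "$2" "$1"
}
```

Running "missing_in_namespace /proc/1/mounts /proc/$$/mounts" from the
shell inside the namespace would be expected to print the
vm2:/exports entry in the failing case and nothing after re-entering
the namespace.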
 
> 
> # grep vm2 /proc/$$/mounts /proc/1/mounts | cut -f -3 -d' '
> /proc/1/mounts:vm2:/exports /mnt/auto/vm2 nfs4
> 
> 
> 
> 
> exit the namespace, and re-enter, and the mounted filesystem is available:
> # exit
> # unshare --mount /bin/bash
> 
> # ls -l /mnt/auto/vm2 | wc -l
> 50
> 
> # grep vm2 /proc/$$/mounts /proc/1/mounts | cut -f -3 -d' '
> /proc/6895/mounts:vm2:/exports /mnt/auto/vm2 nfs4
> /proc/1/mounts:vm2:/exports /mnt/auto/vm2 nfs4

Yes, existing mounts will be cloned at mount name space creation.
Also expected behaviour.

> 
> 
> for systemd-managed autofs:
> 
> # umount /proc/sys/fs/binfmt_misc
> # ls /proc/sys/fs/binfmt_misc|wc -l
> 2
> # umount /proc/sys/fs/binfmt_misc
> 
> # unshare --mount /bin/bash
> # ls /proc/sys/fs/binfmt_misc|wc -l
> (process hangs)
> 
> in another terminal:
> # cat /proc/$(pidof ls)/stack
> [<ffffffff812948b1>] autofs4_wait+0x341/0x910
> [<ffffffff81292bca>] autofs4_mount_wait+0x4a/0xe0
> [<ffffffff812935a0>] autofs4_d_automount+0x1a0/0x240
> [<ffffffff8120d94c>] follow_managed+0x13c/0x300
> [<ffffffff8120e670>] lookup_fast+0x1c0/0x300
> [<ffffffff8121194e>] do_last+0x3de/0x12c0
> [<ffffffff812128f2>] path_openat+0xc2/0x490
> [<ffffffff81214e8b>] do_filp_open+0x4b/0xb0
> [<ffffffff812019c3>] do_sys_open+0xf3/0x1f0
> [<ffffffff81201af4>] SyS_openat+0x14/0x20
> [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
> [<ffffffffffffffff>] 0xffffffffffffffff
> 
> the filesystem is mounted, but unavailable within the namespace:
> 
> # grep binfmt_misc /proc/$(pidof ls)/mounts /proc/1/mounts | cut -f -3 -d' '
> /proc/6492/mounts:systemd-1 /proc/sys/fs/binfmt_misc autofs
> /proc/1/mounts:systemd-1 /proc/sys/fs/binfmt_misc autofs
> /proc/1/mounts:binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc

This case might be more problematic. I'll need more information,
but it is likely to be a result of mount propagation, as it is
above.
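
A quick way to spot this state is to look for autofs trigger mounts
that have nothing mounted on top of them at the same path. The sketch
below assumes a direct-style trigger such as the systemd one quoted
above (an indirect autofs mount point is never covered at its own
path), and uncovered_autofs is a hypothetical helper name. An uncovered
trigger is also normal when the automount is simply idle, so the
result only means something when the init namespace shows the same
path covered.

```shell
# Hypothetical helper: list autofs mount points in a mounts file that
# have no other filesystem mounted at the same path.
# Usage: uncovered_autofs [mounts-file]
uncovered_autofs() {
    awk '
        $3 == "autofs" { trigger[$2] = 1; next }
        { covered[$2] = 1 }
        END {
            for (mp in trigger)
                if (!(mp in covered)) print mp
        }
    ' "${1:-/proc/self/mounts}"
}
```

With the data quoted above, "uncovered_autofs /proc/6492/mounts" would
list /proc/sys/fs/binfmt_misc, while "uncovered_autofs /proc/1/mounts"
would not.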

> 
> 
> Actual results:
> 
> the filesystem is mounted, but only available outside the namespace.  Within
> the namespace, attempts to access the filesystem either fail or hang
> 
> 
> Expected results:
> 
> 
> 
> Additional info:
> 
> this is frequently seen from within a docker container (or similar); if the
> binfmt_misc directory is not mounted when the container is started,
> processes within the container (or the container itself) can hang

The article I referred you to above does talk about the docker
case and talks about what needs to be done.

At the time it was written RHEL-7 docker was broken in this
respect but an upstream docker install worked fine.

Ian