Bug 1731529

Summary: RHEL7: autofs hangs when mounting devpts
Product: Red Hat Enterprise Linux 7
Component: autofs
Version: 7.5
Hardware: All
OS: Linux
Status: CLOSED CANTFIX
Severity: high
Priority: unspecified
Target Milestone: rc
Target Release: ---
Keywords: Regression
Reporter: Jacob Shivers <jshivers>
Assignee: Ian Kent <ikent>
QA Contact: Kun Wang <kunwan>
CC: fsorenso, ikent, smazul, xzhou
Last Closed: 2019-09-03 23:38:43 UTC
Type: Bug

Comment 9 Murphy Zhou 2019-07-22 03:35:41 UTC
Too late for 7.7

Comment 11 Frank Sorenson 2019-07-22 15:19:47 UTC
'slave' appears to do the trick:

/etc/auto.master:
  /-              /etc/auto.fbe -t 60

/etc/auto.fbe:
  /fbe/env1/proc  -fstype=proc    :/proc
  /fbe/env1/sys   -rw,bind        :/sys
  /fbe/env1/tmp   -rw,bind        :/tmp
  /fbe/env1/dev   -rw,rbind :/dev /pts -fstype=devpts :/dev/pts /shm -fstype=tmpfs :/dev/shm

Without 'slave', accessing /fbe/env1/dev/pts hangs as described.

However, this works:
/etc/auto.master:
  /-              /etc/auto.fbe -t 60,slave

# ls -al /fbe/env1/dev/pts
total 0
drwxr-xr-x  2 root root      0 Jun 24 14:29 .
drwxr-xr-x 19 root root   3320 Jun 24 19:30 ..
crw--w----  1 root tty  136, 0 Jul 19 11:31 0
crw--w----  1 root tty  136, 1 Jul  1 12:46 1
crw--w----  1 root tty  136, 2 Jul 19 11:54 2
crw--w----  1 root tty  136, 3 Jul 16 16:03 3
crw--w----  1 root tty  136, 4 Jul 22 10:07 4
crw--w----  1 root tty  136, 5 Jul 22 10:14 5
crw--w----  1 root tty  136, 6 Jul 19 11:28 6
crw--w----  1 root tty  136, 7 Jul 19 14:51 7
crw--w----  1 root tty  136, 8 Jul 22 09:57 8
crw--w----  1 root tty  136, 9 Jul 22 10:00 9
c---------  1 root root   5, 2 Jun 24 14:29 ptmx

(This was tested on upstream, so it should still be verified by the customer.)


I wonder if there might be a way to detect such a hang and fail the operation (AB/BA locking is detected, for example).

Comment 14 Ian Kent 2019-07-23 00:55:15 UTC
(In reply to Frank Sorenson from comment #11)
> I wonder if there might be a way to detect such a hang, and fail the
> operation (AB/BA locking is detected, for example)

The deadlock isn't a lock-based problem.

/nobackup/test/dev  \
       -rw,rbind :/dev \
       /pts -fstype=devpts :/dev/pts \
       /shm -fstype=tmpfs :/dev/shm

This bind mounts /dev onto /nobackup/test/dev, then mounts autofs offsets
on /nobackup/test/dev/pts and /nobackup/test/dev/shm.

The problem occurs because the bind mount causes each autofs offset mount
itself to propagate back to the root, where it appears as /dev/pts and /dev/shm.

Then, when the mount is triggered, automount tries to mount a target that
is itself an autofs trigger, causing a recursive deadlock.

We could detect that and fail, but then the problem is the same as in the
original report: certain bind mounts don't work any more!

The only way to resolve it is to prevent the unwanted mount propagation from
happening.

I considered making the propagation of these mounts slave by default, but the
problem only occurs in a limited number of cases and I didn't want to risk causing
an unexpected change in behaviour for other mounts that don't need the change.

Ian

Comment 21 Ian Kent 2019-09-03 23:38:43 UTC
I'm closing this CANTFIX because the problem is due to changes that
have been made which essentially make the kernel behave as it is
supposed to. The autofs fs mount option "slave" or "private" can be
used to change the kernel behaviour to what's needed.
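
To confirm which propagation type a mount ended up with after adding "slave" or
"private", the optional fields in /proc/self/mountinfo can be inspected. A
minimal sketch, assuming the usual tags (shared:N, master:N, unbindable, and no
tag meaning private); the function name and the SAMPLE text are hypothetical,
and SAMPLE is hand-written to exercise the parser rather than taken from a real
system:

```python
def propagation_by_mount_point(mountinfo_text):
    """Map each mount point to a list of propagation labels, one per mount.

    A list is used because several mounts can be stacked on one mount point.
    Octal escapes in mount points (such as \\040) are left undecoded here.
    """
    result = {}
    for line in mountinfo_text.splitlines():
        # Optional fields sit between the mount options (field 6) and " - ".
        left, sep, _ = line.partition(" - ")
        if not sep:
            continue
        fields = left.split()
        if len(fields) < 6:
            continue
        labels = []
        for tag in fields[6:]:
            if tag.startswith("shared:"):
                labels.append("shared")
            elif tag.startswith("master:"):
                labels.append("slave")
            elif tag == "unbindable":
                labels.append("unbindable")
        # No propagation tag at all means the mount is private.
        result.setdefault(fields[4], []).append(",".join(labels) or "private")
    return result


# Hypothetical, hand-written sample for illustration only.
SAMPLE = """\
21 20 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw
60 20 0:5 / /fbe/env1/dev rw,nosuid master:2 - devtmpfs devtmpfs rw
61 60 0:40 / /fbe/env1/dev/pts rw,relatime - autofs /etc/auto.fbe rw
"""

for mount_point, labels in propagation_by_mount_point(SAMPLE).items():
    print(mount_point, labels)
```

On a real system the input would be the contents of /proc/self/mountinfo; a
mount that still reports "shared" is one from which mounts made beneath it can
propagate to its peers.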