Bug 2023740

Summary: autofs: send FAIL cmd/ioctl mess when encountering problems with mount trigger

Product: Red Hat Enterprise Linux 8
Component: autofs
Version: 8.4
Hardware: x86_64
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Reporter: Roberto Bergantinos <rbergant>
Assignee: Ian Kent <ikent>
QA Contact: Kun Wang <kunwan>
CC: xzhou
Keywords: Triaged
Target Milestone: rc
Fixed In Version: autofs-5.1.4-75.el8
Type: Bug
Last Closed: 2022-05-10 15:29:06 UTC
Bug Blocks: 2028301

Description Roberto Bergantinos 2021-11-16 12:24:05 UTC
Description of problem:

On the thread spawned for do_mount_direct, if we encounter a problem with the mount trigger (e.g. it's already mounted),
we follow an incorrect sequence of READY/CLOSE/FAIL commands/ioctls, leading to EINVAL/EBADF errors on the FAIL commands:

Nov 16 13:04:24 rhel8 automount[35232]: do_mount_direct: direct trigger not valid or already mounted /direct_root
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_ready: token = 41
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_fail: token = 41
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Bad file descriptor

Nov 16 13:06:34 rhel8 automount[38257]: do_mount_direct: direct trigger not valid or already mounted /direct_root
Nov 16 13:06:34 rhel8 automount[38257]: dev_ioctl_send_ready: token = 44
Nov 16 13:06:34 rhel8 automount[38257]: handle_packet: type = 5
Nov 16 13:06:34 rhel8 automount[38257]: handle_packet_missing_direct: token 45, name /direct_root, request pid 38450
Nov 16 13:06:34 rhel8 automount[38257]: dev_ioctl_send_fail: token = 44
Nov 16 13:06:34 rhel8 automount[38257]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Invalid argument


And sometimes commands hang; see the reproducer below.

#### REPRODUCER ####

Originally reproduced on 7.9, but it's also reproducible on RHEL 8:

1) autofs configured with a simple direct map:

[root@rhel8 ~]# rpm -q autofs ; uname -r
autofs-5.1.4-48.el8.x86_64
4.18.0-305.7.1.el8_4.x86_64

[root@rhel8 ~]# automount -m | grep -A2 auto.direct
  map: /etc/auto.direct

  /direct_root | /	rhel73:/share 	/mydir	rhel73:/otherfs

2) install the docker framework and create/run a container:

[root@rhel8 ~]# yum -y install docker

[root@rhel8 ~]# docker run -d  --name="prometheus-node-exporter" --restart="always" --pid="host"   -v "/proc:/host/proc:ro" -v "/sys:/host/sys:ro" -v "/:/rootfs:ro" -v "/etc/hostname:/etc/host_hostname:ro"  --network="bridge" quay.io/prometheus/node-exporter:v0.17.0   --path.rootfs /rootfs --path.procfs /host/proc --path.sysfs /host/sys --collector.filesystem.ignored-mount-points "^/(sys|proc|dev|host|etc|rootfs\/var\/lib\/docker)($|/)"


[root@rhel8 ~]# docker ps
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
CONTAINER ID  IMAGE                                     COMMAND               CREATED        STATUS            PORTS       NAMES
1af2cf9409b7  quay.io/prometheus/node-exporter:v0.17.0  --path.rootfs /ro...  4 minutes ago  Up 4 minutes ago              prometheus-node-exporter


3) trigger the direct mount:

[root@rhel8 ~]# cd /direct_root
[root@rhel8 direct_root]# grep nfs /proc/mounts 
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw,relatime 0 0
rhel73:/share /direct_root nfs4 rw,relatime,vers=4.2,rsize=262144,wsize=262144,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=172.23.1.193,local_lock=none,addr=172.23.1.225 0 0


4) try to access it from the container too:

[root@rhel8 direct_root]# docker exec -it prometheus-node-exporter ls /rootfs/direct_root
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
ls: can't open '/rootfs/direct_root': Too many levels of symbolic links


and (userland autofs debug output):

Nov 16 13:04:24 rhel8 automount[35232]: handle_packet_missing_direct: token 4, name /direct_root, request pid 37843
Nov 16 13:04:24 rhel8 automount[35232]: do_mount_direct: direct trigger not valid or already mounted /direct_root       <<<<-------
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_ready: token = 4
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_fail: token = 4
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Bad file descriptor  <<-----
Nov 16 13:04:24 rhel8 automount[35232]: handle_packet: type = 5
Nov 16 13:04:24 rhel8 automount[35232]: handle_packet_missing_direct: token 5, name /direct_root, request pid 37843
Nov 16 13:04:24 rhel8 automount[35232]: do_mount_direct: direct trigger not valid or already mounted /direct_root
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_ready: token = 5
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_fail: token = 5
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Bad file descriptor
Nov 16 13:04:24 rhel8 automount[35232]: handle_packet: type = 5
Nov 16 13:04:24 rhel8 automount[35232]: handle_packet_missing_direct: token 6, name /direct_root, request pid 37843
Nov 16 13:04:24 rhel8 automount[35232]: do_mount_direct: direct trigger not valid or already mounted /direct_root
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_ready: token = 6
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_fail: token = 6
Nov 16 13:04:24 rhel8 automount[35232]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Bad file descriptor
Nov 16 13:04:24 rhel8 automount[35232]: handle_packet: type = 5

or sometimes the command hangs:

[root@rhel8 direct_root]# docker exec -it prometheus-node-exporter ls /rootfs/direct_root
<<--- hang

[root@rhel8 ~]# ps -elfL | awk '($13 ~ /autofs/){print $0}'
4 D nobody     38450   38439   38450  0    1  80   0 -  2584 autofs 13:06 pts/0    00:00:00 ls /rootfs/direct_root
[root@rhel8 ~]# cat /proc/38450/stack
[<0>] autofs_wait+0x320/0x7a0
[<0>] autofs_mount_wait+0x49/0xf0
[<0>] autofs_d_automount+0xe6/0x1f0
[<0>] follow_managed+0x17b/0x2f0
[<0>] lookup_fast+0x131/0x2a0
[<0>] path_openat+0x138/0x14f0
[<0>] do_filp_open+0x93/0x100
[<0>] do_sys_open+0x184/0x220
[<0>] do_syscall_64+0x5b/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca

and:

Nov 16 13:06:34 rhel8 automount[38257]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Invalid argument
Nov 16 13:06:34 rhel8 automount[38257]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Bad file descriptor
Nov 16 13:06:34 rhel8 automount[38257]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Bad file descriptor

even other commands hang from this point on:

[root@rhel8 ~]# df &
[1] 38771
[root@rhel8 ~]# ps -elfL | awk '($13 ~ /autofs/){print $0}'
4 D nobody     38450   38439   38450  0    1  80   0 -  2584 autofs 13:06 pts/0    00:00:00 ls /rootfs/direct_root
0 D root       38771    2363   38771  0    1  80   0 -  1836 autofs 13:09 pts/1    00:00:00 df
[root@rhel8 ~]# cat /proc/38771/stack
[<0>] autofs_wait+0x320/0x7a0
[<0>] autofs_mount_wait+0x49/0xf0
[<0>] autofs_d_manage+0x85/0x1a0
[<0>] follow_managed+0xaf/0x2f0
[<0>] lookup_fast+0x131/0x2a0
[<0>] walk_component+0x48/0x2f0
[<0>] path_lookupat.isra.49+0x75/0x200
[<0>] filename_lookup.part.65+0xa0/0x170
[<0>] vfs_statx+0x74/0xd0
[<0>] __do_sys_newstat+0x39/0x70
[<0>] do_syscall_64+0x5b/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca


#### REPRODUCER ####

#### ANALYSIS ####

The fundamental problem I see, for the direct mount case (and maybe for the indirect case too), is this issue with the FAIL
commands/ioctls when we encounter a problem with the trigger:


Nov 12 17:04:09 rhel77 automount[24185]: dev_ioctl_send_fail: token = 10
Nov 12 17:04:09 rhel77 automount[24185]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Invalid argument

or 

Nov 12 17:04:09 rhel77 automount[24185]: dev_ioctl_send_fail: token = 11
Nov 12 17:04:09 rhel77 automount[24185]: dev_ioctl_send_fail:499: AUTOFS_DEV_IOCTL_FAIL: error Bad file descriptor

In userland, the thread spawned to run do_mount_direct installs a cleanup ("push") routine, mount_send_fail,
which is triggered when the thread exits in either of the two following situations:

1134 static void *do_mount_direct(void *arg)
...
1159         pthread_cleanup_push(mount_send_fail, &mt);
...
1169         status = fstat(mt.ioctlfd, &st);
1170         if (status == -1) {
1171                 error(ap->logopt,
1172                       "can't stat direct mount trigger %s", mt.name);
1173                 ops->send_fail(ap->logopt,
1174                                mt.ioctlfd, mt.wait_queue_token, -ENOENT);
1175                 ops->close(ap->logopt, mt.ioctlfd);  <<<<---- close #1 (for can't stat error)
1176                 pthread_setcancelstate(state, NULL);
1177                 pthread_exit(NULL);
1178         }
1179 
1180         status = stat(mt.name, &st);
1181         if (status != 0 || !S_ISDIR(st.st_mode) || st.st_dev != mt.dev) {
1182                 error(ap->logopt,
1183                      "direct trigger not valid or already mounted %s",
1184                      mt.name);
1185                 ops->send_ready(ap->logopt, mt.ioctlfd, mt.wait_queue_token);
1186                 ops->close(ap->logopt, mt.ioctlfd);  <<<<---- close #1 (for invalid or mounted trigger)
1187                 pthread_setcancelstate(state, NULL);
1188                 pthread_exit(NULL);
1189         }

But then on mount_send_fail :


1125 static void mount_send_fail(void *arg)
1126 {
1127         struct ioctl_ops *ops = get_ioctl_ops();
1128         struct pending_args *mt = arg;
1129         struct autofs_point *ap = mt->ap;
1130         ops->send_fail(ap->logopt, mt->ioctlfd, mt->wait_queue_token, -ENOENT); <<<<---- but this fd is already closed!
1131         ops->close(ap->logopt, mt->ioctlfd); <<<<---- close #2 on an already-closed fd
1132 }

The requesting pid will then retry, sometimes ending in a hang and sometimes in ELOOP, depending (I guess) on how it races
with the automount threads processing the commands.

This patch seems to do the trick; I don't see the point of sending both READY and FAIL commands for the invalid/already-mounted case:

diff --git a/daemon/direct.c b/daemon/direct.c
index 8c629dd..838f66b 100644
--- a/daemon/direct.c
+++ b/daemon/direct.c
@@ -1170,9 +1170,6 @@ static void *do_mount_direct(void *arg)
 	if (status == -1) {
 		error(ap->logopt,
 		      "can't stat direct mount trigger %s", mt.name);
-		ops->send_fail(ap->logopt,
-			       mt.ioctlfd, mt.wait_queue_token, -ENOENT);
-		ops->close(ap->logopt, mt.ioctlfd);
 		pthread_setcancelstate(state, NULL);
 		pthread_exit(NULL);
 	}
@@ -1182,8 +1179,6 @@ static void *do_mount_direct(void *arg)
 		error(ap->logopt,
 		     "direct trigger not valid or already mounted %s",
 		     mt.name);
-		ops->send_ready(ap->logopt, mt.ioctlfd, mt.wait_queue_token);
-		ops->close(ap->logopt, mt.ioctlfd);
 		pthread_setcancelstate(state, NULL);
 		pthread_exit(NULL);
 	}
-- 

Alternatively, skip the send in mount_send_fail and treat the two cases differently, if we still want to send a READY for the second case.

The indirect case does not do this close, but it may still need some consideration.

#### ANALYSIS ####

Version-Release number of selected component (if applicable): autofs-5.1.4-48.el8

How reproducible: 100%


Steps to Reproduce: See above

Actual results:
FAIL ioctls fail with EINVAL/EBADF, and processes accessing the mount can hang in autofs_wait.

Expected results:
A single, correct status is sent to the kernel for each mount attempt, with no spurious FAIL errors and no hangs.


Additional info:

Comment 1 Ian Kent 2021-11-17 03:00:55 UTC
Ok, that's not what I should be doing there.

But there's quite a bit of other stuff in the bug comment so is this
simply requesting I fix to the error handling or is there more about
this you'd like to discuss?

Comment 2 Roberto Bergantinos 2021-11-17 08:24:13 UTC
(In reply to Ian Kent from comment #1)
> Ok, that's not what I should be doing there.
> 
> But there's quite a bit of other stuff in the bug comment so is this
> simply requesting I fix to the error handling or is there more about
> this you'd like to discuss?

Hi Ian

I think we should fix the error handling. It requires a complicated setup to be
triggered, I admit, but I saw it in the field.

About a possible patch: I was confused about sending a READY command for the
"invalid trigger or already mounted" case, since we end up sending FAIL too due to the push routine.
So if we're meant to send the READY cmd, I think we can simply get rid of the
push routine for that case; not sure what you think about that.

rgds
roberto

Comment 3 Ian Kent 2021-11-17 09:18:22 UTC
(In reply to Roberto Bergantinos from comment #2)
> (In reply to Ian Kent from comment #1)
> > Ok, that's not what I should be doing there.
> > 
> > But there's quite a bit of other stuff in the bug comment so is this
> > simply requesting I fix to the error handling or is there more about
> > this you'd like to discuss?
> 
> Hi Ian
> 
> I think we should fix error handling. It requires a complicated setup to be 
> triggered i admit, but i saw it on the field.
> 
> about a possible patch i was confused about sending READY command for
> "invalid trigger or already mounted" case,
> we end up sending FAIL too due to push routine. So if we're ment to send
> READY cmd, i think we can simply get rid
> push routine for that case, not sure what you think about that.

We must send a status to the kernel but it needs to be the correct one.
I definitely shouldn't be sending any status twice for the same mount
attempt, I'll fix that.

Comment 4 Ian Kent 2021-11-17 09:22:22 UTC
(In reply to Roberto Bergantinos from comment #2)
> (In reply to Ian Kent from comment #1)
> > Ok, that's not what I should be doing there.
> > 
> > But there's quite a bit of other stuff in the bug comment so is this
> > simply requesting I fix to the error handling or is there more about
> > this you'd like to discuss?
> 
> Hi Ian
> 
> I think we should fix error handling. It requires a complicated setup to be 
> triggered i admit, but i saw it on the field.

What was being done in the container wasn't right; I doubt it will work
even if the volume option and its parameters were closer to what's needed.

We can talk about that too after the status sending is fixed.

Comment 5 Roberto Bergantinos 2021-11-17 10:02:57 UTC
(In reply to Ian Kent from comment #4)
> (In reply to Roberto Bergantinos from comment #2)
> > (In reply to Ian Kent from comment #1)
> > > Ok, that's not what I should be doing there.
> > > 
> > > But there's quite a bit of other stuff in the bug comment so is this
> > > simply requesting I fix to the error handling or is there more about
> > > this you'd like to discuss?
> > 
> > Hi Ian
> > 
> > I think we should fix error handling. It requires a complicated setup to be 
> > triggered i admit, but i saw it on the field.
> 
> What was being done in the container wasn't right, I doubt that
> will work even if the volume option and it's parameters were closer
> to being what's needed.
> 
> We can talk about that too after the status sending is fixed.

Sure. In the field this was "fixed" by upgrading the container to a newer version, so something was
indeed not right on that side, but I don't know another way to expose the automount error handling problem.

Comment 6 Ian Kent 2021-11-18 02:23:27 UTC
There's a build:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=41376076

It has a patch that I believe will (hopefully) handle the fails properly.
Could you give it a try in your reproducer environment, please?

Keep in mind that automounting in the container with that volume spec
shouldn't actually work, but I think that's what led to discovering
that the error handling isn't ok.

Comment 8 Roberto Bergantinos 2021-11-18 08:41:54 UTC
(In reply to Ian Kent from comment #6)
> There's build:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=41376076
> 
> It has a patch that I believe will handle the fails (hopefully) properly.
> Could you give it a try with your reproducer environment please.
> 
> Keep in mind that automounting in the container with that volume spec
> shouldn't actually work but I think that's what lead to discovering
> the error handling isn't ok.

Tested with the reproducer, I get ELOOP:

[root@rhel8 direct_root]# docker exec -it prometheus-node-exporter ls /rootfs/direct_root
ls: can't open '/rootfs/direct_root': Too many levels of symbolic links

The error handling is correct; I don't see any ioctl/cmd failing:

Nov 18 09:37:49 rhel8 automount[81037]: handle_packet_missing_direct: token 47, name /direct_root, request pid 81184
Nov 18 09:37:49 rhel8 automount[81037]: do_mount_direct: direct trigger not valid or already mounted /direct_root
Nov 18 09:37:49 rhel8 automount[81037]: dev_ioctl_send_ready: token = 47
...
Nov 18 09:37:49 rhel8 automount[81037]: handle_packet_missing_direct: token 85, name /direct_root, request pid 81184
Nov 18 09:37:49 rhel8 automount[81037]: do_mount_direct: direct trigger not valid or already mounted /direct_root
Nov 18 09:37:49 rhel8 automount[81037]: dev_ioctl_send_ready: token = 85

So I think the purpose of fixing the error handling is fulfilled.

Comment 9 Ian Kent 2021-11-19 02:13:41 UTC
(In reply to Roberto Bergantinos from comment #8)
> (In reply to Ian Kent from comment #6)
> > There's build:
> > https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=41376076
> > 
> > It has a patch that I believe will handle the fails (hopefully) properly.
> > Could you give it a try with your reproducer environment please.
> > 
> > Keep in mind that automounting in the container with that volume spec
> > shouldn't actually work but I think that's what lead to discovering
> > the error handling isn't ok.
> 
> tested on reproducer i get ELOOP :
> 
> [root@rhel8 direct_root]# docker exec -it prometheus-node-exporter ls
> /rootfs/direct_root
> ls: can't open '/rootfs/direct_root': Too many levels of symbolic links
> 
> error handling is correct, i don't see ioctl/cmd failing

There are some potential catches with using automounts within containers.

In this case it's probably because the volume in the container is set to
propagation private and automounted mounts aren't propagated to it.

If the mounts don't propagate properly, automount will return success but
the mount won't appear in the container mount namespace, so following the
mount in the container will loop until it gets an ELOOP error. There's not
really anything that can be done about that: the mount has succeeded, it's
just not visible in the mount namespace that triggered it, there's no other
sensible error return there, and the kernel can't know that the mount
hasn't propagated, only that it isn't present.

For a start, that '-v "/:/rootfs:ro"' probably should be '-v "/:/rootfs:ro,slave"'
so that mounts in the host mount namespace propagate to the container
mount namespace only (and not back from the container to the host). But
in this case any mount done in / would propagate, which is usually not
what you want, so having the direct mounts under at least one
higher-level directory would be better, though probably not much better
unless that higher-level directory is a distinct file system. That's
because if the sub-directory is within the / file system, I think it's
still going to be propagating from /.

Consequently, for direct mounts to be isolated in the container and
to work properly, you would need to add -v options for each individual
direct mount map entry and that's usually not workable for direct mount
maps that have anything but a trivial number of map entries or only a
small subset of mounts are needed in the container.

autofs indirect mount maps are a much better choice for container
use because there's a distinct top level mount which can be specified
with the volume option so that only automounts under the autofs managed
directory are seen in the container.

Also mapping / to a container sub-directory is probably not the
best choice.

There's also the paths used, mostly the path used should be the
same in the container as it is in the host.

I'm not certain but I think using a different path in the
container to that in the host for direct mounts will work ok.
That's because the automount map lookup will use the inode number
to lookup an index to get the path then use that to lookup the
map entry. But, at this time, indirect mounts use the path
returned by the kernel to lookup the map entry directly.

If we get demand for it I guess I could change indirect mounts
to use an inode number index too so they are independent of path
but the question then becomes why, "don't you want consistent path
usage everywhere?".

> 
> Nov 18 09:37:49 rhel8 automount[81037]: handle_packet_missing_direct: token
> 47, name /direct_root, request pid 81184
> Nov 18 09:37:49 rhel8 automount[81037]: do_mount_direct: direct trigger not
> valid or already mounted /direct_root
> Nov 18 09:37:49 rhel8 automount[81037]: dev_ioctl_send_ready: token = 47
> ...
> Nov 18 09:37:49 rhel8 automount[81037]: handle_packet_missing_direct: token
> 85, name /direct_root, request pid 81184
> Nov 18 09:37:49 rhel8 automount[81037]: do_mount_direct: direct trigger not
> valid or already mounted /direct_root
> Nov 18 09:37:49 rhel8 automount[81037]: dev_ioctl_send_ready: token = 85
> 
> so i think purpose of fixing error handling is fulfilled

Sounds good, we'll use this patch then, thanks.

Comment 10 Ian Kent 2021-11-19 02:21:22 UTC
(In reply to Ian Kent from comment #9)
> 
> If we get demand for it I guess I could change indirect mounts
> to use an inode number index too so they are independent of path
> but the question then becomes why, "don't you want consistent path
> usage everywhere?".

I must also add that when using inode number to get the map path
there are failure cases that can leave autofs in an unrecoverable
state and so far I've been unable to work out a way to deal with
that.

Comment 11 Ian Kent 2021-12-01 07:54:15 UTC
The tier1 CI test still fails; I'm pretty sure it's because of these:
ls: cannot access '/usr/bin/python2': No such file or directory
/usr/bin/env: 'python': No such file or directory

IIRC a python script is used to account for expected test return variations.
I'll run the CI tests via beaker manually and see what I get.

Comment 18 errata-xmlrpc 2022-05-10 15:29:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (autofs bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2084