Bug 2023740
| Field | Value |
|---|---|
| Summary | autofs: send FAIL cmd/ioctl mess when encountering problems with mount trigger |
| Product | Red Hat Enterprise Linux 8 |
| Component | autofs |
| Version | 8.4 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | unspecified |
| Reporter | Roberto Bergantinos <rbergant> |
| Assignee | Ian Kent <ikent> |
| QA Contact | Kun Wang <kunwan> |
| CC | xzhou |
| Keywords | Triaged |
| Target Milestone | rc |
| Fixed In Version | autofs-5.1.4-75.el8 |
| Clones | 2028301 (view as bug list) |
| Bug Blocks | 2028301 |
| Type | Bug |
| Last Closed | 2022-05-10 15:29:06 UTC |
Description
Roberto Bergantinos
2021-11-16 12:24:05 UTC
Comment 1 (Ian Kent):

Ok, that's not what I should be doing there.

But there's quite a bit of other stuff in the bug comment, so is this simply a request that I fix the error handling, or is there more about this you'd like to discuss?

Comment 2 (Roberto Bergantinos):

(In reply to Ian Kent from comment #1)
> But there's quite a bit of other stuff in the bug comment so is this
> simply requesting I fix to the error handling or is there more about
> this you'd like to discuss?

Hi Ian,

I think we should fix the error handling. It requires a complicated setup to be triggered, I admit, but I have seen it in the field.

About a possible patch: I was confused about sending the READY command for the "invalid trigger or already mounted" case, because we end up sending FAIL too due to the push routine. So if we're meant to send the READY cmd, I think we can simply get rid of the push routine for that case; not sure what you think about that.

Regards,
Roberto

Comment 3 (Ian Kent):

(In reply to Roberto Bergantinos from comment #2)
> about a possible patch i was confused about sending READY command for
> "invalid trigger or already mounted" case,
> we end up sending FAIL too due to push routine.

We must send a status to the kernel, but it needs to be the correct one. I definitely shouldn't be sending any status twice for the same mount attempt; I'll fix that.

Comment 4 (Ian Kent):

(In reply to Roberto Bergantinos from comment #2)
> I think we should fix error handling. It requires a complicated setup to be
> triggered i admit, but i saw it on the field.

What was being done in the container wasn't right; I doubt that will work even if the volume option and its parameters were closer to being what's needed. We can talk about that too after the status sending is fixed.

Comment 5 (Roberto Bergantinos):

(In reply to Ian Kent from comment #4)
> What was being done in the container wasn't right, I doubt that
> will work even if the volume option and it's parameters were closer
> to being what's needed.

Sure. In the field this was "fixed" by upgrading the container to a newer version, so something was indeed not right on that part, but I don't know any other way to expose the automount error handling problem.

Comment 6 (Ian Kent):

There's a build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=41376076

It has a patch that I believe will handle the fails (hopefully) properly. Could you give it a try with your reproducer environment, please?

Keep in mind that automounting in the container with that volume spec shouldn't actually work, but I think that's what led to discovering that the error handling isn't ok.
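The double-status bug discussed above (READY sent, then FAIL sent again by the push routine for the same mount attempt) amounts to a missing exactly-once guard per kernel wait-queue token. As a hypothetical sketch of that invariant in Python (a model only; the real automount daemon is C and reports status through /dev/autofs ioctls):

```python
class MountAttempt:
    """Hypothetical model of one kernel wait-queue token awaiting a status."""

    def __init__(self, token):
        self.token = token
        self.status_sent = None  # becomes "READY" or "FAIL", exactly once

    def send_status(self, status):
        # The kernel expects exactly one READY or FAIL per token; a second
        # send (e.g. READY from the trigger check, then FAIL from the
        # push/cleanup path) is the double-reporting being fixed here.
        if self.status_sent is not None:
            raise RuntimeError(
                f"status {self.status_sent!r} already sent for token {self.token}")
        self.status_sent = status
        return status

attempt = MountAttempt(47)
print(attempt.send_status("READY"))  # first status is accepted
try:
    attempt.send_status("FAIL")      # duplicate status is rejected
except RuntimeError as e:
    print("rejected:", e)
```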
Comment 8 (Roberto Bergantinos):

(In reply to Ian Kent from comment #6)
> It has a patch that I believe will handle the fails (hopefully) properly.
> Could you give it a try with your reproducer environment please.

Tested on the reproducer; I get ELOOP:

```
[root@rhel8 direct_root]# docker exec -it prometheus-node-exporter ls /rootfs/direct_root
ls: can't open '/rootfs/direct_root': Too many levels of symbolic links
```

The error handling is correct; I don't see the ioctl/cmd failing:

```
Nov 18 09:37:49 rhel8 automount[81037]: handle_packet_missing_direct: token 47, name /direct_root, request pid 81184
Nov 18 09:37:49 rhel8 automount[81037]: do_mount_direct: direct trigger not valid or already mounted /direct_root
Nov 18 09:37:49 rhel8 automount[81037]: dev_ioctl_send_ready: token = 47
...
Nov 18 09:37:49 rhel8 automount[81037]: handle_packet_missing_direct: token 85, name /direct_root, request pid 81184
Nov 18 09:37:49 rhel8 automount[81037]: do_mount_direct: direct trigger not valid or already mounted /direct_root
Nov 18 09:37:49 rhel8 automount[81037]: dev_ioctl_send_ready: token = 85
```

So I think the purpose of fixing the error handling is fulfilled.

Comment 9 (Ian Kent):

(In reply to Roberto Bergantinos from comment #8)
> tested on reproducer i get ELOOP :
> ls: can't open '/rootfs/direct_root': Too many levels of symbolic links
> error handling is correct, i don't see ioctl/cmd failing

There are some potential catches with using automounts within containers. In this case it's probably because the volume in the container is set to private propagation, and automounted mounts aren't propagated to it.

If the mounts don't propagate properly, automount will return success, but the mount won't appear in the container's mount namespace, so following the mount in the container will loop until it gets an ELOOP error. There's not really anything that can be done about that: the mount has succeeded, it's just not visible in the mount namespace that triggered it; there's no other sensible error return there, and the kernel can't know that the mount hasn't propagated, only that it isn't present.

That '-v "/:/rootfs:ro"' probably should be '-v "/:/rootfs:ro,slave"', so that mounts in the host mount namespace propagate to the container mount namespace only (and not back from the container to the host), for a start.

But in this case any mount done in / would propagate, which is usually not what you want, so having the direct mounts under at least one higher-level directory would be better. Probably not much better, though, unless that higher-level directory is a distinct file system: if the sub-directory is within the / file system, I think it's still going to be propagating from /.

Consequently, for direct mounts to be isolated in the container and to work properly, you would need to add a -v option for each individual direct mount map entry, and that's usually not workable for direct mount maps that have anything more than a trivial number of map entries, or when only a small subset of the mounts is needed in the container.
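The volume-option change suggested above can be written out as follows. This is a sketch: the trailing `...` stands in for whatever other options the reproducer actually used, which are not shown in this bug.

```
# As in the reproducer (docker's default bind propagation is private, so
# host automounts triggered after container start never appear under /rootfs):
docker run -v "/:/rootfs:ro" ...

# The suggestion: slave propagation, so mounts made in the host mount
# namespace propagate into the container, but container mounts do not
# propagate back out to the host:
docker run -v "/:/rootfs:ro,slave" ...
```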
autofs indirect mount maps are a much better choice for container use because there's a distinct top-level mount which can be specified with the volume option, so that only automounts under the autofs-managed directory are seen in the container. Also, mapping / to a container sub-directory is probably not the best choice.

There are also the paths used: mostly, the path used should be the same in the container as it is in the host. I'm not certain, but I think using a different path in the container from that in the host will work ok for direct mounts. That's because the automount map lookup will use the inode number to look up an index to get the path, then use that to look up the map entry. But, at this time, indirect mounts use the path returned by the kernel to look up the map entry directly.

If we get demand for it, I guess I could change indirect mounts to use an inode number index too, so they are independent of path, but the question then becomes why: "don't you want consistent path usage everywhere?"

> Nov 18 09:37:49 rhel8 automount[81037]: do_mount_direct: direct trigger not
> valid or already mounted /direct_root
> Nov 18 09:37:49 rhel8 automount[81037]: dev_ioctl_send_ready: token = 47
>
> so i think purpose of fixing error handling is fulfilled

Sounds good, we'll use this patch then. Thanks.
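As a sketch of the indirect-map layout described above (the map names, paths, and NFS server here are invented for illustration and are not taken from this bug):

```
# /etc/auto.master: one autofs-managed top-level directory
/shared  /etc/auto.shared

# /etc/auto.shared: indirect map; keys are mounted on demand under /shared
data   -fstype=nfs,ro   server:/export/data
tools  -fstype=nfs,ro   server:/export/tools
```

The container would then need only a single volume option along the lines of `-v /shared:/shared:ro,slave`, keeping the path the same inside and outside the container, so that only automounts under /shared are visible to it.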
Comment 10 (Ian Kent):

(In reply to Ian Kent from comment #9)
> If we get demand for it I guess I could change indirect mounts
> to use an inode number index too so they are independent of path

I must also add that when using the inode number to get the map path, there are failure cases that can leave autofs in an unrecoverable state, and so far I've been unable to work out a way to deal with that.

Comment (Ian Kent):

The tier1 CI test still fails; I'm pretty sure it's because of these:

```
ls: cannot access '/usr/bin/python2': No such file or directory
/usr/bin/env: 'python': No such file or directory
```

IIRC a Python script is used to account for expected test return variations. I'll run the CI tests via beaker manually and see what I get.

Comment (errata close):

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (autofs bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2084