Bug 2044981

Summary: Autofs mounts with --ghost or browse_mode=yes enabled, triggers a mount or shows error "ls: cannot access 'XXXX': No such file or directory" when ls -l is run
Product: Red Hat Enterprise Linux 8 Reporter: Rohan Sable <rsable>
Component: coreutilsAssignee: Kamil Dudka <kdudka>
Status: CLOSED ERRATA QA Contact: Radka Brychtova <rskvaril>
Severity: high Docs Contact:
Priority: unspecified    
Version: 8.5CC: bubrown, gordon.stocks, ikent, jke, jshivers, kdudka, xzhou
Target Milestone: rcKeywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: coreutils-8.30-13.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2066199 (view as bug list) Environment:
Last Closed: 2022-11-08 10:53:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2066199    
Bug Blocks:    

Description Rohan Sable 2022-01-25 12:22:21 UTC
Description of problem:

Autofs mounts with --ghost or browse_mode=yes enabled, triggers a mount or shows error "ls: cannot access 'XXXX': No such file or directory" when ls -l is run

Either errors are seen for mount points which we know are inaccessible for this client or
a mount is triggered for accessible mounts.


Version-Release number of selected component (if applicable):
autofs-5.1.4-74.el8.x86_64
coreutils-8.30-12.el8.x86_64

(however, I am starting the bug with autofs as affected component as discussed with Ian)


How reproducible:

Always


Steps to Reproduce:

1. Upgrade to RHEL 8.5 (which should have autofs-5.1.4-74.el8.x86_64 and coreutils-8.30-12.el8.x86_64)
2. Create an autofs map :
~~~
[root@rsablerhel85 mnt2]# grep -i mnt /etc/auto.master
/mnt2 /etc/auto.indirect timeout=600,bg,tcp,hard,vers=3,rsize=32768,wsize=32768,timeo=600,retrans=6

[root@rsablerhel85 mnt2]# cat /etc/auto.indirect 
testshare rsable76server:/testshare               <<<<< testshare is a valid export from server
testshare2 rsable76server:/testshare2             <<<<< testshare2 is not available to this client or could be a bogus entry
~~~
3. Either use --ghost in auto.master as an option or set browse_mode=yes :
~~~
[root@rsablerhel85 mnt2]# grep -i browse /etc/autofs.conf 
# browse_mode - maps are browsable by default.
browse_mode = yes
~~~
4. Cd to /mnt2 and run ls -l / ll.

Note : this issue occurs irrespective of direct or indirect maps.


Actual results:

Mount is triggered and ll throws ENOENT for testshare2
~~~
[root@rsablerhel85 mnt2]# ll
ls: cannot access 'testshare2': No such file or directory     <<<<< Error
total 0
drwxrwxrwx. 3 1000 1000 15 Jan 17 12:08 testshare             <<<<< mount is triggerd for testshare
d?????????? ? ?    ?     ?            ? testshare2            <<<<< Path we know that is inaccessible throws an error

[root@rsablerhel85 mnt2]# mount | grep -i test
rsable76server:/testshare on /mnt2/testshare type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=192.168.122.58,mountvers=3,mountport=20048,mountproto=tcp,local_lock=none,addr=192.168.122.58)
~~~


Expected results:
Mount should not be trigger and error "ls: cannot access 'testshare2': No such file or directory"
should not be seen.


Additional info:

I think the issue is with a behavior change in coreutils-common-8.30-12.el8.
Reverting back to coreutils-common-8.30-8.el8 this issue goes away :
~~~
[root@rsablerhel85 mnt2]# ll
ls: cannot access 'testshare2': No such file or directory
total 0
drwxrwxrwx. 3 1000 1000 15 Jan 17 12:08 testshare
d?????????? ? ?    ?     ?            ? testshare2

[root@rsablerhel85 mnt2]# dnf downgrade coreutils-8.30-8.el8.x86_64
Downgraded:
  coreutils-8.30-8.el8.x86_64                                     coreutils-common-8.30-8.el8.x86_64                                    

Complete!
[root@rsablerhel85 mnt2]# ll
total 0
drwxrwxrwx. 3 1000 1000 15 Jan 17 12:08 testshare
drwxr-xr-x. 2 root root  0 Jan 21 11:47 testshare2
~~~

I can see that coreutils-common-8.30-12.el8 calls statx while coreutils-common-8.30-8.el8 calls lstat :
~~~
coreutils-8.30-12
3181  12:02:13.828462 getdents64(3, [{d_ino=27279, d_off=1, d_reclen=24, d_type=DT_DIR, d_name="."}, {d_ino=27279, d_off=2, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=27281, d_off=3, d_reclen=32, d_type=DT_DIR, d_name="testshare"}, {d_ino=27280, d_off=4, d_reclen=32, d_type=DT_DIR, d_name="testshare2"}], 32768) = 112 <0.000018>
3181  12:02:14.033318 statx(AT_FDCWD, "testshare2", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW, STATX_MODE|STATX_NLINK|STATX_UID|STATX_GID|STATX_MTIME|STATX_SIZE, 0x7ffc0d6f1c60) = -1 ENOENT (No such file or directory) <0.035781>
~~~
~~~
coreutils-8.30-8
2854  12:01:11.302926 getdents64(3, [{d_ino=27279, d_off=1, d_reclen=24, d_type=DT_DIR, d_name="."}, {d_ino=27279, d_off=2, d_reclen=24, d_type=DT_DIR, d_name=".."}, {d_ino=27281, d_off=3, d_reclen=32, d_type=DT_DIR, d_name="testshare"}, {d_ino=27280, d_off=4, d_reclen=32, d_type=DT_DIR, d_name="testshare2"}], 32768) = 112 <0.000027>
2854  12:01:11.311912 lstat("testshare2", {st_dev=makedev(0, 0x31), st_ino=27280, st_mode=S_IFDIR|0755, st_nlink=2, st_uid=0, st_gid=0, st_blksize=1024, st_blocks=0, st_size=0, st_atime=1642783648 /* 2022-01-21T11:47:28.580732805-0500 */, st_atime_nsec=580732805, st_mtime=1642783648 /* 2022-01-21T11:47:28.580732805-0500 */, st_mtime_nsec=580732805, st_ctime=1642783648 /* 2022-01-21T11:47:28.580732805-0500 */, st_ctime_nsec=580732805}) = 0 <0.000030>
~~~

It seems to me that coreutils-8.30-12 and inherently statx does not pass the flag AT_NO_AUTOMOUNT during this operation.
Checking around a few more it seems that vfs_lstat is just a wrapper to use vfs_statx internally and this explicitly sets AT_NO_AUTOMOUNT :
~~~
3193 static inline int vfs_lstat(const char __user *name, struct kstat *stat)
3194 {
3195         return vfs_statx(AT_FDCWD, name, AT_SYMLINK_NOFOLLOW | AT_NO_AUTOMOUNT,
3196                          stat, STATX_BASIC_STATS);
3197 }
~~~
So it may be just a question of why statx syscall does not use AT_NO_AUTOMOUNT as a flag, unless I am wrong in the last few bits.

Comment 3 Ian Kent 2022-01-26 00:34:42 UTC
Before we go further could you check if this build helps with the problem please:
https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=42539848

Comment 4 Ian Kent 2022-01-26 00:36:19 UTC
(In reply to Ian Kent from comment #3)
> Before we go further could you check if this build helps with the problem
> please:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=42539848

Given our earlier conversation it probably won't but it's worth trying it.

Comment 5 Rohan Sable 2022-01-26 09:54:09 UTC
(In reply to Ian Kent from comment #3)
> Before we go further could you check if this build helps with the problem
> please:
> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=42539848

As you suspected, this _does not_ resolve the issue :

~~~
[root@rsablerhel85 mnt2]# ll
ls: cannot access 'testshare2': No such file or directory
total 0
drwxrwxrwx. 3 1000 1000 15 Jan 17 12:08 testshare
d?????????? ? ?    ?     ?            ? testshare2

[root@rsablerhel85 mnt2]# cd

[root@rsablerhel85 ~]# wget http://download.eng.bos.redhat.com/brewroot/work/tasks/9859/42539859/autofs-5.1.4-79.el8.x86_64.rpm

[root@rsablerhel85 ~]# dnf install autofs-5.1.4-79.el8.x86_64.rpm 
...
Upgraded:
  autofs-1:5.1.4-79.el8.x86_64                                                                                                           

Complete!

[root@rsablerhel85 ~]# systemctl restart autofs
[root@rsablerhel85 ~]# cd /mnt2
[root@rsablerhel85 mnt2]# ll
ls: cannot access 'testshare2': No such file or directory
total 0
drwxrwxrwx. 3 1000 1000 15 Jan 17 12:08 testshare
d?????????? ? ?    ?     ?            ? testshare2

[root@rsablerhel85 mnt2]# mount | grep -i testshare
rsable76server:/testshare on /mnt2/testshare type nfs (rw,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,proto=tcp,timeo=600,retrans=6,sec=sys,mountaddr=192.168.122.58,mountvers=3,mountport=20048,mountproto=tcp,local_lock=none,addr=192.168.122.58)
~~~

Comment 39 errata-xmlrpc 2022-11-08 10:53:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (coreutils bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7758