Bug 2116395

Summary: NFS krb5 mount failed as "access denied" after test accessing a same file on krb5 nfs mount with multiple uids simultaneously since sssd-2.7.3-1.el8
Product: Red Hat Enterprise Linux 8 Reporter: Yongcheng Yang <yoyang>
Component: sssdAssignee: Alexey Tikhonov <atikhono>
Status: CLOSED ERRATA QA Contact: Yongcheng Yang <yoyang>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 8.7CC: atikhono, fsorenso, pbrezina, sssd-maint, sssd-qe, steved, xzhou
Target Milestone: rcKeywords: Regression, Reproducer, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: sync-to-jira
Fixed In Version: sssd-2.7.3-4.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2116389 Environment:
Last Closed: 2022-11-08 10:51:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2089955    

Description Yongcheng Yang 2022-08-08 13:08:27 UTC
+++ This bug was initially created as a clone of Bug #2116389 +++

Description of problem:
The NFS kerberos regression testcase of Bug 1419280 starts to fail since sssd-2.7.3-1.el8.

The test scenario is just 
1) set up nfs mount with sec=krb5 (any flavor)
2) access a file or directory on the mount with multiple uids simultaneously

And we can use the scripts provided by Frank Sorenson:
https://bugzilla.redhat.com/show_bug.cgi?id=1419280#c1
https://bugzilla.redhat.com/show_bug.cgi?id=1419280#c2


Version-Release number of selected component (if applicable):
since sssd-2.7.3-1.el8 (maybe the nfs/idmap related packages from http://download.eng.bos.redhat.com/brewroot/packages/sssd/2.7.3/1.el8/x86_64/)

start fail  sssd-2.7.3-1.el8 - https://beaker.engineering.redhat.com/jobs/6893306
latest pass sssd-2.7.2-1.el8 - https://beaker.engineering.redhat.com/jobs/6894849


How reproducible:
reliable


Actual results:
After test with NFSv3 and then trying to mount get failed:

[20:16:37 root@ ~~]# mount -t nfs -o nfsvers=4.0,sec=krb5 kvm-02-guest12.hv2.lab.eng.bos.redhat.com:/exportDir-bz1419280 /mnt/nfsmp-bz1419280
mount.nfs: access denied by server while mounting kvm-02-guest12.hv2.lab.eng.bos.redhat.com:/exportDir-bz1419280


Expected results:
test pass


Additional info:
- This regression also happens in rhel9 sssd-2.7.3-1.el9 but behavior is a bit different.
- I'm not sure if this is sssd issue but as the package sssd version matters please help check what update/change is culprit.

Comment 1 Alexey Tikhonov 2022-08-08 13:14:27 UTC
Most probably duplicate of bz 2109451

Comment 2 Yongcheng Yang 2022-08-11 12:41:39 UTC
(In reply to Alexey Tikhonov from comment #1)
> Most probably duplicate of bz 2109451

Unfortunately not.

I have just checked the version sssd-2.7.3-2.el8 and this problem still exists: https://beaker.engineering.redhat.com/jobs/6903892

Comment 6 Alexey Tikhonov 2022-08-17 10:21:19 UTC
Upstream PR: https://github.com/SSSD/sssd/pull/6305

Comment 7 Alexey Tikhonov 2022-08-19 15:37:28 UTC
Pushed PR: https://github.com/SSSD/sssd/pull/6305

* `master`
    * 4ac93d9c5df59cdb7f397b4467f1c1c4822ff757 - CLIENT:MC: pointer to the context mutex shouldn't be touched
    * 579cc0b266d5f8954bc71cfcd3fe68002d681a5f - CLIENT:MC: -1 is more appropriate initial value for fd
* `sssd-2-7`
    * d386e94ef49d95d7305a3e6578e41a2cf61dfc5c - CLIENT:MC: pointer to the context mutex shouldn't be touched
    * 0eae0862069e4bbbdd87b809193fc873f3003cff - CLIENT:MC: -1 is more appropriate initial value for fd

Comment 15 Alexey Tikhonov 2022-08-25 08:47:10 UTC
# strace -r -f mount -v ibm-x3850x5-03-vm-01.rhts.eng.pek2.redhat.com:/exportDir-bz1419280 /mnt_test/ -o sec=krb5
...
[pid 80180]      0.000366 socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3
[pid 80180]      0.000156 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 80180]      0.000512 connect(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.73.4.115")}, 16) = 0
[pid 80180]      0.000197 getsockname(3, {sa_family=AF_INET, sin_port=htons(36653), sin_addr=inet_addr("10.73.4.115")}, [28 => 16]) = 0
[pid 80180]      0.000131 close(3)      = 0
[pid 80180]      0.000117 write(1, "mount.nfs: trying text-based opt"..., 97mount.nfs: trying text-based options 'sec=krb5,vers=4.2,addr=10.73.4.115,clientaddr=10.73.4.115') = 97
[pid 80180]      0.000116 mount("ibm-x3850x5-03-vm-01.rhts.eng.pek2.redhat.com:/exportDir-bz1419280", "/mnt_test", "nfs", 0, "sec=krb5,vers=4.2,addr=10.73.4.1"...

#strace -r -p 67531   <-- pid of rpc.gssd
[{events=EPOLLIN, data={u32=9, u64=9}}], 32, -1) = 1
    10.570167 read(9, "\10\0\0\0\0\1\0@\0\0\0\0\20\0\0\0clntcf\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000284 inotify_add_watch(9, "nfs/clntcf", IN_CREATE|IN_DELETE) = 61
     0.000273 openat(8, "nfs/clntcf", O_RDONLY) = 386
     0.000271 openat(386, "gssd", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000232 openat(386, "krb5", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000117 close(386)                = 0
     0.000095 read(9, "=\0\0\0\0\1\0\0\0\0\0\0\20\0\0\0idmap\0\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000094 read(9, 0x7ffea8f547f0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
     0.000106 epoll_wait(5, 

# strace -r -f mount
 <unfinished ...>
[pid 80179]     24.966470 <... wait4 resumed>0x7ffd2c573e68, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 80179]      0.000349 --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
[pid 80179]      0.000360 wait4(80180, 0x7ffd2c573e68, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid 80179]      0.003532 --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
[pid 80179]      0.000458 wait4(80180,  <unfinished ...>
[pid 80180]      0.115931 <... mount resumed>) = -1 EACCES (Permission denied)
[pid 80180]      0.000339 --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_KERNEL} ---
[pid 80180]      0.000598 write(2, "mount.nfs: mount(2): Permission "..., 39mount.nfs: mount(2): Permission denied) = 39
[pid 80180]      0.000894 socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3
[pid 80180]      0.000678 bind(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
[pid 80180]      0.000610 connect(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.73.4.115")}, 16) = 0
[pid 80180]      0.000542 getsockname(3, {sa_family=AF_INET, sin_port=htons(42790), sin_addr=inet_addr("10.73.4.115")}, [28 => 16]) = 0
[pid 80180]      0.000512 close(3)      = 0
[pid 80180]      0.000312 write(1, "mount.nfs: trying text-based opt"..., 110mount.nfs: trying text-based options 'sec=krb5,vers=4,minorversion=1,addr=10.73.4.115,clientaddr=10.73.4.115') = 110
[pid 80180]      0.000231 mount("ibm-x3850x5-03-vm-01.rhts.eng.pek2.redhat.com:/exportDir-bz1419280", "/mnt_test", "nfs", 0, "sec=krb5,vers=4,minorversion=1,a"...

#strace -r -p 67531
[{events=EPOLLIN, data={u32=9, u64=9}}], 32, -1) = 1
    24.962145 read(9, "\10\0\0\0\0\1\0@\0\0\0\0\20\0\0\0clntd0\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000230 inotify_add_watch(9, "nfs/clntd0", IN_CREATE|IN_DELETE) = 62
     0.000627 openat(8, "nfs/clntd0", O_RDONLY) = 386
     0.000157 openat(386, "gssd", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000171 openat(386, "krb5", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000080 close(386)                = 0
     0.000071 read(9, 0x7ffea8f547f0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
     0.000099 epoll_wait(5, [{events=EPOLLIN, data={u32=9, u64=9}}], 32, -1) = 1
     0.003985 read(9, "\5\0\0\0\0\1\0@\0\0\0\0\20\0\0\0clntd1\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000099 inotify_add_watch(9, "nfsd4_cb/clntd1", IN_CREATE|IN_DELETE) = 63
     0.000108 openat(8, "nfsd4_cb/clntd1", O_RDONLY) = 386
     0.000128 openat(386, "gssd", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000093 openat(386, "krb5", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000090 close(386)                = 0
     0.000074 read(9, "\10\0\0\0\0\1\0@\0\0\0\0\20\0\0\0clntd2\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000126 inotify_add_watch(9, "nfs/clntd2", IN_CREATE|IN_DELETE) = 64
     0.000125 openat(8, "nfs/clntd2", O_RDONLY) = 386
     0.000098 openat(386, "gssd", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000092 openat(386, "krb5", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000088 close(386)                = 0
     0.000065 read(9, 0x7ffea8f547f0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
     0.000097 epoll_wait(5, [{events=EPOLLIN, data={u32=9, u64=9}}], 32, -1) = 1
     0.063122 read(9, "@\0\0\0\0\2\0\0\0\0\0\0\20\0\0\0info\0\0\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000070 read(9, "@\0\0\0\0\200\0\0\0\0\0\0\0\0\0\0", 4096) = 16
     0.000076 inotify_rm_watch(9, 64)   = -1 EINVAL (Invalid argument)
     0.000075 read(9, "?\0\0\0\0\2\0\0\0\0\0\0\20\0\0\0info\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 48
     0.000106 inotify_rm_watch(9, 63)   = -1 EINVAL (Invalid argument)
     0.000093 read(9, 0x7ffea8f547f0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
     0.000084 epoll_wait(5, [{events=EPOLLIN, data={u32=9, u64=9}}], 32, -1) = 1
     0.049355 read(9, "=\0\0\0\0\2\0\0\0\0\0\0\20\0\0\0idmap\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 96
     0.000066 read(9, 0x7ffea8f547f0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
     0.000077 epoll_wait(5, [{events=EPOLLIN, data={u32=9, u64=9}}], 32, -1) = 1
     0.000085 read(9, ">\0\0\0\0\2\0\0\0\0\0\0\20\0\0\0info\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 64
     0.000084 read(9, 0x7ffea8f547f0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
     0.000054 epoll_wait(5, [{events=EPOLLIN, data={u32=9, u64=9}}], 32, -1) = 1
     0.000347 read(9, "=\0\0\0\0\200\0\0\0\0\0\0\0\0\0\0>\0\0\0\0\200\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000074 inotify_rm_watch(9, 61)   = -1 EINVAL (Invalid argument)
     0.000074 inotify_rm_watch(9, 62)   = -1 EINVAL (Invalid argument)
     0.000101 read(9, 0x7ffea8f547f0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
     0.000073 epoll_wait(5, [{events=EPOLLIN, data={u32=9, u64=9}}], 32, -1) = 1
     0.004998 read(9, "\10\0\0\0\0\1\0@\0\0\0\0\20\0\0\0clntd3\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000117 inotify_add_watch(9, "nfs/clntd3", IN_CREATE|IN_DELETE) = 65
     0.000134 openat(8, "nfs/clntd3", O_RDONLY) = 386
     0.000108 openat(386, "gssd", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000126 openat(386, "krb5", O_RDWR|O_NONBLOCK) = -1 EMFILE (Too many open files)
     0.000095 close(386)                = 0
     0.000076 read(9, "A\0\0\0\0\1\0\0\0\0\0\0\20\0\0\0idmap\0\0\0\0\0\0\0\0\0\0\0", 4096) = 32
     0.000077 read(9, 0x7ffea8f547f0, 4096) = -1 EAGAIN (Resource temporarily unavailable)
     0.000072 epoll_wait(5, 



 lsof | grep rpc\.gssd | wc -l
2117

Comment 25 errata-xmlrpc 2022-11-08 10:51:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (sssd bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7739