Description of problem:
=======================
A Ceph + NFS cluster was upgraded from RHCS 7.1 to RHCS 8.0.

Before upgrade
==============
On the RHCS 7.1 cluster, 2000 NFS exports were present and I/O was running on those exports before the upgrade.

Post upgrade to RHCS 8.0
========================
The Ceph cluster health was "HEALTHY" after the upgrade.
All 2000 exports, the subvolumes and the subvolume group were deleted, and the NFS cluster was deleted as well.
A fresh NFS cluster was then created, along with 1 subvolume group and 2000 subvolumes.
NFS export creation was attempted on the new subvolumes. After the 13th export, export creation started failing with the messages below.
==========
[ceph: root@cali013 /]# for i in $(seq 1 20); do path=$(ceph fs subvolume getpath cephfsfilesys subvol$i --group_name nfssubgroup);ceph nfs export create cephfs cephfs-nfs /nfs_vol$i cephfsfilesys --path="$path";done
{
  "bind": "/nfs_vol1",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol1/8057fde6-242f-4aff-9713-365c42e00327"
}
{
  "bind": "/nfs_vol2",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol2/e6978e9f-5d51-4708-b59f-605da646cca4"
}
{
  "bind": "/nfs_vol3",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol3/6693c4f9-0cbd-448a-a550-cb6c83234f2a"
}
{
  "bind": "/nfs_vol4",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol4/5a96fdb2-8248-42e5-b490-d2fc5307f507"
}
{
  "bind": "/nfs_vol5",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol5/244ef2cd-5239-4d48-afd9-9e7f0a7e732d"
}
{
  "bind": "/nfs_vol6",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol6/54615a81-69fb-479b-859b-df3f5d92b69a"
}
{
  "bind": "/nfs_vol7",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol7/5433082c-feb1-4d68-835f-efdd521afa12"
}
{
  "bind": "/nfs_vol8",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol8/33e1ff5d-77f1-4849-8f3f-1603a1233217"
}
{
  "bind": "/nfs_vol9",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol9/f7e8987b-06d4-46fc-b74b-a04a7cdf0bb2"
}
{
  "bind": "/nfs_vol10",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol10/ed4e734c-875c-4dd3-ac1b-b259a52c861a"
}
{
  "bind": "/nfs_vol11",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol11/20732c4a-dd5b-422c-9582-9820d680426d"
}
{
  "bind": "/nfs_vol12",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol12/c7777d62-ed5c-432c-acda-8000734021a7"
}
{
  "bind": "/nfs_vol13",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol13/86e12864-516a-4b02-a9f7-a69b410450fd"
}
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
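The error text (EPERM while the mgr reports it "updated caps" for client.nfs.cephfs-nfs.14) appears to involve updating the caps of an already-existing auth entity. As a diagnostic sketch only (it assumes a stale client.nfs.cephfs-nfs.* entity left over from the previously deleted NFS cluster, which this report does not confirm), the existing entities and their caps can be inspected with standard commands:

# list ganesha client auth entities known to the monitors
ceph auth ls | grep "client.nfs.cephfs-nfs"
# show the current caps for the entity named in the error
ceph auth get client.nfs.cephfs-nfs.14
# confirm which exports were actually created on the new cluster
ceph nfs export ls cephfs-nfs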
======================
# ceph nfs cluster info cephfs-nfs
{
  "cephfs-nfs": {
    "backend": [
      {
        "hostname": "cali015",
        "ip": "10.8.130.15",
        "port": 12049
      },
      {
        "hostname": "cali016",
        "ip": "10.8.130.16",
        "port": 12049
      }
    ],
    "monitor_port": 9049,
    "port": 2049,
    "virtual_ip": "10.8.130.236"
  }
}
=========
[ceph: root@cali013 /]# ceph fs subvolume getpath cephfsfilesys subvol14 --group_name nfssubgroup
/volumes/nfssubgroup/subvol14/02cfb612-35da-4f6f-be02-6d1a9e168183
[ceph: root@cali013 /]# ceph nfs export create cephfs cephfs-nfs /nfs_vol14 cephfsfilesys --path="/volumes/nfssubgroup/subvol14/02cfb612-35da-4f6f-be02-6d1a9e168183"
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
[ceph: root@cali013 /]# ceph fs subvolume getpath cephfsfilesys subvol15 --group_name nfssubgroup
/volumes/nfssubgroup/subvol15/e2639288-138d-48ca-8cc1-fbeec33266bd
[ceph: root@cali013 /]# ceph nfs export create cephfs cephfs-nfs /nfs_vol15 cephfsfilesys --path="/volumes/nfssubgroup/subvol15/e2639288-138d-48ca-8cc1-fbeec33266bd"
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
============
ganesha.log
========
Aug 05 15:10:28 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:28 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:10:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Aug 05 15:10:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Version-Release number of selected component (if applicable):
=============================================================
# rpm -qa | grep nfs
libnfsidmap-2.5.4-25.el9.x86_64
nfs-utils-2.5.4-25.el9.x86_64
nfs-ganesha-selinux-5.9-1.el9cp.noarch
nfs-ganesha-5.9-1.el9cp.x86_64
nfs-ganesha-rgw-5.9-1.el9cp.x86_64
nfs-ganesha-ceph-5.9-1.el9cp.x86_64
nfs-ganesha-rados-grace-5.9-1.el9cp.x86_64
nfs-ganesha-rados-urls-5.9-1.el9cp.x86_64

# ceph --version
ceph version 19.1.0-15.el9cp (f552c890eaaac66497a15d2c04b4fc4cab52f209) squid (rc)

How reproducible:
=================
1/1

Steps to Reproduce:
===================
1. Create an RHCS 7.1 cluster and deploy an NFS cluster
2. Create 2000 exports on 2000 subvolumes
3. Mount all the exports on 100 clients and run I/O
4. Upgrade the cluster to RHCS 8.0
5. Delete the exports, the subvolumes, and the NFS-Ganesha cluster
6. Recreate the NFS-Ganesha cluster and try creating 2000 exports on 2000 subvolumes (a scripted form of steps 5 and 6 is sketched after the Additional info section below)

Actual results:
===============
Export creation starts failing after the 13th export.

Expected results:
=================
Export creation should succeed for all 2000 exports.

Additional info:
================
The same test was repeated on a freshly created CephFS file system volume; it still fails.
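For reference, a scripted form of steps 5 and 6. This is only a sketch that reuses the names already shown in this report (cephfsfilesys, cephfs-nfs, nfssubgroup, subvolN, /nfs_volN); the placement and ingress options on the cluster create command are assumptions chosen to match the cluster info output above and may need adjusting for the actual environment.

# Step 5: tear down the old exports, subvolumes, and NFS-Ganesha cluster
for i in $(seq 1 2000); do
  ceph nfs export rm cephfs-nfs /nfs_vol$i
  ceph fs subvolume rm cephfsfilesys subvol$i --group_name nfssubgroup
done
ceph fs subvolumegroup rm cephfsfilesys nfssubgroup
ceph nfs cluster rm cephfs-nfs

# Step 6: recreate the NFS cluster and retry creating 2000 exports
# (placement hosts and virtual IP below are assumptions based on the cluster info output)
ceph nfs cluster create cephfs-nfs "cali015,cali016" --ingress --virtual_ip 10.8.130.236
ceph fs subvolumegroup create cephfsfilesys nfssubgroup
for i in $(seq 1 2000); do
  ceph fs subvolume create cephfsfilesys subvol$i --group_name nfssubgroup
  path=$(ceph fs subvolume getpath cephfsfilesys subvol$i --group_name nfssubgroup)
  ceph nfs export create cephfs cephfs-nfs /nfs_vol$i cephfsfilesys --path="$path"
done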
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:10216