.Exports sharing the same FSAL block have a single Ceph user client linked to them
Previously, on an upgraded cluster, export creation failed with an "Error EPERM: Failed to update caps" message.
With this enhancement, the user key generation is modified when creating an export so that any exports that share the same Ceph File System Abstraction Layer (FSAL) block will have only a single Ceph user client linked to them. This enhancement also prevents memory consumption issues in NFS Ganesha.
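As an illustration of the new behavior, the number of NFS-related cephx users can be checked after creating multiple exports against the same file system. The commands below are a sketch only: the NFS cluster name "mynfs", the file system name "myfs", and the subvolume paths are hypothetical, and the exact user name format may vary between releases.

# Hypothetical names: "mynfs" is the NFS cluster, "myfs" is the CephFS file system.
ceph nfs export create cephfs mynfs /export1 myfs --path=/volumes/grp/sv1
ceph nfs export create cephfs mynfs /export2 myfs --path=/volumes/grp/sv2
# With this enhancement, exports that share the same FSAL block are expected to
# reuse one cephx user, so a single NFS client entry (rather than one per export)
# should appear:
ceph auth ls | grep '^client.nfs'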
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:10216
Description of problem:
=======================
A Ceph + NFS cluster was upgraded from RHCS 7.1 to RHCS 8.0.

Before upgrade
==============
On the RHCS 7.1 cluster, 2000 NFS exports were present and I/O was run on those exports before the upgrade.

Post upgrade to RHCS 8.0
========================
The Ceph cluster health was "HEALTHY" after the upgrade. Deleted all 2000 exports, the subvolumes, and the subvolume group, and then deleted the NFS cluster. Created a fresh NFS cluster, then created 1 subvolume group and 2000 subvolumes. Tried creating NFS exports on the created subvolumes. After the 13th export, export creation failed with the messages below.
==========
[ceph: root@cali013 /]# for i in $(seq 1 20); do path=$(ceph fs subvolume getpath cephfsfilesys subvol$i --group_name nfssubgroup);ceph nfs export create cephfs cephfs-nfs /nfs_vol$i cephfsfilesys --path="$path";done
{ "bind": "/nfs_vol1", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol1/8057fde6-242f-4aff-9713-365c42e00327" }
{ "bind": "/nfs_vol2", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol2/e6978e9f-5d51-4708-b59f-605da646cca4" }
{ "bind": "/nfs_vol3", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol3/6693c4f9-0cbd-448a-a550-cb6c83234f2a" }
{ "bind": "/nfs_vol4", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol4/5a96fdb2-8248-42e5-b490-d2fc5307f507" }
{ "bind": "/nfs_vol5", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol5/244ef2cd-5239-4d48-afd9-9e7f0a7e732d" }
{ "bind": "/nfs_vol6", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol6/54615a81-69fb-479b-859b-df3f5d92b69a" }
{ "bind": "/nfs_vol7", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol7/5433082c-feb1-4d68-835f-efdd521afa12" }
{ "bind": "/nfs_vol8", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol8/33e1ff5d-77f1-4849-8f3f-1603a1233217" }
{ "bind": "/nfs_vol9", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol9/f7e8987b-06d4-46fc-b74b-a04a7cdf0bb2" }
{ "bind": "/nfs_vol10", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol10/ed4e734c-875c-4dd3-ac1b-b259a52c861a" }
{ "bind": "/nfs_vol11", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol11/20732c4a-dd5b-422c-9582-9820d680426d" }
{ "bind": "/nfs_vol12", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol12/c7777d62-ed5c-432c-acda-8000734021a7" }
{ "bind": "/nfs_vol13", "cluster": "cephfs-nfs", "fs": "cephfsfilesys", "mode": "RW", "path": "/volumes/nfssubgroup/subvol13/86e12864-516a-4b02-a9f7-a69b410450fd" }
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
(the same "Error EPERM" message was returned 7 times, once for each remaining export attempted by the loop)
======================
# ceph nfs cluster info cephfs-nfs
{
    "cephfs-nfs": {
        "backend": [
            {
                "hostname": "cali015",
                "ip": "10.8.130.15",
                "port": 12049
            },
            {
                "hostname": "cali016",
                "ip": "10.8.130.16",
                "port": 12049
            }
        ],
        "monitor_port": 9049,
        "port": 2049,
        "virtual_ip": "10.8.130.236"
    }
}
=========
[ceph: root@cali013 /]# ceph fs subvolume getpath cephfsfilesys subvol14 --group_name nfssubgroup
/volumes/nfssubgroup/subvol14/02cfb612-35da-4f6f-be02-6d1a9e168183
[ceph: root@cali013 /]# ceph nfs export create cephfs cephfs-nfs /nfs_vol14 cephfsfilesys --path="/volumes/nfssubgroup/subvol14/02cfb612-35da-4f6f-be02-6d1a9e168183"
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
[ceph: root@cali013 /]# ceph fs subvolume getpath cephfsfilesys subvol15 --group_name nfssubgroup
/volumes/nfssubgroup/subvol15/e2639288-138d-48ca-8cc1-fbeec33266bd
[ceph: root@cali013 /]# ceph nfs export create cephfs cephfs-nfs /nfs_vol15 cephfsfilesys --path="/volumes/nfssubgroup/subvol15/e2639288-138d-48ca-8cc1-fbeec33266bd"
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14

ganesha.log
===========
Aug 05 15:10:28 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:28 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:10:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Aug 05 15:10:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
(the same start_reclaim failure was logged repeatedly at 15:12:58)
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete

Version-Release number of selected component (if applicable):
=====================
# rpm -qa | grep nfs
libnfsidmap-2.5.4-25.el9.x86_64
nfs-utils-2.5.4-25.el9.x86_64
nfs-ganesha-selinux-5.9-1.el9cp.noarch
nfs-ganesha-5.9-1.el9cp.x86_64
nfs-ganesha-rgw-5.9-1.el9cp.x86_64
nfs-ganesha-ceph-5.9-1.el9cp.x86_64
nfs-ganesha-rados-grace-5.9-1.el9cp.x86_64
nfs-ganesha-rados-urls-5.9-1.el9cp.x86_64
# ceph --version
ceph version 19.1.0-15.el9cp (f552c890eaaac66497a15d2c04b4fc4cab52f209) squid (rc)

How reproducible:
===========
1/1

Steps to Reproduce:
===================
1. Create an RHCS 7.1 cluster and deploy an NFS cluster.
2. Create 2000 exports on 2000 subvolumes.
3. Mount all the exports on 100 clients and run I/O.
4. Upgrade the cluster to RHCS 8.0.
5. Delete the exports, subvolumes, and NFS-Ganesha cluster.
6. Recreate the NFS-Ganesha cluster and try creating 2000 exports on 2000 subvolumes (a command sketch for this step is included at the end of this report).

Actual results:
============
Export creation starts failing after the 13th export.

Expected results:
===========
Export creation should succeed.

Additional info:
======
Tested the same on a fresh CephFS file system volume; the test still fails.
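Command sketch for step 6 of the reproduction steps. This is a minimal illustration that assumes the same names used above (CephFS volume "cephfsfilesys", subvolume group "nfssubgroup", NFS cluster "cephfs-nfs") and that the CephFS volume and the NFS cluster already exist; the exact commands run during the test may have differed.

# Create the subvolume group, then one subvolume and one export per iteration.
ceph fs subvolumegroup create cephfsfilesys nfssubgroup
for i in $(seq 1 2000); do
    ceph fs subvolume create cephfsfilesys subvol$i --group_name nfssubgroup
    path=$(ceph fs subvolume getpath cephfsfilesys subvol$i --group_name nfssubgroup)
    ceph nfs export create cephfs cephfs-nfs /nfs_vol$i cephfsfilesys --path="$path"
done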