Bug 2302911 - [RHCS 8.0][NFS-Ganesha] On the upgraded cluster, the export creation is failing with "Error EPERM: Failed to update caps"
Summary: [RHCS 8.0][NFS-Ganesha] On the upgraded cluster, the export creation is failing with "Error EPERM: Failed to update caps"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 8.0
Assignee: avan
QA Contact: Manisha Saini
URL:
Whiteboard:
Depends On:
Blocks: 2317218
 
Reported: 2024-08-05 15:34 UTC by Manisha Saini
Modified: 2024-11-25 09:05 UTC (History)
6 users

Fixed In Version: ceph-19.1.0-61.el9cp
Doc Type: Enhancement
Doc Text:
.Exports sharing the same FSAL block have a single Ceph user client linked to them
Previously, on an upgraded cluster, the export creation failed with the "Error EPERM: Failed to update caps" message. With this enhancement, the user key generation is modified when creating an export so that any exports that share the same Ceph File System Abstraction Layer (FSAL) block will have only a single Ceph user client linked to them. This enhancement also prevents memory consumption issues in NFS Ganesha.
Clone Of:
Environment:
Last Closed: 2024-11-25 09:05:07 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github ceph ceph pull 54277 0 None open mgr/nfs: generate user_id & access_key for apply_export(CephFS) 2024-08-29 06:43:49 UTC
Red Hat Issue Tracker RHCEPH-9616 0 None None None 2024-08-29 06:46:00 UTC
Red Hat Product Errata RHBA-2024:10216 0 None None None 2024-11-25 09:05:12 UTC

Description Manisha Saini 2024-08-05 15:34:09 UTC
Description of problem:
=======================

Ceph + NFS cluster was upgraded from RHCS 7.1 to RHCS 8.0.

Before upgrade
============
On the RHCS 7.1 cluster, 2000 NFS exports were present, and I/O was triggered on those exports before the upgrade.

Post Upgrade to RHCS 8.0
=============
The Ceph cluster was healthy (HEALTH_OK) post upgrade.
Deleted all 2000 exports, the subvolumes, and the subvolume group, and then deleted the NFS cluster.
Created a fresh NFS cluster and created 1 subvolume group and 2000 subvolumes.
From the created subvolumes, tried creating NFS exports.
After the 13th export, export creation failed with the messages below.


==========
[ceph: root@cali013 /]# for i in $(seq 1 20); do path=$(ceph fs subvolume getpath cephfsfilesys subvol$i --group_name nfssubgroup);ceph nfs export create cephfs cephfs-nfs /nfs_vol$i cephfsfilesys --path="$path";done
{
  "bind": "/nfs_vol1",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol1/8057fde6-242f-4aff-9713-365c42e00327"
}
{
  "bind": "/nfs_vol2",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol2/e6978e9f-5d51-4708-b59f-605da646cca4"
}
{
  "bind": "/nfs_vol3",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol3/6693c4f9-0cbd-448a-a550-cb6c83234f2a"
}
{
  "bind": "/nfs_vol4",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol4/5a96fdb2-8248-42e5-b490-d2fc5307f507"
}
{
  "bind": "/nfs_vol5",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol5/244ef2cd-5239-4d48-afd9-9e7f0a7e732d"
}
{
  "bind": "/nfs_vol6",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol6/54615a81-69fb-479b-859b-df3f5d92b69a"
}
{
  "bind": "/nfs_vol7",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol7/5433082c-feb1-4d68-835f-efdd521afa12"
}
{
  "bind": "/nfs_vol8",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol8/33e1ff5d-77f1-4849-8f3f-1603a1233217"
}
{
  "bind": "/nfs_vol9",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol9/f7e8987b-06d4-46fc-b74b-a04a7cdf0bb2"
}
{
  "bind": "/nfs_vol10",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol10/ed4e734c-875c-4dd3-ac1b-b259a52c861a"
}
{
  "bind": "/nfs_vol11",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol11/20732c4a-dd5b-422c-9582-9820d680426d"
}
{
  "bind": "/nfs_vol12",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol12/c7777d62-ed5c-432c-acda-8000734021a7"
}
{
  "bind": "/nfs_vol13",
  "cluster": "cephfs-nfs",
  "fs": "cephfsfilesys",
  "mode": "RW",
  "path": "/volumes/nfssubgroup/subvol13/86e12864-516a-4b02-a9f7-a69b410450fd"
}
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
======================

# ceph nfs cluster info cephfs-nfs
{
  "cephfs-nfs": {
    "backend": [
      {
        "hostname": "cali015",
        "ip": "10.8.130.15",
        "port": 12049
      },
      {
        "hostname": "cali016",
        "ip": "10.8.130.16",
        "port": 12049
      }
    ],
    "monitor_port": 9049,
    "port": 2049,
    "virtual_ip": "10.8.130.236"
  }
}

=========
[ceph: root@cali013 /]# ceph fs subvolume getpath cephfsfilesys subvol14 --group_name nfssubgroup
/volumes/nfssubgroup/subvol14/02cfb612-35da-4f6f-be02-6d1a9e168183
[ceph: root@cali013 /]# ceph nfs export create cephfs cephfs-nfs /nfs_vol14 cephfsfilesys --path="/volumes/nfssubgroup/subvol14/02cfb612-35da-4f6f-be02-6d1a9e168183"
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
[ceph: root@cali013 /]# ceph fs subvolume getpath cephfsfilesys subvol15 --group_name nfssubgroup
/volumes/nfssubgroup/subvol15/e2639288-138d-48ca-8cc1-fbeec33266bd
[ceph: root@cali013 /]# ceph nfs export create cephfs cephfs-nfs /nfs_vol15 cephfsfilesys --path="/volumes/nfssubgroup/subvol15/e2639288-138d-48ca-8cc1-fbeec33266bd"
Error EPERM: Failed to update caps for nfs.cephfs-nfs.14: updated caps for client.nfs.cephfs-nfs.14
============
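
For additional context, the auth entity named in the error can be inspected directly; these are generic diagnostic commands and were not part of the original run. The error message suggests the pre-fix behavior creates one client.nfs.<cluster_id>.<export_id> user per export, so listing the entities shows how they accumulate:

# Dump the entity referenced by the error and its current caps, if it exists
ceph auth get client.nfs.cephfs-nfs.14
# List all NFS auth clients created for this cluster
ceph auth ls | grep 'client.nfs.cephfs-nfs'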



ganesha.log
========
Aug 05 15:10:28 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:28 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:10:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Aug 05 15:10:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:10:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reclaim_reset :FSAL :EVENT :start_reclaim failed: No such file or directory
Aug 05 15:12:58 cali015 ceph-4e687a60-638e-11ee-8772-b49691cee574-nfs-cephfs-nfs-0-0-cali015-jkaymm[2490935]: 05/08/2024 15:12:58 : epoch 66b03e84 : cali015 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete

Version-Release number of selected component (if applicable):
=====================
# rpm -qa | grep nfs
libnfsidmap-2.5.4-25.el9.x86_64
nfs-utils-2.5.4-25.el9.x86_64
nfs-ganesha-selinux-5.9-1.el9cp.noarch
nfs-ganesha-5.9-1.el9cp.x86_64
nfs-ganesha-rgw-5.9-1.el9cp.x86_64
nfs-ganesha-ceph-5.9-1.el9cp.x86_64
nfs-ganesha-rados-grace-5.9-1.el9cp.x86_64
nfs-ganesha-rados-urls-5.9-1.el9cp.x86_64

# ceph --version
ceph version 19.1.0-15.el9cp (f552c890eaaac66497a15d2c04b4fc4cab52f209) squid (rc)


How reproducible:
===========
1/1


Steps to Reproduce:
===================
1. Create an RHCS 7.1 cluster and deploy an NFS cluster
2. Create 2000 exports on 2000 subvolumes
3. Mount all the exports on 100 clients and run I/O
4. Upgrade the cluster to RHCS 8.0
5. Delete the exports, subvolumes, and NFS-Ganesha cluster
6. Recreate the NFS-Ganesha cluster and try creating 2000 exports on 2000 subvolumes (see the condensed command sketch below)
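
A condensed sketch of the recreation in steps 5 and 6 (counts, placement, and names are illustrative and taken from this report; the actual run used the loop shown in the description, export/subvolume deletion is not shown, and any ingress/virtual-IP options used on the cluster are omitted):

# Remove the old NFS cluster and recreate it on the same backend hosts
ceph nfs cluster rm cephfs-nfs
ceph nfs cluster create cephfs-nfs "cali015,cali016"
# Recreate the subvolume group and subvolumes, then create one export per subvolume
ceph fs subvolumegroup create cephfsfilesys nfssubgroup
for i in $(seq 1 2000); do
  ceph fs subvolume create cephfsfilesys subvol$i --group_name nfssubgroup
  path=$(ceph fs subvolume getpath cephfsfilesys subvol$i --group_name nfssubgroup)
  ceph nfs export create cephfs cephfs-nfs /nfs_vol$i cephfsfilesys --path="$path"
done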

Actual results:
============
Export creation starts failing after the 13th export.


Expected results:
===========
Export creation should be successful


Additional info:
======
Tested the same on a fresh CephFS file system volume; the test still fails.

Comment 10 errata-xmlrpc 2024-11-25 09:05:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:10216
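
For reference, a hedged way to check the changed behavior described in the Doc Text on a build containing this fix (cluster and file system names taken from this report; the exact user-naming scheme used by the fixed build is not shown here) is to compare the exports against the Ceph auth clients backing them. Exports that share the same CephFS FSAL block should map to a single Ceph user rather than one user per export.

# List the exports in the NFS cluster
ceph nfs export ls cephfs-nfs
# List the NFS-related auth clients; with the fix, far fewer entries than exports are expected
ceph auth ls | grep 'client.nfs'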

