Description of problem:
========================
With the latest nfs-ganesha build (5.6-2), cmount_path is a mandatory parameter in the FSAL block. As a result, the export is created by the Ceph orchestrator, but the mount fails on the client because export creation fails on the Ganesha backend with:

"CONFIG :CRIT :Could not create export for (/ganesha1) to (/). 1 (export create error, missing mandatory param) errors found block EXPORT"

This causes inconsistent behaviour between the Ceph orchestrator and the Ganesha backend.

# ceph nfs export ls nfsganesha --detailed
[
  {
    "access_type": "RW",
    "clients": [],
    "cluster_id": "nfsganesha",
    "export_id": 1,
    "fsal": {
      "fs_name": "cephfs",
      "name": "CEPH",
      "user_id": "nfs.nfsganesha.1"
    },
    "path": "/",
    "protocols": [
      4
    ],
    "pseudo": "/ganesha1",
    "security_label": true,
    "squash": "none",
    "transports": [
      "TCP"
    ]
  }
]

ganesha.log
===========
Oct 26 19:25:02 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:02 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[reaper] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Oct 26 19:25:02 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:02 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[reaper] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Oct 26 19:25:02 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:02 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[reaper] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Oct 26 19:25:05 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:05 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[reaper] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] mdcache_fsal_create_export :FSAL :MAJ :Failed to call create_export on underlying FSAL Ceph
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] fsal_cfg_commit :CONFIG :CRIT :Could not create export for (/ganesha1) to (/)
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): Mandatory field, cmount_path is missing from block (FSAL)
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): 1 errors while processing parameters for FSAL
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): Errors found in configuration block FSAL
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): 1 validation errors in block FSAL
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): Errors processing block (FSAL)
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":1): 1 errors while processing parameters for EXPORT
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":1): Errors processing block (EXPORT)
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":1): 1 (export create error, missing mandatory param) errors found block EXPORT

Version-Release number of selected component (if applicable):
==============================================================
# rpm -qa | grep nfs
libnfsidmap-2.5.4-18.el9.x86_64
nfs-utils-2.5.4-18.el9.x86_64
nfs-ganesha-selinux-5.6-2.el9cp.noarch
nfs-ganesha-5.6-2.el9cp.x86_64
nfs-ganesha-rgw-5.6-2.el9cp.x86_64
nfs-ganesha-ceph-5.6-2.el9cp.x86_64
nfs-ganesha-rados-grace-5.6-2.el9cp.x86_64
nfs-ganesha-rados-urls-5.6-2.el9cp.x86_64

How reproducible:
=================
2/2

Steps to Reproduce:
===================
1. Set up Ganesha on Ceph
# ceph nfs cluster info nfsganesha
{
  "nfsganesha": {
    "backend": [
      {
        "hostname": "ceph-doremon-vuy1kt-node2",
        "ip": "10.0.204.144",
        "port": 2049
      },
      {
        "hostname": "ceph-doremon-vuy1kt-node3",
        "ip": "10.0.204.220",
        "port": 2049
      }
    ],
    "virtual_ip": null
  }
}

2. Create a CephFS filesystem
# ceph fs volume ls
[
  {
    "name": "cephfs"
  }
]

3. Create an NFS export using the CephFS filesystem
# ceph nfs export create cephfs nfsganesha /ganesha1 cephfs --path=/
{
  "bind": "/ganesha1",
  "cluster": "nfsganesha",
  "fs": "cephfs",
  "mode": "RW",
  "path": "/"
}

4. Mount the volume on the client
# mount -t nfs -o vers=4 10.0.204.144:/ganesha1 /mnt/ganesha1/
mount.nfs: mounting 10.0.204.144:/ganesha1 failed, reason given by server: No such file or directory

Actual results:
===============
The mount fails because the export was never created by the Ganesha backend.

Expected results:
=================
The mount should succeed.

Additional info:
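For illustration only, the export object the orchestrator writes to RADOS corresponds roughly to the sketch below. This is a hand-written approximation built from the `ceph nfs export ls --detailed` output above, not a dump of the real object, and field names may differ slightly. The point is that nfs-ganesha 5.6-2 treats cmount_path as mandatory inside the FSAL block, while the orchestrator does not emit it, hence the "missing mandatory param" failure:

EXPORT {
    export_id = 1;
    path = "/";
    pseudo = "/ganesha1";
    access_type = "RW";
    squash = "none";
    security_label = true;
    protocols = 4;
    transports = "TCP";
    FSAL {
        name = "CEPH";
        user_id = "nfs.nfsganesha.1";
        filesystem = "cephfs";
        # nfs-ganesha 5.6-2 additionally expects something like:
        #   cmount_path = "/";
        # which the orchestrator-generated export does not contain.
    }
}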
Oops, I actually made an error, cmount_path should not be mandatory... There are also some other issues. I'm working on a patch now.
OK, I fixed this and a patch has been pushed upstream and downstream. You will still need the orchestration changes to consolidate cephfs clients, but the current Ganesha should run on configurations that were set up the old way.
Hi Frank,

We are observing the following behaviour.

Steps performed:
1. Created an NFS cluster on the 6.0 build (17.2.6-148.el9cp)
2. Created an export and mounted it.
3. Upgraded the setup to 7.0 (18.2.0-104.el9cp)
4. Tried to access the mount point:

ls /mnt/nfs on 10.0.205.187 timeout 600
2023-10-30 13:57:39,004 (cephci.cephfs_upgrade.cephfs_post_upgrade_validation) [ERROR] - cephci.ceph.ceph.py:1593 - Error 2 during cmd, timeout 600
2023-10-30 13:57:39,005 (cephci.cephfs_upgrade.cephfs_post_upgrade_validation) [ERROR] - cephci.ceph.ceph.py:1594 - ls: cannot access '/mnt/nfs': Stale file handle

5. Unmounted and tried mounting it again; seeing the same issue as above:

[root@ceph-amk-scale-64mewg-node8 ~]# umount /mnt/nfs/
[root@ceph-amk-scale-64mewg-node8 ~]# mount -t nfs -o port=2049 ceph-amk-scale-64mewg-node6:/export1 /mnt/nfs/
Created symlink /run/systemd/system/remote-fs.target.wants/rpc-statd.service → /usr/lib/systemd/system/rpc-statd.service.
mount.nfs: mounting ceph-amk-scale-64mewg-node6:/export1 failed, reason given by server: No such file or directory

Regards,
Amarnath
(In reply to Frank Filz from comment #3)
> OK, I fixed this and a patch has been pushed upstream and downstream.
>
> You will still need the orchestration changes to consolidate cephfs clients,
> but the current Ganesha should run on configurations that were set up the
> old way.

Hi Frank,

Can this now be moved to ON_QA?
(In reply to Amarnath from comment #4)
> Hi Frank,
>
> We are observing the following behaviour
>
> Steps Performed
> 1. Created nfs cluster on 6.0 build (17.2.6-148.el9cp)
> 2. Created export and mounted it.
> 3. Upgraded the setup to 7.0 (18.2.0-104.el9cp)
> 4. Tried to access the mount point
>
> ls /mnt/nfs on 10.0.205.187 timeout 600
> 2023-10-30 13:57:39,004 (cephci.cephfs_upgrade.cephfs_post_upgrade_validation) [ERROR] - cephci.ceph.ceph.py:1593 - Error 2 during cmd, timeout 600
> 2023-10-30 13:57:39,005 (cephci.cephfs_upgrade.cephfs_post_upgrade_validation) [ERROR] - cephci.ceph.ceph.py:1594 - ls: cannot access '/mnt/nfs': Stale file handle
>
> 5. unmounted and tried mounting it again seeing same issue as above
> [root@ceph-amk-scale-64mewg-node8 ~]# umount /mnt/nfs/
> [root@ceph-amk-scale-64mewg-node8 ~]# mount -t nfs -o port=2049 ceph-amk-scale-64mewg-node6:/export1 /mnt/nfs/
> Created symlink /run/systemd/system/remote-fs.target.wants/rpc-statd.service → /usr/lib/systemd/system/rpc-statd.service.
> mount.nfs: mounting ceph-amk-scale-64mewg-node6:/export1 failed, reason given by server: No such file or directory
>
> Regards,
> Amarnath

Until the fix I posted yesterday gets in, old clusters won't work with the new Ganesha. Once that fix is in, they should work.

But NOTE: we aren't providing a migration path between old exports and new, so if you have old exports (that don't have cmount_path specified), each export will have its own cephfs client (and the attendant memory consumption). The migration path is to remove the old exports and create new ones.
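In practice that recreate step would look roughly like the following, reusing the names from the reproduction in this bug (the exact remove subcommand spelling, "rm" vs "delete", may vary by release), followed by remounting on the client:

# ceph nfs export rm nfsganesha /ganesha1
# ceph nfs export create cephfs nfsganesha /ganesha1 cephfs --path=/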
(In reply to Frank Filz from comment #6)
> Until the fix I posted yesterday gets in, old clusters won't work with the
> new Ganesha. Once that fix is in, they should work.

The nfs-ganesha fixes from yesterday are now in downstream's nfs-ganesha-5.6-3.el9cp and included in the container, rhceph-container-7-113.

https://gitlab.cee.redhat.com/ceph/nfs-ganesha/-/commits/ceph-7.0-rhel-patches :
* CEPH: Fix up cmount_path
* CEPH: Currently client_oc true is broken, force it to false
* V5.6 tag

Thomas
I think this is ready to verify?
Verified this with:

# rpm -qa | grep nfs
libnfsidmap-2.5.4-20.el9.x86_64
nfs-utils-2.5.4-20.el9.x86_64
nfs-ganesha-selinux-5.6-4.el9cp.noarch
nfs-ganesha-5.6-4.el9cp.x86_64
nfs-ganesha-rgw-5.6-4.el9cp.x86_64
nfs-ganesha-ceph-5.6-4.el9cp.x86_64
nfs-ganesha-rados-grace-5.6-4.el9cp.x86_64
nfs-ganesha-rados-urls-5.6-4.el9cp.x86_64

Export creation and mount are successful. Moving this BZ to the verified state.
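For reference, verification re-ran the same sequence as the original reproduction against the fixed build, and it now succeeds (commands and addresses as used earlier in this bug; other environments will differ):

# ceph nfs export create cephfs nfsganesha /ganesha1 cephfs --path=/
# mount -t nfs -o vers=4 10.0.204.144:/ganesha1 /mnt/ganesha1/
# ls /mnt/ganesha1/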
This was a bug with the cmount_path addition for https://bugzilla.redhat.com/show_bug.cgi?id=2239769

I'll add a doc text there, but this doesn't need a separate doc text.
Correct BZ for cmount_path is https://bugzilla.redhat.com/show_bug.cgi?id=2236325
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:7780