Bug 2246453 - Export creation is failing with "Failed to call create_export on underlying FSAL Ceph. Mandatory field, cmount_path is missing from block (FSAL)"
Summary: Export creation is failing with -- Failed to call create_export on underlying...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: NFS-Ganesha
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 7.0
Assignee: Frank Filz
QA Contact: Manisha Saini
Docs Contact: Rivka Pollack
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-10-26 23:47 UTC by Manisha Saini
Modified: 2023-12-13 15:24 UTC
CC List: 9 users

Fixed In Version: nfs-ganesha-5.6-4.el9cp; rhceph-container-7-130
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-12-13 15:24:35 UTC
Embargoed:




Links:
Red Hat Issue Tracker RHCEPH-7806 (last updated 2023-10-26 23:49:44 UTC)
Red Hat Product Errata RHBA-2023:7780 (last updated 2023-12-13 15:24:39 UTC)

Description Manisha Saini 2023-10-26 23:47:18 UTC
Description of problem:
============

With the latest nfs-ganesha build (5.6-2), cmount_path is a mandatory parameter in the FSAL block.
As a result, the export is created in the Ceph orchestrator, but the mount fails on the client because export creation on the Ganesha backend fails with "CONFIG :CRIT :Could not create export for (/ganesha1) to (/).
1 (export create error, missing mandatory param) errors found block EXPORT"

This leaves the Ceph orchestrator and the Ganesha backend in an inconsistent state: the orchestrator reports the export, while Ganesha never creates it.


# ceph nfs export ls nfsganesha --detailed
[
  {
    "access_type": "RW",
    "clients": [],
    "cluster_id": "nfsganesha",
    "export_id": 1,
    "fsal": {
      "fs_name": "cephfs",
      "name": "CEPH",
      "user_id": "nfs.nfsganesha.1"
    },
    "path": "/",
    "protocols": [
      4
    ],
    "pseudo": "/ganesha1",
    "security_label": true,
    "squash": "none",
    "transports": [
      "TCP"
    ]
  }
]



ganesha.log
===========
Oct 26 19:25:02 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:02 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[reaper] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90
Oct 26 19:25:02 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:02 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[reaper] nfs_start_grace :STATE :EVENT :grace reload client info completed from backend
Oct 26 19:25:02 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:02 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[reaper] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(0)
Oct 26 19:25:05 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:05 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[reaper] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] sigmgr_thread :MAIN :EVENT :SIGHUP_HANDLER: Received SIGHUP.... initiating export list reload
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] mdcache_fsal_create_export :FSAL :MAJ :Failed to call create_export on underlying FSAL Ceph
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] fsal_cfg_commit :CONFIG :CRIT :Could not create export for (/ganesha1) to (/)
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] reread_exports :CONFIG :EVENT :Reread exports complete
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): Mandatory field, cmount_path is missing from block (FSAL)
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): 1 errors while processing parameters for FSAL
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): Errors found in configuration block FSAL
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): 1 validation errors in block FSAL
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":2): Errors processing block (FSAL)
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":1): 1 errors while processing parameters for EXPORT
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":1): Errors processing block (EXPORT)
Oct 26 19:25:57 ceph-doremon-vuy1kt-node2 ceph-b5e47c0c-744b-11ee-9a56-fa163e427c4e-nfs-nfsganesha-0-0-ceph-doremon-vuy1kt-node2-mgunrc[43596]: 26/10/2023 23:25:57 : epoch 653af549 : ceph-doremon-vuy1kt-node2 : ganesha.nfsd-2[sigmgr] config_errs_to_log :CONFIG :CRIT :Config File ("rados://.nfs/nfsganesha/export-1":1): 1 (export create error, missing mandatory param) errors found block EXPORT
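
The config object referenced in the log ("rados://.nfs/nfsganesha/export-1") can be dumped to see exactly what Ganesha is reading; a minimal diagnostic sketch, with the pool, namespace and object name taken from that rados:// URL:

# rados -p .nfs --namespace nfsganesha get export-1 -

The EXPORT block stored there carries an FSAL block without a cmount_path line, which is the mandatory field the 5.6-2 build now rejects.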

Version-Release number of selected component (if applicable):
================
# rpm -qa | grep nfs
libnfsidmap-2.5.4-18.el9.x86_64
nfs-utils-2.5.4-18.el9.x86_64
nfs-ganesha-selinux-5.6-2.el9cp.noarch
nfs-ganesha-5.6-2.el9cp.x86_64
nfs-ganesha-rgw-5.6-2.el9cp.x86_64
nfs-ganesha-ceph-5.6-2.el9cp.x86_64
nfs-ganesha-rados-grace-5.6-2.el9cp.x86_64
nfs-ganesha-rados-urls-5.6-2.el9cp.x86_64



How reproducible:
===========
2/2

Steps to Reproduce:
============
1. Set up an NFS Ganesha cluster on Ceph

# ceph nfs cluster info nfsganesha
{
  "nfsganesha": {
    "backend": [
      {
        "hostname": "ceph-doremon-vuy1kt-node2",
        "ip": "10.0.204.144",
        "port": 2049
      },
      {
        "hostname": "ceph-doremon-vuy1kt-node3",
        "ip": "10.0.204.220",
        "port": 2049
      }
    ],
    "virtual_ip": null
  }
}
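
For reference, a cluster like the one shown above can be created through the orchestrator along these lines (sketch only; the placement string is an assumption, host names are taken from the output above):

# ceph nfs cluster create nfsganesha "ceph-doremon-vuy1kt-node2,ceph-doremon-vuy1kt-node3"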


2. Create a cephfs filesystem

# ceph fs volume ls
[
    {
        "name": "cephfs"
    }
]


3. Create an NFS export using cephfs filesystem

# ceph nfs export create cephfs nfsganesha /ganesha1 cephfs --path=/
{
  "bind": "/ganesha1",
  "cluster": "nfsganesha",
  "fs": "cephfs",
  "mode": "RW",
  "path": "/"
}

4. Mount the export on the client

# mount -t nfs -o vers=4 10.0.204.144:/ganesha1 /mnt/ganesha1/
mount.nfs: mounting 10.0.204.144:/ganesha1 failed, reason given by server: No such file or directory
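
Since the export is NFSv4-only, one way to confirm that the backend never created it is to mount the server's pseudo-fs root and list it (diagnostic sketch; the /mnt/pseudo mount point is an assumption):

# mkdir -p /mnt/pseudo
# mount -t nfs -o vers=4 10.0.204.144:/ /mnt/pseudo
# ls /mnt/pseudo    # /ganesha1 is expected to be missing while the FSAL error persists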


Actual results:
==========
Mount fails because the export was not created by the Ganesha backend.


Expected results:
=========
Mount should succeed.



Additional info:

Comment 2 Frank Filz 2023-10-27 23:43:29 UTC
Oops, I actually made an error, cmount_path should not be mandatory... There are also some other issues. I'm working on a patch now.

Comment 3 Frank Filz 2023-10-30 17:02:36 UTC
OK, I fixed this and a patch has been pushed upstream and downstream.

You will still need the orchestration changes to consolidate cephfs clients, but the current Ganesha should run on configurations that were set up the old way.

Comment 4 Amarnath 2023-10-30 19:53:31 UTC
Hi Frank,

We are observing the following behaviour

Steps Performed
1. Created nfs cluster on 6.0 build (17.2.6-148.el9cp) 
2. Created export and mounted it.
3. Upgraded the setup to 7.0 (18.2.0-104.el9cp)
4. Tried to access the mount point

ls /mnt/nfs on 10.0.205.187 timeout 600
2023-10-30 13:57:39,004 (cephci.cephfs_upgrade.cephfs_post_upgrade_validation) [ERROR] - cephci.ceph.ceph.py:1593 - Error 2 during cmd, timeout 600
2023-10-30 13:57:39,005 (cephci.cephfs_upgrade.cephfs_post_upgrade_validation) [ERROR] - cephci.ceph.ceph.py:1594 - ls: cannot access '/mnt/nfs': Stale file handle

5. Unmounted and tried mounting it again; seeing the same issue as above.
[root@ceph-amk-scale-64mewg-node8 ~]# umount /mnt/nfs/
[root@ceph-amk-scale-64mewg-node8 ~]# mount -t nfs -o port=2049 ceph-amk-scale-64mewg-node6:/export1 /mnt/nfs/
Created symlink /run/systemd/system/remote-fs.target.wants/rpc-statd.service → /usr/lib/systemd/system/rpc-statd.service.
mount.nfs: mounting ceph-amk-scale-64mewg-node6:/export1 failed, reason given by server: No such file or directory


Regards,
Amarnath

Comment 5 Manisha Saini 2023-10-31 05:25:05 UTC
(In reply to Frank Filz from comment #3)
> OK, I fixed this and a patch has been pushed upstream and downstream.
> 
> You will still need the orchestration changes to consolidate cephfs clients,
> but the current Ganesha should run on configurations that were set up the
> old way.

Hi Frank,

Can this be now moved to ON_QA?

Comment 6 Frank Filz 2023-10-31 14:29:19 UTC
(In reply to Amarnath from comment #4)
> Hi Frank,
> 
> We are observing the following behaviour
> 
> Steps Performed
> 1. Created nfs cluster on 6.0 build (17.2.6-148.el9cp) 
> 2. Created export and mounted it.
> 3. Upgraded the setup to 7.0 (18.2.0-104.el9cp)
> 4. Tried to access the mount point
> 
> ls /mnt/nfs on 10.0.205.187 timeout 600
> 2023-10-30 13:57:39,004
> (cephci.cephfs_upgrade.cephfs_post_upgrade_validation) [ERROR] -
> cephci.ceph.ceph.py:1593 - Error 2 during cmd, timeout 600
> 2023-10-30 13:57:39,005
> (cephci.cephfs_upgrade.cephfs_post_upgrade_validation) [ERROR] -
> cephci.ceph.ceph.py:1594 - ls: cannot access '/mnt/nfs': Stale file handle
> 
> 5. unmounted and tried mounting it again seeing same issue as above 
> [root@ceph-amk-scale-64mewg-node8 ~]# umount /mnt/nfs/
> [root@ceph-amk-scale-64mewg-node8 ~]# mount -t nfs -o port=2049
> ceph-amk-scale-64mewg-node6:/export1 /mnt/nfs/
> Created symlink /run/systemd/system/remote-fs.target.wants/rpc-statd.service
> → /usr/lib/systemd/system/rpc-statd.service.
> mount.nfs: mounting ceph-amk-scale-64mewg-node6:/export1 failed, reason
> given by server: No such file or directory
> 
> 
> Regards,
> Amarnath

Until the fix I posted yesterday gets in, old clusters won't work with the new Ganesha. Once that fix is in, they should work.

But NOTE: we aren't providing a migration path between old exports and new, so if you have old exports (that don't have cmount_path specified), each export will have its own cephfs client (and attendant memory consumption). The migration path is to remove old exports and create new ones.
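
A sketch of that migration path using the export from this report (other exports would need their own pseudo paths and options):

# ceph nfs export rm nfsganesha /ganesha1
# ceph nfs export create cephfs nfsganesha /ganesha1 cephfs --path=/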

Comment 7 tserlin 2023-10-31 14:58:41 UTC
(In reply to Frank Filz from comment #6)
> 
> Until the fix I posted yesterday gets in, old clusters won't work with the
> new Ganesha. Once that fix is in, they should work.
> 

The nfs-ganesha fixes from yesterday are now in downstream's nfs-ganesha-5.6-3.el9cp, and included in the container, rhceph-container-7-113.

https://gitlab.cee.redhat.com/ceph/nfs-ganesha/-/commits/ceph-7.0-rhel-patches :

* CEPH: Fix up cmount_path
* CEPH: Currently client_oc true is broken, force it to false
* V5.6 tag

Thomas
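
For reference, which image the NFS daemons are actually running can be confirmed from the orchestrator (sketch; the output columns include the version and image ID):

# ceph orch ps --daemon-type nfs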

Comment 8 Frank Filz 2023-11-14 00:40:06 UTC
I think this is ready to verify?

Comment 14 Manisha Saini 2023-11-16 06:04:24 UTC
Verified this with 

# rpm -qa | grep nfs
libnfsidmap-2.5.4-20.el9.x86_64
nfs-utils-2.5.4-20.el9.x86_64
nfs-ganesha-selinux-5.6-4.el9cp.noarch
nfs-ganesha-5.6-4.el9cp.x86_64
nfs-ganesha-rgw-5.6-4.el9cp.x86_64
nfs-ganesha-ceph-5.6-4.el9cp.x86_64
nfs-ganesha-rados-grace-5.6-4.el9cp.x86_64
nfs-ganesha-rados-urls-5.6-4.el9cp.x86_64


Export creation and mount are successful. Moving this BZ to Verified state.

Comment 16 Frank Filz 2023-11-21 18:49:43 UTC
This was a bug with the cmount_path addition for https://bugzilla.redhat.com/show_bug.cgi?id=2239769. I'll add a doc text there, but this doesn't need a separate doc text.

Comment 17 Frank Filz 2023-11-21 18:59:34 UTC
Correct BZ for cmount_path is https://bugzilla.redhat.com/show_bug.cgi?id=2236325

Comment 18 errata-xmlrpc 2023-12-13 15:24:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7780

