Bug 2343514 - [NFS-Ganesha] NFS-Ganesha crashes at nfs_rpc_valid_MNT after enabling QoS configurations.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: NFS-Ganesha
Version: 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 8.0z3
Assignee: Deeraj Patil
QA Contact: Manisha Saini
Docs Contact: Rivka Pollack
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2025-02-03 12:13 UTC by Manisha Saini
Modified: 2025-04-07 15:26 UTC
CC List: 7 users

Fixed In Version: nfs-ganesha-6.5-5.el9cp
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2025-04-07 15:26:20 UTC
Embargoed:


Attachments: None


Links
System                   ID              Last Updated
Red Hat Issue Tracker    RHCEPH-10547    2025-02-03 12:13:56 UTC
Red Hat Product Errata   RHSA-2025:3635  2025-04-07 15:26:23 UTC

Description Manisha Saini 2025-02-03 12:13:22 UTC
Description of problem:
===================

NFS-Ganesha crashes at nfs_rpc_valid_MNT when QoS configurations are enabled using ceph-mgr commands at both the global and export levels.

================
warning: Unable to find libthread_db matching inferior's thread library, thread debugging will not be available.
Core was generated by `/usr/bin/ganesha.nfsd -F -L STDERR -N NIV_EVENT'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fd93facf225 in nfs_rpc_valid_MNT (req=0x55dc2fbacf00) at /usr/src/debug/nfs-ganesha-6.5-1.2.el9cp.x86_64/src/MainNFSD/nfs_worker_thread.c:1778
1778					reqdata->funcdesc =
[Current thread is 1 (LWP 98)]
(gdb) bt
#0  0x00007fd93facf225 in nfs_rpc_valid_MNT (req=0x55dc2fbacf00) at /usr/src/debug/nfs-ganesha-6.5-1.2.el9cp.x86_64/src/MainNFSD/nfs_worker_thread.c:1778
#1  nfs_rpc_valid_MNT (req=0x55dc2fbacf00) at /usr/src/debug/nfs-ganesha-6.5-1.2.el9cp.x86_64/src/MainNFSD/nfs_worker_thread.c:1751
#2  0x00007fd8e9ffa354 in ?? ()
#3  0x00007fd93fc4d460 in PSEUDOFS () from /lib64/libganesha_nfsd.so.6.5
#4  0x00007fd93fbc4973 in gssd_get_single_krb5_cred (context=0x7fd93facf180 <nfs_rpc_valid_NLM+272>, kt=<optimized out>, ple=0x7fd93004fe50, nocache=0) at /usr/src/debug/nfs-ganesha-6.5-1.2.el9cp.x86_64/src/RPCAL/gss_credcache.c:280
#5  0x0000000000000000 in ?? ()
====================
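If any Ganesha frames in a backtrace like the one above fail to resolve (showing only "?? ()"), the matching debug symbols can be installed before re-running gdb. A sketch, assuming the debuginfo repositories are enabled in the container:

# dnf debuginfo-install nfs-ganesha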


[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]# ceph orch ps | grep nfs.nfs
nfs.nfsganesha.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh  ceph-testbuild-nfs-lv4bo1-node2            *:2049            error            9m ago   3h        -        -  <unknown>                                  <unknown>     <unknown>
[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]#



Version-Release number of selected component (if applicable):
====================

# ceph --version
ceph version 19.2.0-72.0.TEST.ganeshafeatures001.el9cp (57b8fbc43786fe66ed799cc72caba9d999b846b4) squid (stable)

# rpm -qa | grep nfs
libnfsidmap-2.5.4-27.el9.x86_64
nfs-utils-2.5.4-27.el9.x86_64
nfs-ganesha-selinux-6.5-1.4.el9cp.noarch
nfs-ganesha-6.5-1.4.el9cp.x86_64
nfs-ganesha-ceph-6.5-1.4.el9cp.x86_64
nfs-ganesha-rados-grace-6.5-1.4.el9cp.x86_64
nfs-ganesha-rados-urls-6.5-1.4.el9cp.x86_64
nfs-ganesha-rgw-6.5-1.4.el9cp.x86_64
nfs-ganesha-utils-6.5-1.4.el9cp.x86_64
[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]#


How reproducible:
===========
1/1


Steps to Reproduce:
=================

1. Deploy the NFS-Ganesha cluster

[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]# ceph nfs cluster create nfsganesha "ceph-testbuild-nfs-lv4bo1-node2"


2. Create the NFS export

[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]#  ceph nfs export create cephfs nfsganesha /ganeshavol1 cephfs --path=/volumes/ganeshagroup/ganesha1/620237c6-08c9-4e67-a352-b3a576054df7
{
  "bind": "/ganeshavol1",
  "cluster": "nfsganesha",
  "fs": "cephfs",
  "mode": "RW",
  "path": "/volumes/ganeshagroup/ganesha1/620237c6-08c9-4e67-a352-b3a576054df7"
}

3. Enable QoS at the global (cluster) level

[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]# ceph nfs cluster qos enable bandwidth_control nfsganesha PerShare --max_export_write_bw 10MB --max_export_read_bw 20MB
[
  "QOS bandwidth control has been successfully enabled. If the qos_type is changed during this process, ensure that the bandwidth values for all exports are updated accordingly."
]
[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]#  ceph nfs cluster qos get nfsganesha
{
  "combined_rw_bw_control": false,
  "enable_bw_control": true,
  "enable_qos": true,
  "max_export_read_bw": "20.0MB",
  "max_export_write_bw": "10.0MB",
  "qos_type": "PerShare"
}

4. Mount the export on a client via NFS v4.2 and "touch" a file (a sketch of the mount command follows)
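The mount command itself was not captured in this report; a typical invocation would be as follows (a sketch: the server host is taken from the ceph orch ps output above, the client mount point /mnt/ganesha from step 6, and the file name is arbitrary):

# mount -t nfs -o vers=4.2 ceph-testbuild-nfs-lv4bo1-node2:/ganeshavol1 /mnt/ganesha
# touch /mnt/ganesha/testfile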

5. Enable the QoS configuration at the export level

[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]# ceph nfs export qos enable bandwidth_control nfsganesha /ganeshavol1 --max_export_write_bw=1000MB --max_export_read_bw=20MB

[ceph: root@ceph-testbuild-nfs-lv4bo1-node1-installer /]# ceph nfs export info nfsganesha /ganeshavol1
{
  "access_type": "RW",
  "clients": [],
  "cluster_id": "nfsganesha",
  "export_id": 1,
  "fsal": {
    "cmount_path": "/",
    "fs_name": "cephfs",
    "name": "CEPH",
    "user_id": "nfs.nfsganesha.cephfs.2c1043d4"
  },
  "path": "/volumes/ganeshagroup/ganesha1/620237c6-08c9-4e67-a352-b3a576054df7",
  "protocols": [
    3,
    4
  ],
  "pseudo": "/ganeshavol1",
  "qos_block": {
    "combined_rw_bw_control": false,
    "enable_bw_control": true,
    "enable_qos": true,
    "max_export_read_bw": "20.0MB",
    "max_export_write_bw": "1.0GB"
  },
  "security_label": true,
  "squash": "none",
  "transports": [
    "TCP"
  ]
}

6. Create a file on the mount point using the dd command

[root@ceph-testbuild-nfs-lv4bo1-node7 ganesha]# dd if=/dev/urandom of=/mnt/ganesha/file1 bs=1M count=1000

Actual results:
=============
Ganesha crashed and dumped cores

[root@ceph-testbuild-nfs-lv4bo1-node2 coredump]# ls
'core.ganesha\x2enfsd.0.eb1f05eade244b029abdf21edd512294.40413.1738582418000000.zst'
'core.ganesha\x2enfsd.0.eb1f05eade244b029abdf21edd512294.96108.1738582431000000.zst'
'core.ganesha\x2enfsd.0.eb1f05eade244b029abdf21edd512294.96265.1738582445000000.zst'
'core.ganesha\x2enfsd.0.eb1f05eade244b029abdf21edd512294.96409.1738582458000000.zst'
'core.ganesha\x2enfsd.0.eb1f05eade244b029abdf21edd512294.96567.1738582472000000.zst'
'core.ganesha\x2enfsd.0.eb1f05eade244b029abdf21edd512294.96800.1738582485000000.zst'
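
The cores are zstd-compressed; to inspect one, decompress it and open it against the Ganesha binary (a sketch, using the newest core from the listing above; assumes zstd and gdb are available where the core is examined):

# zstd -d 'core.ganesha\x2enfsd.0.eb1f05eade244b029abdf21edd512294.96800.1738582485000000.zst'
# gdb /usr/bin/ganesha.nfsd 'core.ganesha\x2enfsd.0.eb1f05eade244b029abdf21edd512294.96800.1738582485000000'
(gdb) bt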


Expected results:
===========
Ganesha should not crash

Additional info:
=============

Feb 03 11:34:44 ceph-testbuild-nfs-lv4bo1-node2 ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb-nfs-nfsganesha-0-0-ceph-testbuild-nfs-lv4bo1-node2-iyipwh[96796]: 03/02/2025 11:34:44 : epoch 67a0a9d4 : ceph-testbuild-nfs-lv4bo1-node2 : ganesha.nfsd-2[main] nfs_start :NFS STARTUP :EVENT :             NFS SERVER INITIALIZED
Feb 03 11:34:44 ceph-testbuild-nfs-lv4bo1-node2 ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb-nfs-nfsganesha-0-0-ceph-testbuild-nfs-lv4bo1-node2-iyipwh[96796]: 03/02/2025 11:34:44 : epoch 67a0a9d4 : ceph-testbuild-nfs-lv4bo1-node2 : ganesha.nfsd-2[main] nfs_start :NFS STARTUP :EVENT :-------------------------------------------------
Feb 03 11:34:44 ceph-testbuild-nfs-lv4bo1-node2 ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb-nfs-nfsganesha-0-0-ceph-testbuild-nfs-lv4bo1-node2-iyipwh[96796]: 03/02/2025 11:34:44 : epoch 67a0a9d4 : ceph-testbuild-nfs-lv4bo1-node2 : ganesha.nfsd-2[dbus] gsh_dbus_thread :DBUS :CRIT :DBUS not initialized, service thread exiting
Feb 03 11:34:44 ceph-testbuild-nfs-lv4bo1-node2 ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb-nfs-nfsganesha-0-0-ceph-testbuild-nfs-lv4bo1-node2-iyipwh[96796]: 03/02/2025 11:34:44 : epoch 67a0a9d4 : ceph-testbuild-nfs-lv4bo1-node2 : ganesha.nfsd-2[dbus] gsh_dbus_thread :DBUS :EVENT :shutdown
Feb 03 11:34:46 ceph-testbuild-nfs-lv4bo1-node2 systemd-coredump[96862]: Process 96800 (ganesha.nfsd) of user 0 dumped core.

                                                                         Stack trace of thread 61:
                                                                         #0  0x00007efc9756f225 n/a (/usr/lib64/libganesha_nfsd.so.6.5 + 0x65225)
                                                                         #1  0x00000000808a0708 n/a (n/a + 0x0)
                                                                         ELF object binary architecture: AMD x86-64
Feb 03 11:34:46 ceph-testbuild-nfs-lv4bo1-node2 podman[96867]: 2025-02-03 11:34:46.907799733 +0000 UTC m=+0.039335039 container died 302809e0b8d973237162b4e748a26fc55f6be13784c702d2c9ef9c11f12d1825 (image=cp.stg.icr.io/cp/ibm-ceph/ceph-8-rhel9@sha256:7319ad4bbd4030570523d58a0656dbe33ebf9a298dd10f72fb9db26325893531, name=ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb-nfs-nfsganesha-0-0-ceph-testbuild-nfs-lv4bo1-node2-iyipwh, name=ibm-ceph, description=IBM Storage Ceph 8, GIT_REPO=https://github.com/ceph/ceph-container.git, maintainer=Guillaume Abrioux <gabrioux>, vcs-type=git, url=https://access.redhat.com/containers/#/registry.access.redhat.com/ibm-ceph/images/8-73.0.TEST.ganeshafeatures001, io.openshift.expose-services=, GIT_CLEAN=True, com.redhat.component=ibm-ceph-container, io.buildah.version=1.33.8, io.k8s.description=IBM Storage Ceph 8, distribution-scope=public, RELEASE=main, ceph=True, GIT_COMMIT=eadbe5f6c4471e17c1721f9f08dde7964a4f491b, build-date=2025-01-31T23:05:28, CEPH_POINT_RELEASE=, io.k8s.display-name=IBM Storage Ceph 8, vcs-ref=8dc014514b5df6095811d1ad01a9d2c98e222a0e, vendor=Red Hat, Inc., io.openshift.tags=ibm ceph, version=8, GIT_BRANCH=main, com.redhat.license_terms=https://www.redhat.com/agreements, architecture=x86_64, summary=Provides the latest IBM Storage Ceph 8 in a fully featured and supported base image., release=73.0.TEST.ganeshafeatures001)
Feb 03 11:34:46 ceph-testbuild-nfs-lv4bo1-node2 podman[96867]: 2025-02-03 11:34:46.955119515 +0000 UTC m=+0.086654823 container remove 302809e0b8d973237162b4e748a26fc55f6be13784c702d2c9ef9c11f12d1825 (image=cp.stg.icr.io/cp/ibm-ceph/ceph-8-rhel9@sha256:7319ad4bbd4030570523d58a0656dbe33ebf9a298dd10f72fb9db26325893531, name=ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb-nfs-nfsganesha-0-0-ceph-testbuild-nfs-lv4bo1-node2-iyipwh, GIT_REPO=https://github.com/ceph/ceph-container.git, architecture=x86_64, release=73.0.TEST.ganeshafeatures001, CEPH_POINT_RELEASE=, com.redhat.component=ibm-ceph-container, io.buildah.version=1.33.8, io.openshift.expose-services=, vcs-ref=8dc014514b5df6095811d1ad01a9d2c98e222a0e, build-date=2025-01-31T23:05:28, version=8, GIT_CLEAN=True, vendor=Red Hat, Inc., GIT_BRANCH=main, vcs-type=git, GIT_COMMIT=eadbe5f6c4471e17c1721f9f08dde7964a4f491b, com.redhat.license_terms=https://www.redhat.com/agreements, io.k8s.description=IBM Storage Ceph 8, url=https://access.redhat.com/containers/#/registry.access.redhat.com/ibm-ceph/images/8-73.0.TEST.ganeshafeatures001, RELEASE=main, summary=Provides the latest IBM Storage Ceph 8 in a fully featured and supported base image., distribution-scope=public, io.k8s.display-name=IBM Storage Ceph 8, name=ibm-ceph, description=IBM Storage Ceph 8, maintainer=Guillaume Abrioux <gabrioux>, io.openshift.tags=ibm ceph, ceph=True)
Feb 03 11:34:46 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh.service: Main process exited, code=exited, status=139/n/a
Feb 03 11:34:47 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh.service: Failed with result 'exit-code'.
Feb 03 11:34:47 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh.service: Consumed 1.437s CPU time.
Feb 03 11:34:57 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh.service: Scheduled restart job, restart counter is at 6.
Feb 03 11:34:57 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: Stopped Ceph nfs.nfsganesha.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh for d4a5434a-e20b-11ef-8c59-fa163ea56efb.
Feb 03 11:34:57 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh.service: Consumed 1.437s CPU time.
Feb 03 11:34:57 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh.service: Start request repeated too quickly.
Feb 03 11:34:57 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: ceph-d4a5434a-e20b-11ef-8c59-fa163ea56efb.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh.service: Failed with result 'exit-code'.
Feb 03 11:34:57 ceph-testbuild-nfs-lv4bo1-node2 systemd[1]: Failed to start Ceph nfs.nfsganesha.0.0.ceph-testbuild-nfs-lv4bo1-node2.iyipwh for d4a5434a-e20b-11ef-8c59-fa163ea56efb.
[root@ceph-testbuild-nfs-lv4bo1-node2 coredump]#

Comment 11 errata-xmlrpc 2025-04-07 15:26:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2025:3635

