Bug 2362289 - [NFS-Ganesha] Ganesha process crashed at __memcpy_evex_unaligned_erms while running the pynfs test suite vers=4.1
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 8.1
Assignee: Venky Shankar
QA Contact: Manisha Saini
URL:
Whiteboard:
: 2362861 (view as bug list)
Depends On: 2363635
Blocks: 2367464 2359508
Reported: 2025-04-25 11:22 UTC by Manisha Saini
Modified: 2025-06-26 12:31 UTC (History)
CC List: 13 users

Fixed In Version: nfs-ganesha-6.5-13.el9cp; rhceph-container-8-424
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2025-06-26 12:31:05 UTC
Embargoed:
khiremat: needinfo-


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHCEPH-11261    0        None      None    None     2025-04-25 11:23:38 UTC
Red Hat Product Errata  RHSA-2025:9775  0        None      None    None     2025-06-26 12:31:17 UTC

Description Manisha Saini 2025-04-25 11:22:57 UTC
Description of problem:
======

The NFS-Ganesha server crashed and dumped core while running the pynfs sanity test suite (vers=4.1).

--------------------------------
Apr 25 11:09:10 ceph-nfsclusterlive-qrsq9e-node2 ceph-d55360cc-20dd-11f0-9cff-fa163eb24a5f-nfs-cephfs-nfs-0-0-ceph-nfsclusterlive-qrsq9e-node2-xqulsc[654968]: 25/04/2025 11:09:10 : epoch 680b6d4f : ceph-nfsclusterlive-qrsq9e-node2 : ganesha.nfsd-2[reaper] nfs_try_lift_grace :STATE :EVENT :check grace:reclaim complete(0) clid count(2)
Apr 25 11:09:10 ceph-nfsclusterlive-qrsq9e-node2 ceph-d55360cc-20dd-11f0-9cff-fa163eb24a5f-nfs-cephfs-nfs-0-0-ceph-nfsclusterlive-qrsq9e-node2-xqulsc[654968]: 25/04/2025 11:09:10 : epoch 680b6d4f : ceph-nfsclusterlive-qrsq9e-node2 : ganesha.nfsd-2[dbus] gsh_dbus_thread :DBUS :CRIT :DBUS not initialized, service thread exiting
Apr 25 11:09:10 ceph-nfsclusterlive-qrsq9e-node2 ceph-d55360cc-20dd-11f0-9cff-fa163eb24a5f-nfs-cephfs-nfs-0-0-ceph-nfsclusterlive-qrsq9e-node2-xqulsc[654968]: 25/04/2025 11:09:10 : epoch 680b6d4f : ceph-nfsclusterlive-qrsq9e-node2 : ganesha.nfsd-2[dbus] gsh_dbus_thread :DBUS :EVENT :shutdown
Apr 25 11:09:11 ceph-nfsclusterlive-qrsq9e-node2 systemd-coredump[655049]: Process 654972 (ganesha.nfsd) of user 0 dumped core.

                                                                           Stack trace of thread 68:
                                                                           #0  0x00007fbafd59c225 __memcpy_evex_unaligned_erms (libc.so.6 + 0x16a225)
                                                                           #1  0x00007fbafb028e84 n/a (/usr/lib64/ceph/libceph-common.so.2 + 0x49de84)
                                                                           ELF object binary architecture: AMD x86-64
Apr 25 11:09:12 ceph-nfsclusterlive-qrsq9e-node2 podman[655054]: 2025-04-25 11:09:12.038942001 +0000 UTC m=+0.023106154 container died d7cb2166ab33e85929416687165d75c8fed4bd0e61b3fcf442d50d9bfbfc3c31 (image=registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf0181a0987e49ad38becd1062baf79417c00825becae810e78806e23ee825a2, name=ceph-d55360cc-20dd-11f0-9cff-fa163eb24a5f-nfs-cephfs-nfs-0-0-ceph-nfsclusterlive-qrsq9e-node2-xqulsc, CEPH_POINT_RELEASE=, vcs-type=git, GIT_REPO=https://github.com/ceph/ceph-container.git, io.k8s.description=Red Hat Ceph Storage 8, GIT_BRANCH=main, ceph=True, io.openshift.tags=rhceph ceph, distribution-scope=public, url=https://access.redhat.com/containers/#/registry.access.redhat.com/rhceph/images/8-409, release=409, summary=Provides the latest Red Hat Ceph Storage 8 on RHEL 9 in a fully featured and supported base image., version=8, GIT_CLEAN=True, description=Red Hat Ceph Storage 8, io.buildah.version=1.33.12, io.k8s.display-name=Red Hat Ceph Storage 8 on RHEL 9, vendor=Red Hat, Inc., architecture=x86_64, GIT_COMMIT=55ad0f204a1d654ee565abf874aecad0cc209d0e, io.openshift.expose-services=, vcs-ref=2e5d3c0e666903d2aac3aa37c8f72bef249f276f, RELEASE=main, build-date=2025-04-24T03:39:50, name=rhceph, com.redhat.component=rhceph-container, com.redhat.license_terms=https://www.redhat.com/agreements, maintainer=Guillaume Abrioux <gabrioux>)
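
The two frames in the stack trace above carry a module path and an offset, which is what a symbolizer needs once matching debuginfo is available. A minimal sketch of pulling those fields out of the systemd-coredump text follows. The regular expression and the `parse_frames` helper are illustrative assumptions based only on the two frame lines shown in this report, not part of any existing tool.

```python
import re

# Matches frame lines in the form printed in this report, for example:
#   #1  0x00007fbafb028e84 n/a (/usr/lib64/ceph/libceph-common.so.2 + 0x49de84)
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<addr>[0-9a-f]+)\s+(?P<symbol>\S+)\s+"
    r"\((?P<module>.+?)\s\+\s0x(?P<offset>[0-9a-f]+)\)"
)


def parse_frames(text):
    """Return one dict per stack frame found in text (hypothetical helper)."""
    frames = []
    for match in FRAME_RE.finditer(text):
        frames.append(
            {
                "index": int(match.group("index")),
                "addr": int(match.group("addr"), 16),
                "symbol": match.group("symbol"),
                "module": match.group("module"),
                "offset": int(match.group("offset"), 16),
            }
        )
    return frames


# Frame lines copied from the journal output in this report.
TRACE = """
#0  0x00007fbafd59c225 __memcpy_evex_unaligned_erms (libc.so.6 + 0x16a225)
#1  0x00007fbafb028e84 n/a (/usr/lib64/ceph/libceph-common.so.2 + 0x49de84)
"""

for frame in parse_frames(TRACE):
    print(f"#{frame['index']} {frame['module']} + {frame['offset']:#x} ({frame['symbol']})")
```

Frame #1 has no symbol name (`n/a`), so the module and offset pair is the only handle on the caller inside libceph-common until the core is opened with debug symbols installed.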


Version-Release number of selected component (if applicable):
==================

# rpm -qa | grep nfs
libnfsidmap-2.5.4-27.el9.x86_64
nfs-utils-2.5.4-27.el9.x86_64
nfs-ganesha-selinux-6.5-10.el9cp.noarch
nfs-ganesha-6.5-10.el9cp.x86_64
nfs-ganesha-rgw-6.5-10.el9cp.x86_64
nfs-ganesha-ceph-6.5-10.el9cp.x86_64
nfs-ganesha-rados-grace-6.5-10.el9cp.x86_64
nfs-ganesha-rados-urls-6.5-10.el9cp.x86_64
nfs-ganesha-utils-6.5-10.el9cp.x86_64

# ceph --version
ceph version 19.2.1-159.el9cp (99c759851ebe12f2e6a118b424029bb14a6efc5b) squid (stable)


How reproducible:
===============
2/2


Steps to Reproduce:
=================
1. Deploy the Ceph cluster.
2. Create an NFS-Ganesha cluster.

# ceph nfs cluster info cephfs-nfs
{
  "cephfs-nfs": {
    "backend": [
      {
        "hostname": "ceph-nfsclusterlive-qrsq9e-node2",
        "ip": "10.0.66.199",
        "port": 2049
      }
    ],
    "virtual_ip": null
  }
}

3. Create an NFS export and mount it on the client.

 Execution of mount -t nfs -o vers=4.1,port=2049 ceph-nfsclusterlive-qrsq9e-node2:/export_1 /mnt/nfs on 10.0.67.59 took 1.003478 seconds.

4. Run the pynfs test suite.

2025-04-25 07:06:46,018 - cephci - ceph:1570 - INFO - Execute python3 -m pip install ply;cd /mnt/nfs;git clone git://git.linux-nfs.org/projects/bfields/pynfs.git;cd pynfs;-- yes |python setup.py build;cd nfs4.1;./testserver.py ceph-nfsclusterlive-qrsq9e-node2:/export_0 -v --outfile ~/pynfs.run --maketree --showomit --rundep all on 10.0.66.72
2025-04-25 07:17:35,039 - cephci - ceph:1621 - ERROR - python3 -m pip install ply;cd /mnt/nfs;git clone git://git.linux-nfs.org/projects/bfields/pynfs.git;cd pynfs;-- yes |python setup.py build;cd nfs4.1;./testserver.py ceph-nfsclusterlive-qrsq9e-node2:/export_0 -v --outfile ~/pynfs.run --maketree --showomit --rundep all failed to execute within 600 seconds.
2025-04-25 07:17:35,275 - cephci - nfs_verify_pynfs:64 - ERROR - Failed to run pynfs on ceph-nfsclusterlive-qrsq9e-node4, Error:
2025-04-25 07:17:35,277 - cephci - ceph:1570 - INFO - Execute ls -Art /var/lib/systemd/coredump | tail -n 1 on 10.0.66.199
2025-04-25 07:17:36,281 - cephci - ceph:1600 - INFO - Execution of ls -Art /var/lib/systemd/coredump | tail -n 1 on 10.0.66.199 took 1.003007 seconds.
2025-04-25 07:17:36,282 - cephci - ceph:1570 - INFO - Execute stat -c '%w' /var/lib/systemd/coredump/core.ganesha\x2enfsd.0.1b1df88562f14760aabcef6c94857406.654972.1745579350000000.zst
 on 10.0.66.199
2025-04-25 07:17:37,286 - cephci - ceph:1600 - INFO - Execution of stat -c '%w' /var/lib/systemd/coredump/core.ganesha\x2enfsd.0.1b1df88562f14760aabcef6c94857406.654972.1745579350000000.zst
 on 10.0.66.199 took 1.003291 seconds.
2025-04-25 07:17:37,287 - cephci - run:854 - ERROR - stat -c '%w' /var/lib/systemd/coredump/core.ganesha\x2enfsd.0.1b1df88562f14760aabcef6c94857406.654972.1745579350000000.zst
 returned stat: cannot statx '/var/lib/systemd/coredump/core.ganeshax2enfsd.0.1b1df88562f14760aabcef6c94857406.654972.1745579350000000.zst': No such file or directory
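
The `stat` failure above looks like a harness artifact rather than a missing core file. The command was logged with `ganesha\x2enfsd`, but the error reports `ganeshax2enfsd`, which is consistent with an unquoted backslash being consumed by the shell. The path is also followed by a line break before "on 10.0.66.199", which is consistent with the `ls` output being used without stripping its trailing newline. A minimal sketch of handling both, assuming the command string is assembled in Python, is below. `build_stat_command` and `parse_core_name` are hypothetical helpers, and the core file name layout (comm, UID, boot ID, PID, timestamp in microseconds) is an assumption that happens to match the PID and crash time in the journal lines of this report.

```python
import shlex
from datetime import datetime, timezone

COREDUMP_DIR = "/var/lib/systemd/coredump"

# File name copied from the cephci log in this report (raw string keeps the backslash).
CORE_NAME = (
    r"core.ganesha\x2enfsd.0.1b1df88562f14760aabcef6c94857406"
    r".654972.1745579350000000.zst"
)


def build_stat_command(ls_output):
    """Build a shell-safe stat command from raw ls output (hypothetical helper)."""
    name = ls_output.strip()  # drop the trailing newline left by ls | tail
    return "stat -c '%w' " + shlex.quote(f"{COREDUMP_DIR}/{name}")


def parse_core_name(name):
    """Split a core file name into fields, assuming the layout
    core.<comm>.<uid>.<boot id>.<pid>.<timestamp in microseconds>.zst"""
    _prefix, comm, uid, boot_id, pid, usec, _ext = name.split(".")
    return {
        "comm": comm.replace("\\x2e", "."),  # undo the escaping of '.' in the name
        "uid": int(uid),
        "boot_id": boot_id,
        "pid": int(pid),
        "time": datetime.fromtimestamp(int(usec) // 1_000_000, tz=timezone.utc),
    }


print(build_stat_command(CORE_NAME + "\n"))
info = parse_core_name(CORE_NAME)
print(info["comm"], info["pid"], info["time"].isoformat())
```

With the sample name, the parsed PID is 654972 and the timestamp decodes to 2025-04-25 11:09:10 UTC, which lines up with the systemd-coredump entry for the crashed ganesha.nfsd process.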

Actual results:
=======
The NFS-Ganesha server crashed and dumped core.


Expected results:
======
The pynfs suite should pass, and no crashes should be observed.


Additional info:

Comment 25 errata-xmlrpc 2025-06-26 12:31:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 8.1 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2025:9775

