Bug 2375725
| Summary: | [8.1z backport] [NFS-Ganesha] After NFS server hosting node reboot, new mount fails with posix2fsal_error and requires nfs service restart for mount to succeed | |||
|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | sumr | |
| Component: | Cephadm | Assignee: | Shweta Bhosale <shbhosal> | |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Manisha Saini <msaini> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 8.1 | CC: | bkunal, cephqe-warriors, kkeithle, msaini, shbhosal, tserlin | |
| Target Milestone: | --- | Keywords: | External | |
| Target Release: | 8.1z2 | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | ceph-19.2.1-232.el9cp | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2385955 (view as bug list) | Environment: | ||
| Last Closed: | 2025-12-03 15:46:00 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 2385955 | |||
| Bug Blocks: | ||||
Description of problem:

If the Ceph node hosting the NFS service is rebooted, a new NFS daemon is automatically spun up on another Ceph node. A remount using the new NFS server IP is then attempted, but the NFS mount fails with the error below in the NFS debug log (the complete log will be copied to the magna002 server):

```
8638cb6 : ceph-sumar-nfs-byok-f9le1w-node5 : ganesha.nfsd-2[svc_12] posix2fsal_error :FSAL :CRIT :Default case mapping Cannot send after transport endpoint shutdown (108) to ERR_FSAL_SERVERFAULT
```

Even an NFS mount of a new export backed by a new subvolume does not succeed. An NFS service restart is required; after the restart, the existing exports can be remounted using the new NFS server IP.

Logs:

1. Run IOs on the existing NFS mount:

```
root 36157 35884 1 07:20 pts/1 00:00:00 python3 /home/cephuser/smallfile/smallfile_cli.py --operation append --threads 10 --file-size 10 --files 100 --top /mnt/cephfs_sv5/smallfile_dir0
```

2. Reboot the Ceph node hosting the NFS server. A new NFS server is spun up on another node:

```
[root@ceph-sumar-nfs-byok-f9le1w-node8 mnt]# ceph orch ps | grep cephfs-nfs
nfs.cephfs-nfs.0.0.ceph-sumar-nfs-byok-f9le1w-node2.ltulkl  ceph-sumar-nfs-byok-f9le1w-node2  *:2049  running (25h)    5m ago   25h  1028M  -  6.5  b8860365707a  095901a79ad8
[root@ceph-sumar-nfs-byok-f9le1w-node8 mnt]# ceph orch ps | grep cephfs-nfs
nfs.cephfs-nfs.0.0.ceph-sumar-nfs-byok-f9le1w-node2.ltulkl  ceph-sumar-nfs-byok-f9le1w-node2  *:2049  host is offline  6m ago   25h  1028M  -  6.5  b8860365707a  095901a79ad8
nfs.cephfs-nfs.0.1.ceph-sumar-nfs-byok-f9le1w-node5.zzqkkt  ceph-sumar-nfs-byok-f9le1w-node5  *:2049  running (28s)    27s ago  28s  14.7M  -  6.5  b8860365707a  b1309d379e7f
```

3.
The existing mountpoint is not accessible; unmount and remount using the new NFS server IP:

```
[root@ceph-sumar-nfs-byok-f9le1w-node8 mnt]# ls -l cephfs_sv5
^C
[root@ceph-sumar-nfs-byok-f9le1w-node8 mnt]# umount -l /mnt/cephfs_sv5
[root@ceph-sumar-nfs-byok-f9le1w-node8 mnt]# mount -t nfs 10.0.66.14:/cephfs_sv5 /mnt/cephfs_sv5
mount.nfs: Remote I/O error
[root@ceph-sumar-nfs-byok-f9le1w-node8 mnt]# ceph nfs export info cephfs-nfs /cephfs_sv5
{
  "access_type": "RW",
  "clients": [],
  "cluster_id": "cephfs-nfs",
  "export_id": 6,
  "fsal": {
    "cmount_path": "/",
    "fs_name": "cephfs",
    "name": "CEPH",
    "user_id": "nfs.cephfs-nfs.cephfs.5d839b2a"
  },
  "kmip_key_id": "KEY-b0b60ad-da046d8e-a406-428f-95f6-68d5ddf38d9f",
  "path": "/volumes/_nogroup/sv5/0aed206b-4fa8-4b6e-ae1f-9fe6d5bf9048",
  "protocols": [3, 4],
  "pseudo": "/cephfs_sv5",
  "security_label": true,
  "squash": "none",
  "transports": ["TCP"]
}
[root@ceph-sumar-nfs-byok-f9le1w-node8 mnt]# ceph nfs cluster info cephfs-nfs
{
  "cephfs-nfs": {
    "backend": [
      {
        "hostname": "ceph-sumar-nfs-byok-f9le1w-node5",
        "ip": "10.0.66.14",
        "port": 2049
      }
    ],
    "virtual_ip": null
  }
}
```

4. Even an NFS mount of a new export backed by a new subvolume does not succeed:
```
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# ceph fs subvolume create cephfs sv6
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# ceph fs subvolume getpath cephfs sv6
/volumes/_nogroup/sv6/bf5e6c4d-cce8-4ce4-be02-83bc4582916d
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# ceph nfs export create cephfs cephfs-nfs /cephfs_sv6 cephfs --path /volumes/_nogroup/sv6/bf5e6c4d-cce8-4ce4-be02-83bc4582916d
{
  "bind": "/cephfs_sv6",
  "cluster": "cephfs-nfs",
  "fs": "cephfs",
  "mode": "RW",
  "path": "/volumes/_nogroup/sv6/bf5e6c4d-cce8-4ce4-be02-83bc4582916d"
}
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# mkdir /mnt/cephfs_sv6
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# mount -t nfs 10.0.66.14:/cephfs_sv6 /mnt/cephfs_sv6
mount.nfs: access denied by server while mounting 10.0.66.14:/cephfs_sv6
```

5. Restart the NFS service; a remount of the existing NFS exports now succeeds:

```
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# ceph orch restart nfs.cephfs-nfs
Scheduled to restart nfs.cephfs-nfs.0.1.ceph-sumar-nfs-byok-f9le1w-node5.zzqkkt on host 'ceph-sumar-nfs-byok-f9le1w-node5'
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# ceph orch ps --refresh | grep cephfs-nfs
nfs.cephfs-nfs.0.1.ceph-sumar-nfs-byok-f9le1w-node5.zzqkkt  ceph-sumar-nfs-byok-f9le1w-node5  *:2049  running (2s)  1s ago  51m  18.3M  -  6.5  b8860365707a  2eb3fdd384ac
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# mount -t nfs 10.0.66.14:/cephfs_sv6 /mnt/cephfs_sv6
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# mount -t nfs 10.0.66.14:/cephfs_sv5 /mnt/cephfs_sv5
[root@ceph-sumar-nfs-byok-f9le1w-node8 4eabc83d-3e41-449a-a03b-77bfab144e1d]# ls -l /mnt/cephfs_sv5
total 12227
-rw-------. 1 root root 12515529 Jul  1 07:00 messages
drwxr-xr-x. 5 root root     4098 Jul  1 07:23 smallfile_dir0
```

Version-Release number of selected component (if applicable): 19.2.1-227.el9cp

How reproducible:

Steps to Reproduce:

Setup: standalone NFS server (non-HA)
1. Create an NFS export with a KMIP key and perform the NFS mount. Run IO.
2. While IO is in progress, reboot the Ceph node hosting the NFS server.
3. A new NFS server is spun up. Remount the NFS export using the new NFS server IP.
   Observation: mount.nfs: Remote I/O error
4. Restart the NFS service and retry the NFS mount.
   Observation: the NFS mount succeeds.

Actual results: NFS remount with the new NFS server IP requires an NFS service restart.

Expected results: NFS remount with the new NFS server IP should succeed without an NFS service restart.

Additional info:
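For reference, the numeric errno in the ganesha log line above can be decoded with the Python standard library. On Linux, errno 108 is ESHUTDOWN ("Cannot send after transport endpoint shutdown"); the log shows posix2fsal_error hitting its default case for this errno and mapping it to ERR_FSAL_SERVERFAULT. This is only a cross-check sketch, not part of the reproducer:

```python
import errno
import os

# Errno seen in the ganesha debug log:
# "... Cannot send after transport endpoint shutdown (108) to ERR_FSAL_SERVERFAULT"
code = 108

# Symbolic name of the errno (on Linux: 'ESHUTDOWN')
print(errno.errorcode[code])

# Human-readable message, as produced by strerror(3)
print(os.strerror(code))
```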