Bug 2331781 - cephadm generates nfs-ganesha config with incorrect server_scope value
Summary: cephadm generates nfs-ganesha config with incorrect server_scope value
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Cephadm
Version: 6.0
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 8.1
Assignee: Adam King
QA Contact: Manisha Saini
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-12-11 20:39 UTC by jeff.a.smith
Modified: 2025-06-26 12:20 UTC
CC List: 5 users

Fixed In Version: ceph-19.2.1-33.el9cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2025-06-26 12:20:06 UTC
Embargoed:


Links:
Red Hat Issue Tracker RHCEPH-10342 (last updated 2024-12-11 20:40:30 UTC)
Red Hat Product Errata RHSA-2025:9775 (last updated 2025-06-26 12:20:09 UTC)

Description jeff.a.smith 2024-12-11 20:39:49 UTC
Description of problem: By default, cephadm configures an nfs-ganesha cluster so that each instance embeds its hostname inside Server_Scope.  For NFSv4.1+ clusters, this breaks state recovery after a ganesha instance (and its VIP) moves to another protocol node.  The client reacts to the change in "Server_Scope" by not even attempting to recover the state created by the prior incarnation/epoch of the nfs-ganesha instance that moved to another node.

Instead, cephadm should configure every ganesha instance in the NFS cluster to use the same Server_Scope value by embedding the name of the nfs-ganesha cluster instead of the hostname.
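
For illustration, a minimal sketch of what the generated per-instance config should contain (the cluster name "mynfs" and the exact block layout are assumptions; cephadm's actual template may differ):

    # Identical on every protocol node in the cluster, so NFSv4.1+
    # clients see a stable scope across failover:
    NFSv4 {
        Server_Scope = "mynfs";  # nfs cluster name, not the local hostname
    }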

Version-Release number of selected component (if applicable):


How reproducible: 100% for nfs-ganesha clusters spanning multiple protocol nodes


Steps to Reproduce:
1. With the cephadm CLI, create an nfs-ganesha cluster spanning multiple protocol nodes.
2. Dump the nfs-ganesha config for each ganesha instance in the cluster and check whether the value of Server_Scope is the same.  Currently this value is different for each instance because, by default, cephadm embeds the hostname within the Server_Scope value (see the sketch after this list).
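
A sketch of the reproduction (cluster name "mynfs", hosts "host1"/"host2", and the daemon name are all illustrative; check "cephadm ls" for the real name on each host):

    # step 1: create an nfs-ganesha cluster spanning two protocol nodes
    ceph nfs cluster create mynfs "host1,host2"

    # step 2: on each host, inspect the generated config inside the
    # running ganesha container and compare the Server_Scope values
    cephadm ls | grep nfs.mynfs                      # find the local daemon name
    cephadm enter --name nfs.mynfs.0.0.host1.abcdef  # name is illustrative
    grep -i server_scope /etc/ganesha/ganesha.conf   # run inside the container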

Actual results:
Each ganesha instance in an nfs-ganesha cluster configured by cephadm has a unique value for Server_Scope.


Expected results:
Each ganesha instance in the same cluster must be configured to use the same value for Server_Scope.

Additional info:
These experiments were conducted using Acadia Storage clusters, but Acadia Storage is not required.  This is simply a bug in cephadm's default config value for Server_Scope when creating an nfs-ganesha cluster.

Comment 1 Storage PM bot 2024-12-11 20:40:00 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 4 jeff.a.smith 2024-12-13 19:33:06 UTC
I set severity to medium because the default ganesha config produced by cephadm results in an nfs-ganesha cluster in which each ganesha instance is configured with a unique value for "Server_Scope" (the default behavior embeds the hostname in Server_Scope).  The default nfs-ganesha config cannot support HA NFS.  The impact is that NFSv4.1+ clients of a ganesha instance that is moved to another node will never be able to reclaim/recover their protocol state.

This problem is easily mitigated by setting a Server_Scope value that is the same for all ganesha instances.  We set Server_Scope to the nfs cluster name.
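
A sketch of that mitigation (cluster name "mynfs" is illustrative, and whether a user-supplied block overrides the generated default may depend on the ganesha version):

    # userconf.conf -- the same file is applied cluster-wide
    NFSv4 {
        Server_Scope = "mynfs";   # match the nfs cluster name on every node
    }

    # push it into the cluster's user-defined config:
    ceph nfs cluster config set mynfs -i userconf.conf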

Comment 12 errata-xmlrpc 2025-06-26 12:20:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 8.1 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2025:9775

