Bug 2048832

Summary: conf_url_rados: verify reload of exports when notified by Ceph's object watch-notify mechanism
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Ram Raja <rraja>
Component: NFS-Ganesha
Assignee: Venky Shankar <vshankar>
Status: NEW
QA Contact: Vidushi Mishra <vimishra>
Severity: high
Priority: medium
Version: 5.1
CC: aramteke, ffilz, fpantano, gfarnum, gfidente, gouthamr, hyelloji, kkeithle, mbenjamin, vereddy
Target Milestone: ---
Keywords: CodeChange
Target Release: Backlog
Hardware: Unspecified   
OS: Unspecified   
Type: Bug
Bug Blocks: 1961115, 2160010    

Description Ram Raja 2022-01-31 23:05:32 UTC
Description of problem:

By setting watch_url in the RADOS_URLS config block, we can have nfs-ganesha watch a Ceph RADOS object. When an application (e.g., Ceph's mgr/nfs module) wants ganesha to dynamically reload its exports, it notifies the watched RADOS object, which makes nfs-ganesha send itself a SIGHUP. The watch callback handler, 'rados_url_watchcb', sends an ack back to the notifier before sending the SIGHUP that reloads ganesha's exports [1]. This means that the application isn't aware of the result of the config reload. This is unlike the DBus Add/Remove/Update Export interfaces, which send a success or error message back to the application.
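
To illustrate the ordering problem described above, here is a toy model in Python. It is only a sketch: `reload_exports` and `watch_callback` are hypothetical stand-ins and not nfs-ganesha or librados APIs. It shows that the ack goes out with an empty payload before the reload result is known.

```python
# Toy model only: none of these names are nfs-ganesha or librados APIs.
def reload_exports(config_ok):
    """Hypothetical stand-in for the SIGHUP-triggered export reload."""
    return "reloaded" if config_ok else "reload failed"


def watch_callback(config_ok, log):
    # Step 1: acknowledge the notification with an empty payload.
    ack = b""
    log.append("ack sent")
    # Step 2: only now trigger the reload; its result stays local.
    log.append(reload_exports(config_ok))
    return ack


log = []
ack = watch_callback(config_ok=False, log=log)
print(ack, log)
```

In this model the notifier receives the same empty ack whether the reload succeeded or failed, which is the gap this bug is about.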

Proposed solution:

Jeff Layton suggested that the watch callback handler could directly call reread_config() and send back errors. This would mean that the watch callback blocks until the reload completes. The change won't be trivial, as possible deadlocks will need to be avoided: the Ceph context in which the watch callback runs and the librados mutexes held in it will need to be considered.
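
A minimal Python sketch of the proposed flow, under stated assumptions: the reload runs inside the callback, and its result travels back in the ack payload. The JSON payload format, `reread_config_stub`, and the validation rule are all hypothetical and chosen only for illustration; the real change would be in ganesha's C code.

```python
import json


def reread_config_stub(exports):
    """Hypothetical stand-in for reread_config(); returns (code, message)."""
    bad = [e for e in exports if "path" not in e]
    if bad:
        return 22, f"{len(bad)} export(s) missing 'path'"
    return 0, "ok"


def watch_callback(exports):
    # Reload first, blocking until the result is known...
    code, msg = reread_config_stub(exports)
    # ...then acknowledge, carrying the result in the ack payload.
    return json.dumps({"code": code, "msg": msg}).encode()


def notifier(exports):
    # The notifying application decodes the ack to learn the outcome.
    reply = json.loads(watch_callback(exports))
    return reply["code"], reply["msg"]


good = notifier([{"export_id": 1, "path": "/vol"}])
bad = notifier([{"export_id": 2}])
```

Here the notifier can tell a failed reload from a successful one, at the cost of the callback blocking for the duration of the reload.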

[1] https://github.com/nfs-ganesha/nfs-ganesha/commit/b75f5f84fc#diff-bd5efdcc946acd67af5f6149d67c6d651a6f4abb6cbb8823fba167e39fca6c32R364

Comment 1 RHEL Program Management 2022-01-31 23:05:39 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 2 Ram Raja 2022-01-31 23:19:47 UTC
Use case:

OpenStack Manila's CephFS NFS driver wants to use Ceph's object watch/notify mechanism to dynamically add/remove/update NFS exports. The driver currently uses the DBus add/remove/update export signals, which send back the result of the exports being reloaded. So the Manila driver expects Ceph's object watch/notify mechanism to similarly return the result of ganesha export reloads.

Comment 4 Frank Filz 2022-01-31 23:52:43 UTC
That could certainly be done and would not be a huge effort, other than resolving any deadlock issues. Hopefully any locks could be dropped before invoking the Ganesha export reload, since that would be a side effect of the database changes.
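
The idea of dropping locks before invoking the reload can be sketched with a toy Python model. This assumes, purely for illustration, that the callback runs under a non-reentrant lock that the reload also needs. `client_lock` and both callbacks are hypothetical, and whether librados actually holds such a lock in the watch callback is exactly what would need checking.

```python
import threading

# Hypothetical stand-in for a non-reentrant lock held around the callback.
client_lock = threading.Lock()


def reload_exports():
    # A reload that itself needs the lock (e.g. to read the config object).
    with client_lock:
        return "reloaded"


def callback_holding_lock():
    """Probe whether a reload could take the lock while the callback holds it."""
    with client_lock:
        # Non-blocking probe, so the example reports the problem
        # instead of actually deadlocking.
        free = client_lock.acquire(blocking=False)
        if free:
            client_lock.release()
        return free


def callback_dropping_lock():
    with client_lock:
        request = "reload"  # copy what is needed while locked
    # The lock is released here, so the reload can take it again safely.
    return reload_exports() if request == "reload" else None


can_reload_while_locked = callback_holding_lock()
result = callback_dropping_lock()
```

The first callback shows why calling the reload with the lock held would hang, while the second completes because the lock is released before the reload starts.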