Description of problem:

Modify the Ceph FSAL to mount the subtree for each export instead of the entire filesystem, and to mount using a ceph auth ID whose MDS caps are restricted to the subtree path, rather than needing to use the auth ID 'admin', which has no MDS path cap restrictions.
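For context, a path-restricted auth ID of the sort described here can be created with something along these lines (the user name, path, and pool are illustrative placeholders; the caps syntax matches the keyrings shown later in this bug):

$ ceph auth get-or-create client.someuser \
      mds 'allow rw path=/volumes/sub/dir' \
      mon 'allow r' \
      osd 'allow rw pool=cephfs_data'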
Looks reasonable. I've taken a brief look and see what we'll have to rework in ganesha to make this happen. I think we have to change the lookup_path prototype to deal with a path prefix (representing the path up to the root of the export) and any path relative to that namespace root.

OTOH, if we never look up anything but the root of the export, then maybe we don't need lookup_path to be that flexible, and can just teach ceph to ignore the path and always just look up the root of the export.
(In reply to Jeff Layton from comment #1)
> Looks reasonable. I've taken a brief look and see what we'll have to rework
> in ganesha to make this happen. I think we have to change the lookup_path
> prototype to deal with a path prefix (representing the path up to the root
> of the export) and any path relative to that namespace root.
>
> OTOH, if we never look up anything but the root of the export, then maybe we
> don't need lookup_path to be that flexible, and can just teach ceph to
> ignore the path and always just look up the root of the export.

lookup_path is used to set up an export, and it is also used for NFSv3 mounts of a sub-directory of the export.
I'll take this for now since I'm working on a patch. The first step, I think, is to mount the subtree using the same creds we use now. Once we have that, we can look at adding a new ganesha config option to make it use different creds.
Gerrit review request is here:

https://review.gerrithub.io/#/c/305345/

The patch turned out to be a little simpler than expected, and seems to work just fine with everything that I've pointed at it so far. The next step is to add a way to give each export a different cephx user. What I'm not clear on is how to manipulate cephx creds from the libcephfs API. Maybe there is some ceph_conf_set option that we can use?
> The next step is to add a way to give each export a different cephx user.
> What I'm not clear on is how to manipulate cephx creds from the libcephfs
> api. Maybe there is some ceph_conf_set option that we can use?

Why would the libcephfs API need to manipulate the cephx creds? I was thinking that a per-export FSAL_CEPH user option would supply the ceph auth ID (already created with path-restricted MDS caps) to be used for that export. So FSAL_CEPH would pass the auth ID as an argument to `ceph_create`?
Ahh, that's what I was missing. Yes, passing a user string argument to ceph_create is what we'd want. I'll look at what's needed to add a new config option to the FSAL CEPH section in the ganesha configs.
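To make the intent concrete, here is a minimal sketch of that flow against the libcephfs API. This is not the FSAL code itself; the auth ID and path are examples taken from later in this thread, and the secret key is assumed to come from a keyring found via the usual config files:

#include <stdio.h>
#include <cephfs/libcephfs.h>

int main(void)
{
        struct ceph_mount_info *cmount;
        int rc;

        /* "alice" gets appended onto "client.", yielding cephx user client.alice */
        rc = ceph_create(&cmount, "alice");
        if (rc)
                return 1;

        /* pick up mon addresses, keyring, etc. from the default config paths */
        ceph_conf_read_file(cmount, NULL);

        /* mount only the subtree, not the root of the filesystem */
        rc = ceph_mount(cmount, "/volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97");
        if (rc) {
                fprintf(stderr, "ceph_mount failed: %d\n", rc);
                ceph_release(cmount);
                return 1;
        }

        ceph_unmount(cmount);
        ceph_release(cmount);
        return 0;
}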
Ok! I think I have something that may work for you in my ganesha ceph-submount branch. See:

https://github.com/jtlayton/nfs-ganesha/tree/ceph-submount

Ramana, if you have the time, could you try that out? This branch should build against kraken (with libcephfs2) or jewel (with libcephfs1), though you may need to drop the specfile patch in that pile if you're building against packages that have a libcephfs1-devel package.

You should be able to set the export FSAL user_id and secret_access_key options in the config file, or using dbus. If you provide a user_id but no secret key, then it should try to find the key for that user in the usual keyring files.
Note that this branch builds, but is otherwise untested!
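For reference, the new options would sit in the export's FSAL block, something like this (values here are placeholders; full working configs appear in the comments below):

EXPORT
{
        Export_ID = 100;
        Path = /some/subtree;
        Pseudo = /some/subtree;
        Protocols = 4;
        FSAL {
                Name = CEPH;
                # cephx auth ID, without the "client." prefix
                User_Id = "someuser";
                # optional; if omitted, ganesha looks in the usual keyring files
                Secret_Access_Key = "<base64 cephx key>";
        }
}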
Ramana said that he tried the patches, but that they weren't working as expected.

Ramana, can you maybe detail how you're testing this? Specifically, how are the new users and keys being created, and what values are you providing in the new configuration options?
Jeff, sorry about the delay.

Testing steps
-------------

1. Installed the jewel version of Ceph in a VM.

$ ceph --version
ceph version 10.2.4-3-gc461ee1 (c461ee19ecbc0c5c330aca20f7392c9a00730367)

2. Set up an all-in-one Ceph backend (1 MON, 1 MDS, 1 OSD) in the VM.

$ sudo ceph -s
    cluster 0df98128-b7e6-434c-96c7-5f729623d1b6
     health HEALTH_OK
     monmap e1: 1 mons at {osboxes=10.0.2.15:6789/0}
            election epoch 8, quorum 0 osboxes
      fsmap e35: 1/1/1 up {0=a=up:active}
     osdmap e30: 1 osds: 1 up, 1 in
            flags sortbitwise,require_jewel_osds
      pgmap v3436: 80 pgs, 3 pools, 984 kB data, 24 objects
            160 MB used, 8021 MB / 8182 MB avail
                  80 active+clean

Ceph config file:

$ cat /etc/ceph/ceph.conf
[global]
fsid = 0df98128-b7e6-434c-96c7-5f729623d1b6
mon_initial_members = osboxes
mon_host = 10.0.2.15
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
filestore_xattr_use_omap = true
osd crush chooseleaf type = 0
osd journal size = 100
osd pool default size = 1
rbd default features = 1

[client.manila]
client mount uid = 0
client mount gid = 0

3. Installed NFS-Ganesha from source (using your Ganesha dev branch) in the same VM.

$ cd ~/git/nfs-ganesha/ && cat .git/refs/heads/ceph-submount
059024237ca2968d4d1670d14551b2befdbe06d7

4. Allowed Ganesha to mount a CephFS subtree using:

* user ID 'manila' and its secret key passed as FSAL options
* user ID 'manila' passed as a FSAL option, with the keyring file in the default location

Both seemed to work. Note that the 'manila' user *did not* have any path-restricted MDS caps.

$ sudo ceph auth get client.manila
exported keyring for client.manila
[client.manila]
        key = AQBaUylYMrj9EBAA33JPniv+QhVjeXnETocGCw==
        caps mds = "allow *"
        caps mon = "allow *"
        caps osd = "allow rw"

$ cat /etc/ganesha/ganesha.conf
EXPORT
{
        Export_ID=100;
        Path = /volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97;
        Pseudo = /volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97;
        Protocols = 4;
        Transports = TCP;
        FSAL {
                Name = CEPH;
                User_Id = "manila";
                Secret_Access_Key = "AQBaUylYMrj9EBAA33JPniv+QhVjeXnETocGCw==";
        }
        CLIENT {
                Clients = 10.0.2.13;
                Access_Type = RO;
        }
        CLIENT {
                Clients = 10.0.2.15;
                Access_Type = RW;
        }
}

$ sudo ganesha.nfsd -f /etc/ganesha/ganesha.conf -L /tmp/ganesha-user-manila-secret-key.log -N NIV_DEBUG

$ sudo ceph daemon mds.a session ls
[
    {
        "id": 54109,
        "num_leases": 0,
        "num_caps": 4,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.54109 10.0.2.15:0\/2337762318",
        "client_metadata": {
            "ceph_sha1": "c461ee19ecbc0c5c330aca20f7392c9a00730367",
            "ceph_version": "ceph version 10.2.4-3-gc461ee1 (c461ee19ecbc0c5c330aca20f7392c9a00730367)",
            "entity_id": "manila",
            "hostname": "osboxes",
            "root": "\/"
        }
    }
]

A question here: I was expecting the value of "root" in the "client_metadata" section to be the subtree path, since Ganesha mounts only the subtree?

5. I wasn't able to get Ganesha to mount a subtree using a user ID with path-restricted MDS caps.
$ sudo ceph auth get client.alice
exported keyring for client.alice
[client.alice]
        key = AQCj1i5YOO/gOhAADVpTk6XTXNoERTpvPyUfqQ==
        caps mds = "allow rw path=/volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97"
        caps mon = "allow r"
        caps osd = "allow rw pool=cephfs_data namespace=fsvolumens_b86384b2-c52c-4607-bcaf-ff294acf1b97"

$ cat /etc/ganesha/ganesha.conf
EXPORT
{
        Export_ID=100;
        Path = /volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97;
        Pseudo = /volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97;
        Protocols = 4;
        Transports = TCP;
        FSAL {
                Name = CEPH;
                User_Id = "alice";
                Secret_Access_Key = "AQCj1i5YOO/gOhAADVpTk6XTXNoERTpvPyUfqQ==";
        }
        CLIENT {
                Clients = 10.0.2.13;
                Access_Type = RO;
        }
        CLIENT {
                Clients = 10.0.2.15;
                Access_Type = RW;
        }
}

I observed the following errors in ganesha.log after starting the Ganesha server:

create_export :FSAL :CRIT :Unable to mount Ceph cluster for /volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97.
mdcache_fsal_create_export :FSAL :MAJ :Failed to call create_export on underlying FSAL Ceph
fsal_put :FSAL :INFO :FSAL Ceph now unused
l_cfg_commit :CONFIG :CRIT :Could not create export for (/volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97) to (/volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97)
build_default_root :CONFIG :DEBUG :Allocating Pseudo root export
pseudofs_create_export :FSAL :DEBUG :Created exp 0x10809c0 - /
build_default_root :CONFIG :INFO :Export 0 (/) successfully created
main :NFS STARTUP :WARN :No export entries found in configuration file !!!
config_errs_to_log :CONFIG :CRIT :Config File (/etc/ganesha/ganesha.conf:13): 1 validation errors in block FSAL
config_errs_to_log :CONFIG :CRIT :Config File (/etc/ganesha/ganesha.conf:13): Errors processing block (FSAL)
config_errs_to_log :CONFIG :CRIT :Config File (/etc/ganesha/ganesha.conf:1): 1 validation errors in block EXPORT
config_errs_to_log :CONFIG :CRIT :Config File (/etc/ganesha/ganesha.conf:1): Errors processing block (EXPORT)

Line 13 in ganesha.conf is the CEPH FSAL sub-block. I guess the ceph_mount call using 'alice' here,

https://github.com/jtlayton/nfs-ganesha/commit/077a2d26b1863a1f42ef54e14446d423af6c016a#diff-548844599138ebf79915b9d8e9e59031R215

hits EPERM?

However, I was able to mount the subtree using the same auth ID 'alice' via ceph-fuse.

$ sudo ceph-fuse /mnt/fuse/ --id=alice --client-mountpoint=/volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97

$ sudo ceph daemon mds.a session ls
[
    {
        "id": 54120,
        "num_leases": 0,
        "num_caps": 1,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 0,
        "reconnecting": false,
        "inst": "client.54120 10.0.2.15:0\/2847125433",
        "client_metadata": {
            "ceph_sha1": "c461ee19ecbc0c5c330aca20f7392c9a00730367",
            "ceph_version": "ceph version 10.2.4-3-gc461ee1 (c461ee19ecbc0c5c330aca20f7392c9a00730367)",
            "entity_id": "alice",
            "hostname": "osboxes",
            "mount_point": "\/mnt\/fuse",
            "root": "\/volumes\/_nogroup\/b86384b2-c52c-4607-bcaf-ff294acf1b97"
        }
    }
]
Yeah, that "session ls" output does look odd. "alice" made it to the "entity_id", but AFAICT "id" is set to the default, and I suspect that that is what gets used.

libcephfs.h says:

/**
 * Create a mount handle for interacting with Ceph. All libcephfs
 * functions operate on a mount info handle.
 *
 * @param cmount the mount info handle to initialize
 * @param id the id of the client. This can be a unique id that identifies
 *           this client, and will get appended onto "client.". Callers can
 *           pass in NULL, and the id will be the process id of the client.
 * @returns 0 on success, negative error code on failure
 */
int ceph_create(struct ceph_mount_info **cmount, const char * const id);

...which sure makes it sound like that would be the "username" for the cephx creds.

Maybe I'll roll up a ceph regression test for this as well...
Actually... Ramana, can you try one other thing? Assuming that the keyring file is in a standard location, does it work if you comment out the Secret_Access_Key field, which should let ganesha scrape the key out of the keyring? Maybe we're just misunderstanding how that field needs to be set...
I hit the same error when I didn't pass the Secret_Access_Key FSAL option for user 'alice' (with path-restricted MDS caps), and 'alice's keyring file was in the standard location.

I'd be surprised if it's an issue with setting the Secret_Access_Key field. Export creation succeeded when I set the Secret_Access_Key field for user 'manila' (with *no* path-restricted MDS caps), and it failed when I intentionally set an incorrect Secret_Access_Key for 'manila'. In both cases, the 'manila' keyring file was not in the default location.
Got it... thanks. I do seem to recall some discussion about submounts during the kraken development cycle, which led me to believe that they might not work correctly in jewel. That said, I think I'm going to have to roll up a test program to understand how the path-restricted caps should work...
Ok, the problem turns out to be a ceph bug:

http://tracker.ceph.com/issues/18254

I have a patch to fix this in ceph, but it'll take a while to trickle out to the stable releases. In the meantime, I've updated my ganesha branch with a workaround for this bug. The workaround is pretty harmless, so I'm inclined to merge it into ganesha and just live with it in perpetuity.

Ramana, can you try my latest ceph-submount ganesha branch and let me know whether it works?
Setting the `client_mountpoint` conf option seems to have done the trick:

https://github.com/jtlayton/nfs-ganesha/commit/1f76a9ec739ad20190f2e1b469e1464c7cb26cc6#diff-548844599138ebf79915b9d8e9e59031R261

Yes, Ganesha is now able to mount the CephFS subtree using an auth ID (and secret key) whose MDS caps are restricted to accessing only that subtree.

1. User 'alice' has MDS caps restricted to the subtree /volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97:

$ sudo ceph auth get client.alice
exported keyring for client.alice
[client.alice]
        key = AQCj1i5YOO/gOhAADVpTk6XTXNoERTpvPyUfqQ==
        caps mds = "allow rw path=/volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97"
        caps mon = "allow r"
        caps osd = "allow rw pool=cephfs_data namespace=fsvolumens_b86384b2-c52c-4607-bcaf-ff294acf1b97"

2. Set ganesha.conf to export the subtree, which FSAL_CEPH mounts using ceph user 'alice':

$ cat /etc/ganesha/ganesha.conf
EXPORT
{
        Export_ID=100;
        Path = /volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97;
        Pseudo = /volumes/_nogroup/b86384b2-c52c-4607-bcaf-ff294acf1b97;
        Protocols = 4;
        Transports = TCP;
        FSAL {
                Name = CEPH;
                User_Id = "alice";
                Secret_Access_Key = "AQCj1i5YOO/gOhAADVpTk6XTXNoERTpvPyUfqQ==";
        }
        CLIENT {
                Clients = 10.0.2.15;
                Access_Type = RW;
        }
}

3. Starting the ganesha server created the export, and listing the Ceph MDS's client sessions shows that the subtree was mounted as user 'alice':

$ sudo ceph daemon mds.a session ls
[
    {
        "id": 64103,
        "num_leases": 0,
        "num_caps": 4,
        "state": "open",
        "replay_requests": 0,
        "completed_requests": 2,
        "reconnecting": false,
        "inst": "client.64103 10.0.2.15:0\/2287299420",
        "client_metadata": {
            "ceph_sha1": "c461ee19ecbc0c5c330aca20f7392c9a00730367",
            "ceph_version": "ceph version 10.2.4-3-gc461ee1 (c461ee19ecbc0c5c330aca20f7392c9a00730367)",
            "entity_id": "alice",
            "hostname": "osboxes",
            "root": "\/volumes\/_nogroup\/b86384b2-c52c-4607-bcaf-ff294acf1b97"
        }
    }
]
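For anyone following along, the workaround amounts to something along these lines in the FSAL's export setup (a sketch of the idea rather than the exact ganesha code; see the commit linked above; cmount, rc, and the export path variable are assumed to be in scope):

        /* work around http://tracker.ceph.com/issues/18254: pin the client's
         * root to the export path before mounting */
        ceph_conf_set(cmount, "client_mountpoint", export_path);
        rc = ceph_mount(cmount, export_path);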
Great! I've got a PR up with the fix for ceph, but I think keeping the workaround in ganesha for a while is a reasonable thing to do as well.

Now that we have a PoC that works, we should discuss whether this design makes sense. The ceph docs sort of indicate that managing keys directly is a bad idea and that we should use keyring files. I'm not 100% convinced there, but we could replace the key option with a global CEPH section config option to point ganesha at a keyring file instead. Would that be simpler for the manila integration work?

IOW, I'd like your opinion on where you think the keys should ideally be stored.
> Great! I've got a PR up with the fix for ceph, but I think keeping the
> workaround in ganesha for a while is a reasonable thing to do as well.

Makes sense.

> Now that we have a PoC that works, we should discuss whether this design
> makes sense. The ceph docs sort of indicate that managing keys directly is
> a bad idea and that we should use keyring files.

Yeah, we need to check whether this is a bad idea for our use case. FSAL_RGW allows passing a secret_access_key,

https://github.com/nfs-ganesha/nfs-ganesha/commit/674f265c#diff-1902eb57168c2420d3657a4cd7052e37R19

so it might be OK for FSAL_CEPH to do so too?

> I'm not 100% convinced there, but we could replace the key option with a
> global CEPH section config option to point ganesha at a keyring file
> instead. Would that be simpler for the manila integration work?

My thoughts on the Manila/Ganesha/CephFS integration: when a Manila user requests IP access to a CephFS subdir:

* Manila's Ganesha driver, using `ceph_volume_client.py`, would create a cephx user (if it does not already exist) with path-restricted MDS caps, and fetch the user's secret key. This is the per-share cephx user and secret key that FSAL_CEPH would use to mount the Ceph subtree.

* The Ganesha driver would then construct an export block like the following to allow IP access:

EXPORT
{
        Export_ID=100;
        Path = $cephfs_subdir_path;
        Pseudo = $cephfs_subdir_path;
        Protocols = 4;
        Transports = TCP;
        FSAL {
                Name = CEPH;
                User_Id = "<manila_share_uuid>";
                Secret_Access_Key = "<secret-key>";
        }
        CLIENT {
                Clients = <ip>;
                Access_Type = RW;
        }
}

The export block would be written to a file on disk for persistence across Ganesha server restarts, and the export would be added dynamically via DBUS (see the example after this comment).

* For subsequent IP access rule changes on the share, the Ganesha driver would manipulate only the CLIENT sub-blocks of the export block, write the changes back to the export block file, and then dynamically update the export via DBUS.

Hope this description helps.
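As an illustration of that dynamic-add step, the DBus call would look something like this (the file path and export selector here are placeholders, and the exact interface may vary across ganesha versions):

$ dbus-send --system --print-reply --dest=org.ganesha.nfsd \
      /org/ganesha/nfsd/ExportMgr org.ganesha.nfsd.exportmgr.AddExport \
      string:/etc/ganesha/export.d/share-100.conf \
      string:'EXPORT(Export_ID=100)'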
I'll leave that up to you... I don't know much about the environment where this thing runs, though. AIUI, cephx keyring files are a lot like krb5 keytabs. The main reason for using keyring files would be that you could probably keep permissions on a keyring file locked down a little more tightly than on the ganesha config file.

Managing the keys in the ganesha config file should be ok, but you'll need to keep read permissions on ganesha.conf locked down to protect the keys (depending on who has access to the host, of course).
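For example, a per-user keyring dropped into the default search path and locked down might look like this (names illustrative; the file format matches the `ceph auth get` output earlier in this bug):

$ cat /etc/ceph/ceph.client.alice.keyring
[client.alice]
        key = AQCj1i5YOO/gOhAADVpTk6XTXNoERTpvPyUfqQ==

$ sudo chown root:root /etc/ceph/ceph.client.alice.keyring
$ sudo chmod 600 /etc/ceph/ceph.client.alice.keyring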
Frank has merged the patchset into the ganesha next branch, so it should be good to go in v2.5.