Bug 2119426

Summary: [RFE][External mode] ODF CephFS External Mode Multi-tenancy on RHCS 4
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: iwatson
Component: rook
Assignee: Parth Arora <paarora>
Status: CLOSED WONTFIX
QA Contact: Neha Berry <nberry>
Severity: medium
Priority: unspecified
Version: 4.10
CC: brgardne, mrajanna, ocs-bugs, odf-bz-bot, paarora, tnielsen
Target Milestone: ---
Target Release: ---
Keywords: FutureFeature
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2023-02-07 15:08:19 UTC

Description iwatson 2022-08-18 13:56:44 UTC
This is a follow on discussion from https://bugzilla.redhat.com/show_bug.cgi?id=1898988

We are using ODF in external mode. 

We appreciate that the problem described here is solved by 1898988 on RHCS 5 through the use of multiple file systems per cluster. In addition, 1898988 also solves the issue for block storage on both RHCS 4 and RHCS 5.

However, on an RHCS 4 cluster there is still an issue with tenant isolation for CephFS, as there is only one file system per cluster.

Given that some RHCS 4 clusters deployed via OpenStack cannot be upgraded to RHCS 5 until there is a valid OpenStack upgrade path, we are looking for a tactical solution that will enable multiple OpenShift tenants to be isolated when using RHCS 4.
The previous option presented in 1898988 was to make use of subvolume groups. 

Is this a viable option? 

I see that there is a '--subvolume-group' option that I'm told is implemented upstream but not downstream.
The subvolumes could then be migrated onto separate file systems in RHCS 5.

Comment 6 Parth Arora 2022-08-22 12:00:27 UTC
Implementation would be straightforward; we would just need to update the MDS caps of csi-cephfs-node and csi-cephfs-provisioner too:

"mds": "allow rw path=/volumes/cephFilesystemSubVolumeGroupName".

But the concern lies with `The subvolumes could then be migrated onto separate file systems in RHCS 5.`: the upgrade would require a migration, and I am not sure whether that is or will be supported.
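
A minimal sketch of what that cap change could look like on the external cluster, reusing the group name from the example above. The mon/mgr/osd grants shown are assumptions about the users' existing caps and are only restated because `ceph auth caps` replaces every cap for a user:

# Hypothetical sketch: scope both CSI CephFS users' MDS access to one subvolume group.
ceph auth caps client.csi-cephfs-provisioner \
  mon 'allow r' mgr 'allow rw' \
  osd 'allow rw tag cephfs metadata=*' \
  mds 'allow rw path=/volumes/cephFilesystemSubVolumeGroupName'
ceph auth caps client.csi-cephfs-node \
  mon 'allow r' mgr 'allow rw' \
  osd 'allow rw tag cephfs *=*' \
  mds 'allow rw path=/volumes/cephFilesystemSubVolumeGroupName'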

Comment 7 iwatson 2022-09-06 09:28:21 UTC
After speaking to Parth we have a potential workaround and a feature request.

The workaround is to create a subvolume group on the external Ceph cluster per OpenShift cluster, then create a new CephFS StorageClass, outside of the control of ODF, with the Ceph subvolume group ID specified as the CephClusterID. When creating the user that ODF will use, we need to limit the capabilities of that user to the subvolume group.

The capabilities can be limited to "mds": "allow rw path=/volumes/cephFilesystemSubVolumeGroupName"

This workaround is currently untested but will be tested soon.
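
An untested sketch of the Ceph side of this workaround, using placeholder names ("ocp-tenant-a" for the subvolume group, client.tenant-a-cephfs for the user) and assuming the existing file system is called "cephfs"; the mon/mgr/osd grants are illustrative only:

# One subvolume group per OpenShift cluster on the external RHCS 4 cluster.
ceph fs subvolumegroup create cephfs ocp-tenant-a

# A CSI user whose MDS access is limited to that subvolume group's path.
ceph auth get-or-create client.tenant-a-cephfs \
  mon 'allow r' mgr 'allow rw' \
  osd 'allow rw tag cephfs *=*' \
  mds 'allow rw path=/volumes/ocp-tenant-a'

The out-of-ODF StorageClass would then point its clusterID at a ceph-csi cluster entry (normally carried in the rook-ceph-csi-config ConfigMap in ODF) whose cephFS.subvolumeGroup is set to "ocp-tenant-a", so that provisioned subvolumes land inside that group.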


The title of the request is "When using ODF external mode on OCP on OpenStack, where the external Ceph cluster is RHCS 4, ODF should accept a subvolume group ID to provide OpenShift tenant isolation achieved using RBAC capabilities."

i.e. instead of creating a new StorageClass outside of ODF's control, ODF should support providing the subvolume group ID and restricting the capabilities of the user it creates on the external Ceph cluster to that subvolume group.

Comment 9 iwatson 2022-10-11 14:02:40 UTC
I have largely got this working.

The following steps were performed on the provider cluster:

- create a single CephFS file system
- create a new data pool per cluster and add it to the file system, e.g. ceph fs add_data_pool cephfs cephfs-cluster-pool
- create a CephFS subvolume group, e.g. ceph fs subvolumegroup create cephfs cephfs-cluster --pool_layout=cephfs-cluster-pool
- run the ceph-external-cluster-details-exporter.py script with a few changes.
- the caps for csi-cephfs-provisioner are amended for mds to become "mds allow rw path=/volumes/cephfs-cluster"
- the caps for csi-cephfs-node are amended for mds to become "mds allow rw path=/volumes/cephfs-cluster" and osd set to "allow rw tag cephfs *=*" (the commands are consolidated in the sketch after this list)
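
A consolidated sketch of those provider-side commands, assuming the file system is named "cephfs" and using "cephfs-cluster"/"cephfs-cluster-pool" as per-consumer placeholder names:

# One data pool per consumer cluster, attached to the shared CephFS file system.
ceph osd pool create cephfs-cluster-pool 32
ceph fs add_data_pool cephfs cephfs-cluster-pool

# One subvolume group per consumer cluster, pinned to that data pool.
ceph fs subvolumegroup create cephfs cephfs-cluster --pool_layout=cephfs-cluster-pool

# The CSI users generated by ceph-external-cluster-details-exporter.py are then
# tightened with "ceph auth caps" as in the earlier example, using
# mds 'allow rw path=/volumes/cephfs-cluster' for both users.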

The following steps were then performed on the consumer cluster:
- the OpenShift consumer is set up to install an ODF StorageSystem
- when the StorageSystem is created, a CephFilesystemSubVolumeGroup resource is manually created. This populates a config map entry, containing a MonitoringEndpoint array, under a UID that matches the subvolume group's status object.
- the OpenShift consumer then deletes and recreates the CephFS StorageClass to change the file system name to the UID from the status field. I also change the pool name to match the data pool used by the subvolume group.



I have discussed the requirement/setup with Orit (ODF architect). Orit understood the issue at hand, was happy from an architecture view to support the CephFS subvolume groups, and suggested approaching the PMs to see if it is something they would like to support. Currently my thought is to hold off until we finalize the last piece of the puzzle below:

The last bit I'm working on is the caps. I want to ensure that the caps given for a cluster are the least privileged we can make them. There may be an issue with both csi-cephfs-provisioner and csi-cephfs-node having access to the cephfs_metadata pool, but Orit's view was that this is an internal component of CephFS and not directly exposed to the end user for compromise.

The current issue with the caps is that I want to restrict the user to a certain pool at the OSD level. That way we have MDS caps at the subvolume group level and OSD caps at the pool level, for the best isolation we can get.
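
A sketch of the cap shape being targeted at this point, with the same placeholder names as above; access to the cephfs_metadata pool is still broad here, which is exactly the remaining concern:

# Target so far: MDS caps scoped to the subvolume group, OSD caps scoped to the
# tenant's data pool, with cephfs_metadata still fully readable/writable.
ceph auth caps client.csi-cephfs-node \
  mon 'allow r' mgr 'allow rw' \
  mds 'allow rw path=/volumes/cephfs-cluster' \
  osd 'allow rw pool=cephfs-cluster-pool, allow rw pool=cephfs_metadata'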

Comment 10 Travis Nielsen 2022-10-17 15:14:23 UTC
Moving out of 4.12 unless there is a clear requirement

Comment 11 iwatson 2022-10-26 12:59:26 UTC
There is one outstanding issue related to access control for the cephfs_metadata pool.

In order to provide isolation across the subvolume groups, the subvolumes must be created in a separate RADOS namespace inside the cephfs_metadata pool. A cap can then be provided to limit the user to accessing only this RADOS namespace.

The code reference is here
https://github.com/ceph/ceph-csi/blob/devel/internal/cephfs/core/volume.go#L251

and an option `NamespaceIsolated: true` is needed.

The CephFS StorageClass will then use the volumeNamePrefix parameter to set a prefix such as 'csi-vol-group1'.

The caps for the OSD will then look similar to:

"osd", "allow rw pool=pool1, allow rw pool=cephfs_metadata namespace=csi-vol-group1*"

Comment 23 Travis Nielsen 2023-02-07 15:08:19 UTC
No action is planned at this time; please reopen if further investigation is needed.