Created attachment 1960969 [details]
must gather logs

rbd_default_map_options is not set to ms_mode=secure in an in-transit encryption enabled ODF 4.13 cluster, when in-transit encryption is enabled after deployment of the ODF 4.13 cluster.

Version of all relevant components (if applicable):
full_version: 4.13.0-179

[root@rdr-cicd-odf-60c6-bastion-0 4.13]# oc version
Client Version: 4.13.0-rc.2
Kustomize Version: v4.5.7
Server Version: 4.13.0-0.nightly-ppc64le-2023-04-28-143059
Kubernetes Version: v1.26.3+b404935
[root@rdr-cicd-odf-60c6-bastion-0 4.13]#

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
No

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Can this issue be reproduced?
Yes

Can this issue be reproduced from the UI?
No

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deploy ODF 4.13 via the UI (without enabling in-transit encryption).
2. Enable in-transit encryption from the CLI:
   oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/network/connections/encryption", "value": {"enabled": true} }]'
3. Check the Ceph config:
   a. oc patch OCSInitialization ocsinit -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'
   b. oc rsh -n openshift-storage rook-ceph-tools-7c5884d455-lzgnp
   c.
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 basic     ms_client_mode                         secure                              *
global                                                 basic     ms_cluster_mode                        secure                              *
global                                                 basic     ms_service_mode                        secure                              *
global                                                 advanced  rbd_default_map_options                ms_mode=prefer-crc                  *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$

Actual results:
rbd_default_map_options  ms_mode=prefer-crc

Expected results:
rbd_default_map_options  ms_mode=secure

Additional info:
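To spot-check only the messenger-mode settings without scanning the whole dump, the output can be filtered with grep. A minimal sketch (the here-doc stands in for live `ceph config dump` output so the snippet runs standalone; against a real cluster you would pipe the command itself into the same grep):

```shell
# Sample lines in the shape of `ceph config dump` output; on a live
# cluster run instead:
#   ceph config dump | grep -E 'ms_(client|cluster|service)_mode|rbd_default_map_options'
cat <<'EOF' | grep -E 'ms_(client|cluster|service)_mode|rbd_default_map_options'
global  basic     ms_client_mode           secure              *
global  basic     ms_cluster_mode          secure              *
global  basic     ms_service_mode          secure              *
global  advanced  rbd_default_map_options  ms_mode=prefer-crc  *
global  advanced  mon_allow_pool_delete    true
EOF
```

With in-transit encryption enabled, all four matched lines should show `secure`; `ms_mode=prefer-crc` on the rbd_default_map_options line is the symptom reported here.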
How soon did you check the ms_mode after updating the cluster to enable encryption? It can take a couple of minutes for the Rook operator reconcile to apply the new ms_mode with encryption enabled.

In the attached must gather, the Rook operator log shows that the setting was applied successfully:

2023-04-29T10:27:19.516475943Z 2023-04-29 10:27:19.516450 I | op-config: applying ceph settings:
2023-04-29T10:27:19.516475943Z [global]
2023-04-29T10:27:19.516475943Z ms_cluster_mode = secure
2023-04-29T10:27:19.516475943Z ms_service_mode = secure
2023-04-29T10:27:19.516475943Z ms_client_mode = secure
2023-04-29T10:27:19.516475943Z rbd_default_map_options = ms_mode=secure
2023-04-29T10:27:20.467038857Z 2023-04-29 10:27:20.466961 I | op-config: successfully applied settings to the mon configuration database

This appears to be the expected behavior. If you confirm there is not another issue, we can close the bug.
I did check the Ceph config dump again, and I still don't see ms_mode=secure:

[root@rdr-cicd-odf-60c6-bastion-0 ~]# oc rsh -n openshift-storage rook-ceph-tools-7c5884d455-lzgnp
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 basic     ms_client_mode                         secure                              *
global                                                 basic     ms_cluster_mode                        secure                              *
global                                                 basic     ms_service_mode                        secure                              *
global                                                 advanced  rbd_default_map_options                ms_mode=prefer-crc                  *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$

But if I enable in-transit encryption from the UI during storage cluster creation, the value is set to ms_mode=secure.
>> How soon did you check the ms_mode after updating the cluster to enable encryption?

By the way, the dump above was taken a full day after enabling encryption.
Could I connect to this cluster if it's still up? The operator log shows that value was applied successfully, so I don't understand why the ceph config dump is not showing the option.
The assimilate-conf is silently failing to set the ms_mode. If I run the command manually in the toolbox:

sh-5.1$ ceph config assimilate-conf -i settings.txt -o out.txt
sh-5.1$ echo $?
0
sh-5.1$ cat out.txt
[global]
rbd_default_map_options = ms_mode=secure

The out.txt shows that the setting was rejected (assimilate-conf writes any options it could not store in the mon configuration database to the output file), even though the command's return value indicated success. Still investigating why this option fails to be set...
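The leftover file that assimilate-conf writes is plain INI, so a reconcile loop could detect this silent failure programmatically instead of trusting the exit code. A hypothetical sketch using Python's standard configparser (the file contents are taken from the out.txt above; the check itself is an illustration, not code from the operator):

```python
import configparser

# Contents of out.txt as reported above. assimilate-conf writes any
# options it could NOT store in the mon config database to this file,
# so a non-empty leftover section means something was rejected.
leftover_text = """
[global]
rbd_default_map_options = ms_mode=secure
"""

parser = configparser.ConfigParser()
parser.read_string(leftover_text)

# If the option appears in the leftover file, it was silently rejected
# even though the command exited 0.
rejected = parser.has_option("global", "rbd_default_map_options")
print(rejected)  # True -> the setting was not applied
```

This is the crux of the bug report: the exit status alone (`echo $?` returning 0) is not sufficient to conclude the settings were applied.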
Madhu, could you take a look at how rbd_default_map_options needs to be applied? Details of the failure are in the previous comment.

This command succeeds:

ceph config set client.admin rbd_default_map_options ms_mode=secure

However, this doesn't seem correct if we want all clients to be in secure mode, not just the admin client.

I am not able to reproduce this issue in minikube with the latest upstream Rook and Ceph 17.2.6. Perhaps there was an issue with assimilate-conf in the Ceph release being used? For now, perhaps we need to set this option separately from assimilate-conf.

I'm proposing this as a 4.13 blocker since without it we don't have encryption on the wire.
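For reference, the intent (per the operator log earlier in this bug) is for the option to land in the global section of the mon configuration database, so that every client inherits it, rather than being scoped to client.admin. The conf fragment the operator assimilates is:

```ini
[global]
rbd_default_map_options = ms_mode=secure
```

Whether this is applied via assimilate-conf or via a separate `ceph config set global ...` call is the implementation question raised above; the fragment itself only records the target state.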
When can we expect a build with the fix?
Verified on build 4.13.0-186, working as expected.

[root@rdr-cicd-odf-3396-syd05-bastion-0 4.13]# oc -n openshift-storage rsh rook-ceph-tools-76cf7cf54c-pwdrd
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 advanced  rbd_default_map_options                ms_mode=prefer-crc                  *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$ exit
exit
[root@rdr-cicd-odf-3396-syd05-bastion-0 4.13]# oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/network/connections/encryption", "value": {"enabled": true} }]'
storagecluster.ocs.openshift.io/ocs-storagecluster patched
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc -n openshift-storage rsh rook-ceph-tools-76cf7cf54c-pwdrd
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 basic     ms_client_mode                         secure                              *
global                                                 basic     ms_cluster_mode                        secure                              *
global                                                 basic     ms_service_mode                        secure                              *
global                                                 advanced  rbd_default_map_options                ms_mode=secure                      *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$ exit
exit
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/network/connections/encryption", "value": {"enabled": false} }]'
storagecluster.ocs.openshift.io/ocs-storagecluster patched
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc get storagecluster -n openshift-storage
NAME                 AGE     PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   3h14m   Ready              2023-05-10T11:27:14Z   4.13.0
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc -n openshift-storage rsh rook-ceph-tools-76cf7cf54c-pwdrd
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 advanced  rbd_default_map_options                ms_mode=prefer-crc                  *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$ exit
exit
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc get csv mcg-operator.v4.13.0-186.stable -n openshift-storage -o yaml | grep full_version
  full_version: 4.13.0-186
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc version
Client Version: 4.13.0-0.nightly-ppc64le-2023-05-09-232622
Kustomize Version: v4.5.7
Server Version: 4.13.0-0.nightly-ppc64le-2023-05-09-232622
Kubernetes Version: v1.26.3+b404935
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:3742