Bug 2192088 - [IBM P] rbd_default_map_options value not set to ms_mode=secure in in-transit encryption enabled ODF cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.13
Hardware: ppc64le
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ODF 4.13.0
Assignee: Travis Nielsen
QA Contact: Neha Berry
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-04-29 11:20 UTC by Sudeesh John
Modified: 2023-08-09 17:03 UTC (History)

Fixed In Version: 4.13.0-184
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-06-21 15:25:28 UTC
Embargoed:
mrajanna: needinfo-


Attachments
must gather logs (8.13 MB, application/gzip)
2023-04-29 11:20 UTC, Sudeesh John


Links
System ID Private Priority Status Summary Last Updated
Github red-hat-storage rook pull 488 0 None open BUG 2192088: unset all the encryption configuration before setting 2023-05-04 07:08:41 UTC
Github rook rook pull 12181 0 None open ceph: unset all the configuration before setting 2023-05-03 11:09:00 UTC
Red Hat Product Errata RHBA-2023:3742 0 None None None 2023-06-21 15:25:44 UTC

Description Sudeesh John 2023-04-29 11:20:01 UTC
Created attachment 1960969 [details]
must gather logs

rbd_default_map_options is not set to ms_mode=secure in an ODF 4.13 cluster with in-transit encryption enabled. In-transit encryption was enabled after the ODF 4.13 cluster was deployed.

Version of all relevant components (if applicable):

full_version: 4.13.0-179

[root@rdr-cicd-odf-60c6-bastion-0 4.13]# oc version
Client Version: 4.13.0-rc.2
Kustomize Version: v4.5.7
Server Version: 4.13.0-0.nightly-ppc64le-2023-04-28-143059
Kubernetes Version: v1.26.3+b404935
[root@rdr-cicd-odf-60c6-bastion-0 4.13]#


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?

No


Is there any workaround available to the best of your knowledge?

No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

1


Is this issue reproducible? Yes


Can this issue reproduce from the UI?

No


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy ODF 4.13 via UI (without enabling in-transit encryption)
2. Enable in-transit encryption from the CLI

oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch  '[{ "op": "replace", "path": "/spec/network/connections/encryption", "value": {"enabled": true} }]'

3. Check Ceph config

a. oc patch OCSInitialization ocsinit -n openshift-storage --type json --patch '[{ "op": "replace", "path": "/spec/enableCephTools", "value": true }]'
b. oc rsh  -n openshift-storage rook-ceph-tools-7c5884d455-lzgnp
c. sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 basic     ms_client_mode                         secure                              *
global                                                 basic     ms_cluster_mode                        secure                              *
global                                                 basic     ms_service_mode                        secure                              *
global                                                 advanced  rbd_default_map_options                ms_mode=prefer-crc                  *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$
sh-5.1$
sh-5.1$


Actual results:
  rbd_default_map_options                ms_mode=prefer-crc

Expected results:
  rbd_default_map_options                ms_mode=secure
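
The actual and expected values above can be compared without reading the whole dump. A minimal sketch, assuming the column layout shown in the `ceph config dump` output in step 3 (no MASK value on the row); `map_options_value` is a hypothetical helper name, not part of Ceph or Rook:

```shell
# Hypothetical helper: print the VALUE column of the global
# rbd_default_map_options row from `ceph config dump` output on stdin.
# Assumes the row has an empty MASK column, so the whitespace-separated
# fields are WHO LEVEL OPTION VALUE [RO].
map_options_value() {
  awk '$1 == "global" && $3 == "rbd_default_map_options" { print $4 }'
}

# Usage inside the toolbox pod:
#   ceph config dump | map_options_value
```

With in-transit encryption enabled, the printed value is expected to be ms_mode=secure.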

Additional info:

Comment 2 Travis Nielsen 2023-05-01 19:26:13 UTC
How soon did you check the ms_mode after updating the cluster to enable encryption? It can take a couple of minutes for the Rook operator's reconcile to apply the new ms_mode with encryption enabled.

In the attached must gather, the rook operator log shows that the setting was applied successfully.

2023-04-29T10:27:19.516475943Z 2023-04-29 10:27:19.516450 I | op-config: applying ceph settings:
2023-04-29T10:27:19.516475943Z [global]
2023-04-29T10:27:19.516475943Z ms_cluster_mode         = secure
2023-04-29T10:27:19.516475943Z ms_service_mode         = secure
2023-04-29T10:27:19.516475943Z ms_client_mode          = secure
2023-04-29T10:27:19.516475943Z rbd_default_map_options = ms_mode=secure
2023-04-29T10:27:20.467038857Z 2023-04-29 10:27:20.466961 I | op-config: successfully applied settings to the mon configuration database


This appears to be the expected behavior. If you confirm there is not another issue, we can close the bug.
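
Since the reconcile may take a couple of minutes, a check can poll instead of reading the value once. A minimal sketch; `wait_for_value` is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary assumptions, not values from Rook:

```shell
# Hypothetical helper: re-run a command until its output equals the expected
# string or the attempts run out, sleeping after each failed attempt.
# Returns 0 on a match, 1 if no attempt matched.
wait_for_value() {
  expected=$1 attempts=$2 delay=$3
  shift 3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if [ "$("$@" 2>/dev/null)" = "$expected" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage inside the toolbox pod (30 attempts, 10 seconds apart):
#   wait_for_value ms_mode=secure 30 10 \
#     ceph config get client.admin rbd_default_map_options
```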

Comment 3 Sudeesh John 2023-05-02 06:56:13 UTC
I checked the Ceph config dump again, and I still don't see ms_mode=secure.

[root@rdr-cicd-odf-60c6-bastion-0 ~]# oc rsh  -n openshift-storage rook-ceph-tools-7c5884d455-lzgnp
sh-5.1$
sh-5.1$ history
    1  history
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 basic     ms_client_mode                         secure                              *
global                                                 basic     ms_cluster_mode                        secure                              *
global                                                 basic     ms_service_mode                        secure                              *
global                                                 advanced  rbd_default_map_options                ms_mode=prefer-crc                  *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$



But if I enable encryption from the UI during storage cluster creation, the value is set to ms_mode=secure.

Comment 4 Sudeesh John 2023-05-02 13:11:00 UTC
>> How soon did you check the ms_mode after updating the cluster to enable encryption?
By the way, the dump above was taken after a day.

Comment 5 Travis Nielsen 2023-05-02 14:11:17 UTC
Could I connect to this cluster if it's still up? The operator log shows that the value was applied successfully, so I don't understand why the ceph config dump is not showing the option.

Comment 7 Travis Nielsen 2023-05-02 17:55:45 UTC
The assimilate-conf is silently failing to set the ms_mode. If I run this command manually in the toolbox:

sh-5.1$ ceph config assimilate-conf -i settings.txt -o out.txt
sh-5.1$ echo $?
0
sh-5.1$ cat out.txt 

[global]
	rbd_default_map_options = ms_mode=secure


out.txt reports the setting as invalid, even though the command exited with status 0.
Still investigating why this is failing to be set...
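
Because the exit status is 0 either way, a caller has to inspect the -o file to notice the failure. A minimal sketch of such a check, assuming the output file uses the ini-style layout shown in out.txt above; `rejected_options` is a hypothetical helper, not a Ceph or Rook command:

```shell
# Hypothetical helper: list the options that `ceph config assimilate-conf`
# wrote back to its -o file, i.e. the ones it did not store.
# Prints one "key = value" line per leftover option and exits 1 if any
# exist; exits 0 when the file holds no options.
rejected_options() {
  awk '
    /^[ \t]*$/ { next }            # skip blank lines
    /^[ \t]*(#|;|\[)/ { next }     # skip comments and [section] headers
    /=/ { sub(/^[ \t]+/, ""); print; found = 1 }
    END { exit found ? 1 : 0 }
  ' "$1"
}

# Usage inside the toolbox pod:
#   ceph config assimilate-conf -i settings.txt -o out.txt
#   rejected_options out.txt || echo "some settings were not applied"
```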

Comment 8 Travis Nielsen 2023-05-02 19:53:43 UTC
Madhu, could you take a look at how the rbd_default_map_options needs to be applied? Details for the failure are in the previous comment.

This command succeeds:
  ceph config set client.admin rbd_default_map_options ms_mode=secure

However, this doesn't seem correct if we want all clients to be in secure mode, not just the admin client.

I'm not able to repro this issue in minikube with the latest upstream rook and ceph 17.2.6. Perhaps there was an issue with assimilate-conf
in the ceph release being used? For now, perhaps we need to set this separately from assimilate-conf.

I'm proposing this as a 4.13 blocker since without it we don't have encryption on the wire.
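
The linked pull requests are titled "unset all the configuration before setting". A minimal sketch of that idea using the `ceph config rm` and `ceph config set` commands; this is not the Rook code, and `reset_option` and the `CEPH` override variable are hypothetical names introduced here for illustration:

```shell
# Sketch only: remove a central-config option, then set it again.
# CEPH can be overridden (for example CEPH=echo) to preview the commands
# instead of running them.
reset_option() {
  who=$1 option=$2 value=$3
  ${CEPH:-ceph} config rm "$who" "$option" &&
    ${CEPH:-ceph} config set "$who" "$option" "$value"
}

# Usage inside the toolbox pod:
#   reset_option global rbd_default_map_options ms_mode=secure
```

Setting the option on global rather than client.admin is meant to cover all clients, which is the concern raised above.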

Comment 14 Sudeesh John 2023-05-08 13:34:33 UTC
When can we expect a build with the fix?

Comment 16 Sudeesh John 2023-05-10 14:49:36 UTC
Verified on build 4.13.0-186; working as expected.

[root@rdr-cicd-odf-3396-syd05-bastion-0 4.13]# oc -n openshift-storage rsh  rook-ceph-tools-76cf7cf54c-pwdrd
sh-5.1$
sh-5.1$
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 advanced  rbd_default_map_options                ms_mode=prefer-crc                  *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$
sh-5.1$
sh-5.1$
sh-5.1$
sh-5.1$ exit
exit
[root@rdr-cicd-odf-3396-syd05-bastion-0 4.13]#
[root@rdr-cicd-odf-3396-syd05-bastion-0 4.13]#
[root@rdr-cicd-odf-3396-syd05-bastion-0 4.13]#  oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch  '[{ "op": "replace", "path": "/spec/network/connections/encryption", "value": {"enabled": true} }]'
storagecluster.ocs.openshift.io/ocs-storagecluster patched
[root@rdr-cicd-odf-3396-syd05-bastion-0 4.13]# 

[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc -n openshift-storage rsh  rook-ceph-tools-76cf7cf54c-pwdrd
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 basic     ms_client_mode                         secure                              *
global                                                 basic     ms_cluster_mode                        secure                              *
global                                                 basic     ms_service_mode                        secure                              *
global                                                 advanced  rbd_default_map_options                ms_mode=secure                      *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$
sh-5.1$
sh-5.1$
sh-5.1$
sh-5.1$ exit
exit
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#  oc patch storagecluster ocs-storagecluster -n openshift-storage --type json --patch  '[{ "op": "replace", "path": "/spec/network/connections/encryption", "value": {"enabled": false} }]'
storagecluster.ocs.openshift.io/ocs-storagecluster patched
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc get storagecluster -n openshift-storage
NAME                 AGE     PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   3h14m   Ready              2023-05-10T11:27:14Z   4.13.0
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc -n openshift-storage rsh  rook-ceph-tools-76cf7cf54c-pwdrd
sh-5.1$
sh-5.1$ ceph config dump
WHO                                              MASK  LEVEL     OPTION                                 VALUE                               RO
global                                                 basic     log_to_file                            true
global                                                 advanced  mon_allow_pool_delete                  true
global                                                 advanced  mon_allow_pool_size_one                true
global                                                 advanced  mon_cluster_log_file
global                                                 advanced  mon_pg_warn_min_per_osd                0
global                                                 advanced  rbd_default_map_options                ms_mode=prefer-crc                  *
mon                                                    advanced  auth_allow_insecure_global_id_reclaim  false
mgr                                                    advanced  mgr/balancer/mode                      upmap
mgr                                                    advanced  mgr/prometheus/rbd_stats_pools         ocs-storagecluster-cephblockpool    *
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-a                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_cache_memory_limit                 4294967296
mds.ocs-storagecluster-cephfilesystem-b                basic     mds_join_fs                            ocs-storagecluster-cephfilesystem
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_enable_usage_log                   true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_nonexistent_bucket             true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_log_object_name_utc                true
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zone                               ocs-storagecluster-cephobjectstore  *
client.rgw.ocs.storagecluster.cephobjectstore.a        advanced  rgw_zonegroup                          ocs-storagecluster-cephobjectstore  *
sh-5.1$ exit
exit
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#

[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc get csv mcg-operator.v4.13.0-186.stable -n openshift-storage -o yaml |grep full_version
    full_version: 4.13.0-186
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]# oc version
Client Version: 4.13.0-0.nightly-ppc64le-2023-05-09-232622
Kustomize Version: v4.5.7
Server Version: 4.13.0-0.nightly-ppc64le-2023-05-09-232622
Kubernetes Version: v1.26.3+b404935
[root@rdr-cicd-odf-3396-syd05-bastion-0 ~]#

Comment 19 errata-xmlrpc 2023-06-21 15:25:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742

