Bug 1893626

Summary: Failed to encrypt OSDs on OCS4.6 installation (via UI)
Product: OpenShift Container Platform
Reporter: Oded <oviner>
Component: Console Storage Plugin
Assignee: Bipul Adhikari <badhikar>
Status: CLOSED ERRATA
QA Contact: Oded <oviner>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.6
CC: aos-bugs, badhikar, ebenahar, jefbrown, jelopez, madam, nberry, nthomas, ocs-bugs, prsurve, sostapov
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1894210 (view as bug list)
Environment:
Last Closed: 2020-11-16 14:37:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1894210    
Bug Blocks:    
Attachments:
Description Flags
deploy olm file none

Description Oded 2020-11-02 08:23:10 UTC
Created attachment 1725706 [details]
deploy olm file

Description of problem (please be as detailed as possible and provide log
snippets):
Failed to encrypt OSDs on OCS4.6 installation (via UI)

Version of all relevant components (if applicable):
Provider: VMware
OCP Version: 4.6.0-0.nightly-2020-10-31-214252

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?
yes

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue reproduce from the UI?
yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:

1. Verify OCP status: [PASS]

2. Deploy OLM (on OCP 4.6, the OCS Operator is not shown in the operator hub without this command)
image: quay.io/rhceph-dev/ocs-registry:4.6.0-149.ci
$ oc create -f deploy_olm_install.yaml

3. Install OCS 4.6
Set the toggle to Enabled to enable data encryption on the cluster.

https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.6/html-single/deploying_openshift_container_storage_on_vmware_vsphere/index?lb_target=stage

4. Verify OCS installation: [PASS]

5. Verify that the OSDs are encrypted: [FAILED]
a. Get the node where the OSD runs
$ oc get pods -n openshift-storage -o wide | grep -i osd
b. Go to the node and run the "lsblk" command
$ oc debug node/compute-0
sh-4.2# chroot /host /bin/bash
[root@compute-0 /]# lsblk
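
Step 5a can also be scripted. The following is a minimal sketch and not part of the original report: it assumes the pod list is fetched with `oc get pods -n openshift-storage -o json` instead of `-o wide`, that OSD pod names start with `rook-ceph-osd-`, and that the one-off prepare pods start with `rook-ceph-osd-prepare-`. The helper name `osd_nodes` and the sample data are hypothetical.

```python
import json


def osd_nodes(pod_list_json: str) -> dict:
    """Map each OSD pod name to the node it runs on.

    Assumption: OSD pods are named "rook-ceph-osd-<id>-...", and the
    "rook-ceph-osd-prepare-..." job pods should be skipped.
    """
    pods = json.loads(pod_list_json).get("items", [])
    result = {}
    for pod in pods:
        name = pod["metadata"]["name"]
        if not name.startswith("rook-ceph-osd-"):
            continue
        if name.startswith("rook-ceph-osd-prepare-"):
            continue
        result[name] = pod["spec"].get("nodeName")
    return result


# Made-up sample shaped like `oc get pods -o json` output.
SAMPLE = json.dumps({"items": [
    {"metadata": {"name": "rook-ceph-osd-0-abc"},
     "spec": {"nodeName": "compute-0"}},
    {"metadata": {"name": "rook-ceph-osd-prepare-xyz"},
     "spec": {"nodeName": "compute-1"}},
    {"metadata": {"name": "rook-ceph-mon-a-123"},
     "spec": {"nodeName": "compute-2"}},
]})

print(osd_nodes(SAMPLE))
```

Each node name returned this way is a candidate for the `oc debug node/<name>` and `lsblk` check in step 5b.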

Note:
When this script is used for installation, the OSDs are encrypted.
https://github.com/red-hat-storage/ocs-ci/blob/master/conf/ocsci/encryption_at_rest.yaml



Actual results:
[root@compute-0 /]# lsblk
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
loop0                         7:0    0   512G  0 loop 
sda                           8:0    0   120G  0 disk 
|-sda1                        8:1    0   384M  0 part /boot
|-sda2                        8:2    0   127M  0 part /boot/efi
|-sda3                        8:3    0     1M  0 part 
`-sda4                        8:4    0 119.5G  0 part 
  `-coreos-luks-root-nocrypt
                            253:0    0 119.5G  0 dm   /sysroot
sdb                           8:16   0    10G  0 disk /var/lib/kubelet/pods/37e8f110-6922-468d-b739-543118c523ee/volumes/kubernetes.io~vsphere-volume/pvc-efb9e421-87fc-4125-a19b-2bd5411045f2
sdc                           8:32   0   512G  0 disk 
rbd0                        252:0    0    50G  0 disk /var/lib/kubelet/pods/3bc1614e-f4ff-40f6-97d5-458d686331fd/volumes/kubernetes.io~csi/pvc-db9f3217-30e0-4992-8d99-52e2db99508c/mount

Expected results:
[root@compute-0 /]# lsblk
NAME                       MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0                        7:0    0   256G  0 loop  
sda                          8:0    0   120G  0 disk  
|-sda1                       8:1    0   384M  0 part  /boot
|-sda2                       8:2    0   127M  0 part  /boot/efi
|-sda3                       8:3    0     1M  0 part  
`-sda4                       8:4    0 119.5G  0 part  
  `-coreos-luks-root-nocrypt
                           253:0    0 119.5G  0 dm    /sysroot
sdb                          8:16   0    10G  0 disk  /var/lib/kubelet/pods/1ea7b3c0-ffde-4068-959d-1d8ba20030ca/volumes/kubernetes.io~vsphere-volume/pvc-8fe02cf8-5a3f-47e3-9373-a773a2a5966e
sdc                          8:32   0   256G  0 disk  
`-ocs-deviceset-0-data-0-gmxhm-block-dmcrypt
                           253:1    0   256G  0 crypt 
rbd0                       252:0    0    40G  0 disk  /var/lib/kubelet/pods/5d0a1b12-6686-4b2a-97bd-b9ca140190c6/volumes/kubernetes.io~csi/pvc-ce5d64ad-21bf-4ba4-b399-3a2b5da31594/mount
rbd1                       252:16   0    40G  0 disk  /var/lib/kubelet/pods/634e50a9-0eb6-4cfa-ac5a-369bcdb02487/volumes/kubernetes.io~csi/pvc-57f369aa-da05-41e4-b9a9-0fb972928d12/mount
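
The difference between the actual and expected results is the presence of a `crypt` entry under the OSD disk. A minimal sketch of that check follows; it is not part of the original report. The helper name `crypt_devices` is hypothetical, and the parser assumes the default `lsblk` column order shown above (NAME, MAJ:MIN, RM, SIZE, RO, TYPE, MOUNTPOINT), including the wrapped layout where a long NAME sits on its own line. The two samples are excerpts of the outputs above.

```python
def crypt_devices(lsblk_text: str) -> list:
    """Return the names of devices whose TYPE column is "crypt"."""
    found = []
    pending_name = None
    for line in lsblk_text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] == "NAME":
            continue  # skip blank lines and the header
        if len(tokens) == 1:
            # Name-only line: its columns follow on the next line.
            pending_name = tokens[0]
            continue
        if pending_name is not None:
            name, columns = pending_name, tokens
            pending_name = None
        else:
            name, columns = tokens[0], tokens[1:]
        # columns: MAJ:MIN RM SIZE RO TYPE [MOUNTPOINT]
        if len(columns) >= 5 and columns[4] == "crypt":
            # Drop the tree-drawing prefix such as "|-" or "`-".
            found.append(name.lstrip("|`-"))
    return found


ACTUAL = """\
NAME                        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
`-sda4                        8:4    0 119.5G  0 part
  `-coreos-luks-root-nocrypt
                            253:0    0 119.5G  0 dm   /sysroot
sdc                           8:32   0   512G  0 disk
"""

EXPECTED = """\
NAME                       MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sdc                          8:32   0   256G  0 disk
`-ocs-deviceset-0-data-0-gmxhm-block-dmcrypt
                           253:1    0   256G  0 crypt
"""

print(crypt_devices(ACTUAL))
print(crypt_devices(EXPECTED))
```

An empty list for a node that hosts an OSD corresponds to the failure reported here.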

Additional info:

Comment 2 Neha Berry 2020-11-02 08:45:55 UTC
Proposing as a blocker, as this needs investigation to find the root cause. The cluster is still UP for troubleshooting. Please let us know.

Comment 5 Michael Adam 2020-11-03 10:36:34 UTC
(In reply to Oded from comment #3)
> Logs: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/BZ-1893626/

Looking at the logs, it does not seem that the UI has set the Encryption flag in the storagecluster yaml:

http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/BZ-1893626/must-gather.local.6710228411716741450/quay-io-rhceph-dev-ocs-must-gather-sha256-9bac3a455e1d0f8fe880798e00f9b970068db95e938b651a2d4521fee7e51fb8/namespaces/openshift-storage/oc_output/storagecluster

(search for "Encryption:"; it is empty, but it should contain `Enable: true` if encryption is enabled)

So it seems to be either a user error or a bug in the UI.
But since it worked from the UI in another case, I think the UI folks need to look.
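
The manual search described above could be automated along these lines. This is a rough sketch and not part of the original analysis: it assumes the must-gather file uses an indented, `oc describe`-style layout in which an enabled cluster shows `Enable: true` nested under `Encryption:`. The helper name `encryption_enabled` and both samples are made up for illustration.

```python
def encryption_enabled(describe_text: str) -> bool:
    """Return True if "Enable: true" is nested under an "Encryption:" key."""
    lines = describe_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() != "Encryption:":
            continue
        indent = len(line) - len(line.lstrip())
        # Inspect only the lines indented deeper than "Encryption:".
        for child in lines[i + 1:]:
            if not child.strip():
                continue
            if len(child) - len(child.lstrip()) <= indent:
                break  # left the Encryption block
            key, _, value = child.strip().partition(":")
            if key == "Enable" and value.strip() == "true":
                return True
    return False


# Hypothetical excerpts: an empty Encryption block and an enabled one.
EMPTY = """\
Spec:
  Encryption:
  Storage Device Sets:
    Count:  1
"""

ENABLED = """\
Spec:
  Encryption:
    Enable:  true
  Storage Device Sets:
    Count:  1
"""

print(encryption_enabled(EMPTY))
print(encryption_enabled(ENABLED))
```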

Comment 7 Oded 2020-11-03 12:27:23 UTC
@madam
I created a doc that describes my test procedure:
https://docs.google.com/document/d/1-E9wKQ599PrUPq7y1BbuVWIBpcGlYWgsS2vsMzLij7k/edit

Comment 8 Neha Berry 2020-11-03 14:35:55 UTC
(In reply to Oded from comment #7)
> @madam
> I created a doc that describes my test procedure
> https://docs.google.com/document/d/1-E9wKQ599PrUPq7y1BbuVWIBpcGlYWgsS2vsMzLij7k/edit

Judging by the steps and screenshots in this doc, it doesn't seem like Oded missed anything.

Hence, could it be that the encryption toggle button did NOT really work for the Internal cluster but worked for Internal Attached? It would help to take a look.

In the meantime, we will try to test the same again.

Thanks, Oded, for documenting your steps; that is really helpful.

Comment 9 Bipul Adhikari 2020-11-03 17:11:07 UTC
Found the issue, there's an issue in the UI.

Comment 10 Michael Adam 2020-11-03 18:28:09 UTC
(In reply to Bipul Adhikari from comment #9)
> Found the issue, there's an issue in the UI.

Thanks Bipul!

But we should not just change the product to OCP, since we lose tracking from OCS this way.
Instead we should clone it into OCP keeping the tracking bug in OCS.

Comment 13 Oded 2020-11-09 20:55:56 UTC
Bug Fixed

Provider: VMware
OCP Version: 4.6.0-0.nightly-2020-11-07-035509

Test Process:
1. Install OCS Operator (ocs-operator.v4.6.0-156.ci) via UI
https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.6/html-single/deploying_openshift_container_storage_on_vmware_vsphere/index?lb_target=stage

2. Check all pods in the openshift-storage namespace

3. Check Ceph health
sh-4.4# ceph health
HEALTH_OK

4. Get clusterserviceversions
$ oc get clusterserviceversions -n openshift-storage
NAME                         DISPLAY                       VERSION        REPLACES   PHASE
ocs-operator.v4.6.0-156.ci   OpenShift Container Storage   4.6.0-156.ci              Succeeded

5. Verify that the OSDs are encrypted:
[root@compute-0 /]# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop1       7:1    0   512G  0 loop  
sda         8:0    0   120G  0 disk  
|-sda1      8:1    0   384M  0 part  /boot
|-sda2      8:2    0   127M  0 part  /boot/efi
|-sda3      8:3    0     1M  0 part  
`-sda4      8:4    0 119.5G  0 part  
  `-coreos-luks-root-nocrypt
          253:0    0 119.5G  0 dm    /sysroot
sdb         8:16   0    10G  0 disk  /var/lib/kubelet/pods/e5f97334-d7ae-4b19-ac05-f6e6e7d6546a/volumes/kubernetes.io~vsphere-volume/pvc-6efc4210-1468-4491-8f03-0dd2b16b4826
sdc         8:32   0   512G  0 disk  
`-ocs-deviceset-thin-0-data-0-882rx-block-dmcrypt
          253:1    0   512G  0 crypt 


[root@compute-1 /]# lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0      7:0    0   512G  0 loop  
sda        8:0    0   120G  0 disk  
|-sda1     8:1    0   384M  0 part  /boot
|-sda2     8:2    0   127M  0 part  /boot/efi
|-sda3     8:3    0     1M  0 part  
`-sda4     8:4    0 119.5G  0 part  
  `-coreos-luks-root-nocrypt
         253:0    0 119.5G  0 dm    /sysroot
sdb        8:16   0    10G  0 disk  /var/lib/kubelet/pods/250f98fe-7a37-4eed-b2b9-339f6809d749/volumes/kubernetes.io~vsphere-volume/pvc-0f1d0694-f8ae-46d0-85bf-57f5fa40de8e
sdc        8:32   0   512G  0 disk  
`-ocs-deviceset-thin-1-data-0-7v8nd-block-dmcrypt
         253:1    0   512G  0 crypt 


[root@compute-2 /]# lsblk
NAME      MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
loop0       7:0    0   512G  0 loop  
sda         8:0    0   120G  0 disk  
|-sda1      8:1    0   384M  0 part  /boot
|-sda2      8:2    0   127M  0 part  /boot/efi
|-sda3      8:3    0     1M  0 part  
`-sda4      8:4    0 119.5G  0 part  
  `-coreos-luks-root-nocrypt
          253:0    0 119.5G  0 dm    /sysroot
sdb         8:16   0    10G  0 disk  /var/lib/kubelet/pods/1cefae24-89cd-4aeb-a44b-93685f61402b/volumes/kubernetes.io~vsphere-volume/pvc-9d8afb2e-a884-49f2-ae2b-2c1faaa02118
sdc         8:32   0   512G  0 disk  
`-ocs-deviceset-thin-2-data-0-bfcmm-block-dmcrypt
          253:1    0   512G  0 crypt 
rbd0      252:0    0    50G  0 disk  /var/lib/kubelet/pods/8485563d-6359-4d2e-b308-1efb39ab3cfc/volumes/kubernetes.io~csi/pvc-b8a87734-dd0d-4df6-81cf-dcfb6030c909/mount

Comment 15 errata-xmlrpc 2020-11-16 14:37:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.4 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4987

Comment 16 Bipul Adhikari 2020-12-02 05:53:03 UTC
*** Bug 1903413 has been marked as a duplicate of this bug. ***